GPT-3 has blown up again because of the chat application that OpenAI released. Whenever I’m dealing with a machine learning model, especially a large language model like this one, I find it helpful to have heuristics that remind me how the model is going to fail (and they all fail in some situations).
These heuristics need to be informed by the reality of how these models are built and trained, as well as by the real-world observed behavior of the models.
My heuristic for GPT-3 is that it’s something like an embodiment of the Zeitgeist if the Zeitgeist had bad reading comprehension skills and therefore misinterpreted a bunch of facts about the world.
This captures some of the primary failure modes of models like GPT:
- They are trained on an extremely large corpus, but not a good corpus. They will express the best of humanity and the worst, but mostly they express the median, which is really impressive for a computer but generally pretty crap for a person, both in form and content.
- Specific to content, these models will both explicitly and implicitly express common misperceptions and biases that appear regularly in the corpus. For instance, that people with lighter skin are better than people with darker skin. Or that people with physical disabilities are intellectually and morally lesser.
- They don’t have good reading comprehension skills. In human terms, they misinterpret a lot of the information that they consume. In some cases, the sheer volume of information corrects for this, but in most cases it only exacerbates the problem.
Perhaps the biggest issue is that GPT attempts to mask some of its most egregious shortcomings (racism, ableism, sexism) by putting artificial barriers in place so that it doesn’t respond to prompts along these lines. This hides the immediate problem. But because of the way that these deep learning models work, these hidden biases necessarily infiltrate every answer and interaction with the model.
Kinda like a person.