Originally published January 16, 2024
You may not have heard of Epimenides of Crete. Even my brother, the philosophy professor, only vaguely recalled the name when I mentioned him in a recent conversation. Epimenides is mostly remembered for one thing—inspiring the “Liar’s Paradox,” a philosophical construction sometimes used in introductory logic classes. It’s often formulated as the statement “This sentence is a lie.” Obviously, if the sentence is true, then it must be false, and vice versa—the paradox is clear. Before dismissing it as trivial, bear in mind that it was part of the inspiration for Gödel’s Incompleteness Theorem, published in 1931 and arguably the most important result in modern mathematics—demonstrating that any consistent mathematical system contains true statements that can never be proven within it.
Epimenides’ version of the Liar’s Paradox is a bit different—the statement “All Cretans are liars” is attributed to him. This could be paradoxical, given that Epimenides was from Crete, but it could also be resolved if you define a liar as someone who lies some of the time but not always. In fact, he may not even have said it quite like that—all we really have is a snippet of poetry containing the line “Cretans, always liars”—and, worse yet, Epimenides may not have existed at all. He might have been made up—Wikipedia refers to him as a semi-mythical 6th or 7th century BC Greek seer. Fabulous stories are told about him—that he fell asleep in a cave for 57 years and woke up with the gift of prophecy, for example, and that he lived for some 300 years.
Epimenides and the Liar’s Paradox give us an interesting lens through which to look at the big issue that needs to be addressed with generative AI. The notions of truth and falsehood, provability and unprovability, and completely made-up stories all apply and shed light on how we should view the results when we work with AI systems.
Here and there and everywhere
A disclaimer shown when you logged in to OpenAI’s ChatGPT—the wording has since changed—used to read as follows: “While we have safeguards in place, the system may occasionally generate incorrect or misleading information and produce offensive or biased content. It is not intended to give advice.” Some users of ChatGPT and other generative AI tools would have done well to heed this warning.
Why? Well, in mid-2023, there was a case involving a passenger who sued the Colombian airline Avianca for an injury sustained in-flight. The plaintiff’s lawyer cited several precedent cases that were unfamiliar to both the judge and the defense lawyers. Upon further investigation, it turned out that the plaintiff’s lawyer had used ChatGPT for his research, and the cases he cited did not exist—they had been made up by the AI system.
In another example, I recently read an article where the author asked ChatGPT what the world record was for crossing the English Channel on foot. Instead of answering that this was impossible, the system composed a two-paragraph response claiming that it was accomplished by a German named Christof Wandratsch on August 14, 2020, in 14 hours and 51 minutes. It went on to warn that crossing the Channel on foot is a very risky undertaking due to the cold and rough water.
These kinds of issues have come up so frequently that we now have a name for them: “AI Hallucinations.” It’s an appropriate term—hallucinations, like dreams and myths, start with a small kernel of truth but quickly veer into something completely fantastical. Epimenides may have existed, but he probably couldn’t foretell the future and he certainly didn’t live for three centuries. Legal precedent is a legitimate concept, but made-up cases are not. Christof Wandratsch, it turns out, does exist, but he did not walk across the English Channel. Elephants are real—but if pink elephants are parading through your room they can only be in your imagination.
To understand how AI hallucinations happen, let’s take another look at how inaccuracies and bias creep into the results from generative AI. Generative AI is trained on pre-existing data, with broad commercial systems scooping up as much content as possible from the whole internet. When prompted, the AI system composes a response by combining and reproducing elements from its training data. From this we can draw two conclusions. First, that generative AI does not create original content, although it might be able to represent existing data in new, possibly interesting ways. And second, that if the input data is unreliable, the output can also be unreliable.
The unusual view
But that’s only the beginning. It’s one thing for generative AI to produce an occasional error of fact in its output. It’s quite another to make up a completely fictitious legal case or any other report. So, how does that happen? The key is in the term generative. The system begins its response to a prompt by drawing on relevant patterns and facts absorbed from its training data. It then uses a probabilistic algorithm to produce coherent-sounding text, predicting the most likely next words based on the previous words in the conversation and known patterns in the language—much like the type-ahead tool in your word processor or e-mail application, only far more sophisticated. If the user issues multiple prompts, the whole conversational context is used for this prediction.
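To make the idea of “predicting the next word” concrete, here is a minimal sketch in Python. It is nothing like a production system, which uses a neural network with billions of parameters; it simply counts which words follow which in a tiny invented corpus, then samples each next word from those counts, feeding every prediction back in as context.

```python
import random
from collections import defaultdict, Counter

# A tiny invented corpus standing in for training data.
corpus = (
    "the channel swim record was set in seven hours "
    "the channel crossing is risky due to cold water "
    "the record for the channel swim was fourteen hours"
).split()

# Count which words follow which: a bigram model, the crudest possible
# version of "predicting the next word from the previous ones".
followers = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    followers[prev][nxt] += 1

def next_word(prev: str) -> str:
    """Sample a next word in proportion to how often it followed `prev`."""
    counts = followers[prev]
    return random.choices(list(counts), weights=list(counts.values()))[0]

# Generate a short continuation, feeding each prediction back in as context.
word, output = "the", ["the"]
for _ in range(8):
    word = next_word(word)
    output.append(word)
print(" ".join(output))
```

Notice what the sketch never does: it never checks whether its output is true. Each word only has to plausibly follow the one before it, and that is the seed of a hallucination.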
The less grounding the user’s prompts provide, the more the system’s predictions lean on its own previously generated words and its own knowledge of linguistic patterns—and this can have a multiplier effect on any inaccuracy in the original data, so that eventually the whole output might be, well, hallucinated. For example, let’s look again at the hallucination of Christof Wandratsch walking across the English Channel. I mentioned that he does exist—he is an endurance swimmer who, in 2005, set a world record for swimming across the Channel in seven hours and three minutes. The date of the hallucinated crossing, August 14, 2020, is significant. A quick Google search yields many media articles covering an unusual influx of migrants trying to cross the Channel in boats in August of 2020, and at least three of these articles were dated August 14. And that very specific time of 14 hours, 51 minutes? Well, I found online a list of successful Channel swims sorted by time, and the only time to appear more than once in that list is 14 hours and 51 minutes. Clearly, ChatGPT picked several data points related to Channel crossings from its training data but combined them in a nonsensical way, trying to tell the questioner what it ‘thought’ they wanted to hear.
This all leads me to a third conclusion, and a variation on the Liar’s Paradox, which I’ll call AI’s Paradox: it’s not that generative AI is both true and false at the same time, but rather that to the extent generative AI is accurate, it is not original; and the more original it appears to be, the more likely it is to be inaccurate.
Perhaps my favourite Danish polymath, Piet Hein, put it more eloquently. He must have anticipated generative AI more than 50 years ago when he wrote the following verse:
Original thought
is a straightforward process.
It’s easy enough
when you know what to do.
You simply combine
in appropriate doses
the blatantly false
and the patently true.[1]
Now that we understand the problem, how do we fix it? As I’ve said before about AI, there’s no single easy solution. Quality and quantity of data, human intervention and governance will all have a role to play. Let’s look at each of these in turn.
What’ll we do, what’ll we do?
Some in the industry have argued that larger training data sets can help manage the problems of inaccurate data, and I echoed the same thoughts in an earlier blog post. I’m no longer quite so sure. We’ve only had about a year so far to analyze large-scale commercial generative AI, but there is some evidence that these systems are, in a way, getting dumber—introducing more errors in their output than before. The phenomenon is called “AI drift,” and it happens when the underlying data changes or grows but the model hasn’t been retuned to keep up with it. As generative AI ingests more and more data from the internet, it will need to be continually tuned to use that data properly.
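Whatever the cause of drift, one practical response is simply to measure it: re-run a fixed set of test prompts on a schedule and flag any drop in accuracy. The sketch below is only a gesture at that idea; the benchmark questions, the canned ask_model() stand-in and the threshold are all invented for illustration.

```python
# Minimal drift check: re-score the same benchmark over time and flag regressions.
# The questions, answers and threshold are invented; ask_model() is a canned
# stand-in that you would replace with a call to your actual model or API.

BENCHMARK = [
    ("In what year were Goedel's incompleteness theorems published?", "1931"),
    ("Can a person walk across the English Channel?", "no"),
]
BASELINE_ACCURACY = 1.00  # accuracy measured when the system was first deployed
DRIFT_THRESHOLD = 0.10    # how much of a drop we tolerate before raising an alarm

def ask_model(question: str) -> str:
    """Stand-in for a real model call, returning canned answers for the demo."""
    canned = {
        BENCHMARK[0][0]: "They were published in 1931.",
        BENCHMARK[1][0]: "No, the Channel is open sea and cannot be crossed on foot.",
    }
    return canned.get(question, "I'm not sure.")

def current_accuracy() -> float:
    correct = sum(
        1 for question, expected in BENCHMARK
        if expected.lower() in ask_model(question).lower()
    )
    return correct / len(BENCHMARK)

accuracy = current_accuracy()
if BASELINE_ACCURACY - accuracy > DRIFT_THRESHOLD:
    print(f"Possible drift: accuracy fell from {BASELINE_ACCURACY:.2f} to {accuracy:.2f}")
else:
    print(f"Accuracy holding at {accuracy:.2f}")
```

In practice the benchmark would contain hundreds of prompts and ask_model() would call the live system, but the principle is the same: you cannot manage drift you are not measuring.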
In a recent post, I suggested that smaller generative AI systems built for specific purposes, and trained on smaller, limited domains of data, could help alleviate the risks of plagiarism and copyright violation. I think the same holds true for AI hallucinations. When you control and limit the data used to train a generative AI tool for a specific purpose, you can manage accuracy and mitigate bias. IBM, for example, sells foundation models trained on curated data for several specific domains, including coding, legal and finance, as part of its AI platform, watsonx. It’s confident enough in this approach that it builds IP indemnity protection into its licensing contracts.
So, smaller, curated data sets are the first step. Next, there must be a role for human intervention. Reinforcement learning is a common machine learning technique in which decisions that lead to good outcomes are rewarded and weighted more heavily in subsequent iterations. Some generative AI systems now implement Reinforcement Learning from Human Feedback (RLHF), in which people evaluate, correct and resubmit the AI system’s output so that it can, essentially, learn from its mistakes and generate more accurate results the next time. RLHF is, in effect, fact-checking done by paid professionals. It’s a good approach, but I don’t think it absolves the end user from the responsibility of fact-checking their own unique queries and outputs.
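Real RLHF involves training a separate reward model and then fine-tuning the language model against it, which is well beyond a blog-sized example. As a rough gesture at the idea, the sketch below (with invented answers and scores) shows the simplest possible version: human reviewers rate candidate answers, and those ratings are used to prefer better answers the next time around.

```python
from collections import defaultdict

# A drastically simplified gesture at RLHF. Real RLHF trains a neural reward
# model and fine-tunes the LLM against it; here the "reward model" is just
# the average human score per answer, and all data is invented.

# Human feedback collected so far: (candidate answer, reviewer score from 0 to 1).
human_feedback = [
    ("The English Channel cannot be crossed on foot.", 1.0),
    ("Christof Wandratsch walked across the Channel in 2020.", 0.0),
    ("The Channel is about 33 km wide at its narrowest point.", 0.8),
]

scores = defaultdict(list)
for answer, score in human_feedback:
    scores[answer].append(score)
reward = {answer: sum(s) / len(s) for answer, s in scores.items()}

def best_of(candidates: list[str]) -> str:
    """Pick the candidate the toy reward model likes best; unseen answers score 0.5."""
    return max(candidates, key=lambda c: reward.get(c, 0.5))

print(best_of([
    "Christof Wandratsch walked across the Channel in 2020.",
    "The English Channel cannot be crossed on foot.",
]))
```

The gap between this toy and the real thing is enormous, but the principle carries over: human judgment, expressed as scores, steers what the system produces.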
Finally, governance must play a role and can do so in a couple of ways. One is to have the AI system fact-check itself. This concept is called Retrieval-Augmented Generation, or RAG. The idea is that, rather than rely only on its training data, the AI system can also be directed to external sources to provide inputs to its answer, kind of like you would if you went to Google to find more information to augment your own knowledge of a particular topic. I wonder what kind of performance overhead this might introduce, and I still wouldn’t 100 per cent trust the information I get from those external sources, but I do think RAG can help. For smaller, limited foundation models, companies could introduce RAG to point their AI systems to other corporate databases and information repositories.
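To make the RAG idea a little more concrete, here is a minimal sketch. A real system would retrieve passages using vector embeddings and then send the assembled prompt to a live model; here, the documents are invented, retrieval is plain word overlap, and the sketch stops at building the grounded prompt you would send.

```python
# Minimal RAG sketch: retrieve trusted passages first, then ask the model to
# answer from them. The documents are invented, retrieval is plain word
# overlap (a real system would use vector embeddings), and the result is the
# grounded prompt you would send to your model.

documents = [
    "Christof Wandratsch swam the English Channel in 2005 in 7 hours 3 minutes.",
    "The English Channel cannot be crossed on foot; it is open sea.",
    "Avianca is a Colombian airline founded in 1919.",
]

def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents sharing the most words with the question."""
    q_words = set(question.lower().replace("?", "").split())
    return sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)[:k]

def build_prompt(question: str) -> str:
    """Assemble a prompt that tells the model to answer only from the sources."""
    context = "\n".join(retrieve(question, documents))
    return (
        "Answer using only the sources below. "
        "If they do not contain the answer, say you don't know.\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )

print(build_prompt("What is the record for swimming the English Channel?"))
```

The point is the shape of the technique: the model is told to answer from the retrieved sources rather than from whatever it half-remembers from its training data.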
Another area of AI governance is provability of the results. Luckily, generative AI is still not nearly as complex a field of study as all of mathematics, so Gödel’s Incompleteness Theorem doesn’t apply—we don’t have to live with unprovable statements. It may be difficult to implement, but AI systems can and should have the capability to trace how any response was generated, right back to the original pieces of information plucked from the training data set. Whatever you call it—traceability, transparency or provability—it will give users more confidence in how to handle the results and give developers better ways to improve the system.
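What traceability might look like in practice is an open question, but the underlying idea is simple: every passage the system draws on carries a record of where it came from, so the finished answer can be audited back to its inputs. The sketch below uses invented data and structures, not any particular vendor’s implementation.

```python
from dataclasses import dataclass

# Sketch of response traceability: every passage the system draws on keeps a
# reference to its origin, so the finished answer can be audited. The data
# and structures are invented for illustration.

@dataclass
class Passage:
    text: str
    source: str  # where the text came from: URL, document ID, dataset name

@dataclass
class TracedAnswer:
    text: str
    evidence: list[Passage]

    def audit_trail(self) -> str:
        lines = [f"Answer: {self.text}"]
        lines += [f'  - "{p.text}" (source: {p.source})' for p in self.evidence]
        return "\n".join(lines)

answer = TracedAnswer(
    text="No one has walked across the English Channel; the cited record is a 2005 swim.",
    evidence=[
        Passage("Wandratsch swam the Channel in 2005 in 7h 03m.", "channel-swims.csv, row 1042"),
        Passage("The Channel cannot be crossed on foot.", "geography-handbook.pdf, p. 12"),
    ],
)
print(answer.audit_trail())
```

However it is implemented, that audit trail is what turns “trust me” into “check for yourself.”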
It won’t be easy, but with better data, human intervention and good governance, we can deal with the elephant in the room—in pink or whatever colour you choose to see it.
[1] Hein, Piet. “Originality.” Grooks 3. Translated by Jens Arup. Doubleday & Company, 1970.