What are artificial intelligence (AI) hallucinations?

AI hallucinations are incorrect or false responses given by generative AI models.

Learning Objectives

After reading this article you will be able to:

  • Define, and provide examples of, AI hallucinations
  • Describe some of the causes of AI hallucinations
  • Outline steps for preventing AI hallucinations

Copy article link

What are artificial intelligence (AI) hallucinations?

Artificial intelligence (AI) hallucinations are falsehoods or inaccuracies in the output of a generative AI model. Often these errors are hidden within content that appears logical or is otherwise correct. As usage of generative AI and large language models (LLMs) has become more widespread, many cases of AI hallucinations have been observed.

The term "hallucination" is metaphorical — AI models do not actually suffer from delusions as a mentally unwell human might. Instead they produce unexpected outputs that do not correspond to reality in response to prompts. They may misidentify patterns, misunderstand context, or draw from limited or biased data to get those unexpected outputs.

Some documented examples of AI hallucinations include:

  • An AI model was prompted to write about Tesla's quarterly results and produced a coherent article but with false financial information
  • A lawyer used an LLM to produce supporting material in a legal case, but the LLM generated references to other legal cases that did not exist
  • Google's Gemini image generation tool regularly produced historically inaccurate images for a period of time in 2024

While AI has a number of use cases and real-world applications, in many cases, AI models' tendency to hallucinate means they cannot be entirely relied upon without human oversight.

How does generative AI work?

All AI models are made up of a combination of training data and an algorithm. An algorithm, in the context of AI, is a set of rules that lay out how a computer program should weight or value certain attributes. AI algorithms contain billions of parameters — the rules on how attributes should be valued.

Generative AI needs training data because it learns by being fed millions (or billions, or trillions) of examples. From these examples, generative AI models learn to identify relationships between items in a data set — typically by using vector databases that store data as vectors, enabling the models to quantify and measure the relationships between data items. (A "vector" is a numerical representation of different data types, including non-mathematical types like words or images.)

Once the model has been trained, it keeps refining its outputs based on the prompts it receives. Its developers will also fine-tune the model for more specific uses, continuing to change the parameters of the algorithm, or using methods like low-rank adaptation (LoRA) to quickly adjust the model to a new use.

Put together, the result is a model that can respond to prompts from humans by generating text or images based on the samples it has seen.

However, human prompts can vary greatly in complexity and cause unexpected behavior by the model, since it is impossible to prepare it for every possible prompt. And, the model may misunderstand or misinterpret the relationships between concepts and items even after extensive training and fine-tuning. Unexpected prompts and misperceptions of patterns can lead to AI hallucinations.

What causes AI to hallucinate?

Sources of training data: It is hard to vet training data because AI models need so much that a human cannot review all of it. Unreviewed training data may be incorrect or weighted too heavily in a certain direction. Imagine an AI model that is trained to write greeting cards, but its training data set ends up containing mostly birthday cards, unbeknownst to its developers. As a result, it might generate happy or funny messages in inappropriate contexts, such as when prompted to write a "Get well soon" card.

Inherent limits of generative AI design: AI models use probability to "predict" which words or visual elements are likely to appear together. Statistical analysis can help a computer create plausible-seeming content — content that has a high probability of being understood by humans. But statistical analysis is a mathematical process that may miss some of the nuances of language and meaning, resulting in hallucinations.

Lack of direct experience of the physical world: Today's AI programs are not able to detect whether something is "true" or "false" in an external reality. While a human could, for example, conduct experiments to determine if a scientific principle is true or false, AI currently can only train itself on preexisting content, not directly on the physical universe. It therefore struggles to tell the difference between accurate and inaccurate data, especially in its own responses.

Struggle to understand context: AI only looks at literal data and may not understand cultural or emotional context, leading to irrelevant responses and AI hallucinations. Satire, for example, may confuse AI (even humans often confuse satire with fact).

Bias: The training data used may lead to built-in bias if the data set is not broad enough. Bias can simply skew AI models towards giving certain kinds of answers, or it can even lead to promoting racial or gender stereotypes.

Attacks on the model: Malicious persons can use prompt injection attacks to alter the way generative AI models perceive prompts and produce results. A highly public example occurred in 2016, when Microsoft launched a chatbot, Tay, that within a day started generating racist and sexist content due to Twitter (now X) users feeding it information that distorted its responses. AI models have become more sophisticated since then but are still vulnerable to such attacks.

Overfitting: If an AI model is trained too much on its initial training data set, it can lose the ability to generalize, detect trends, or draw accurate conclusions from new data. It may also detect patterns in its training data that are not actually significant, leading to errors that are not apparent until it is fed new data. These scenarios are called "overfitting": the models fit too closely with their training data. As an example of overfitting, during the COVID-19 pandemic, AI models trained on scans of COVID patients in hospitals started picking up on the text font that the different hospitals used, and treating the font as a predictor of COVID diagnosis. For generative AI models, overfitting can lead to hallucinations.

How can AI developers prevent AI hallucinations?

While developers may not be able to eliminate AI hallucinations completely, there are concrete steps they can take to make hallucinations and other inaccuracies less likely.

  • More data and better data: Large data sets from a range of sources can help eliminate bias and help models learn to detect trends and patterns in a wider variety of data.
  • Avoid overfitting: Developers should try not to train an AI model too much on one data set.
  • Extensive testing: AI models should be tested in a range of contexts and with unexpected prompts.
  • Using models designed for the use case: An LLM chatbot, for instance, may not be well-suited for answering factual queries about medical research.
  • Continued refinement: Even the most fine-tuned model is likely to have blind spots. AI models should continue to learn from the prompts they receive (with validation in place to help prevent prompt injection attacks).
  • Putting up guardrails for generative AI chatbots: A retrieval augmented generation (RAG) chatbot that has access to company-specific data to enhance responses could still hallucinate. Developers can implement guardrails, such as instructing the chatbot to return "I do not have enough information to answer that" when it cannot find the answer, instead of making one up.

Learn how Cloudflare for AI helps developers build and run AI models from anywhere in the world. And discover how Cloudflare Vectorize enables developers to generate and store embeddings in a globally distributed vector database.