
Everything You Need to Know About Model Hallucinations

If you’ve worked with LLMs at all, you’ve probably heard the term model hallucinations tossed around. So what does it mean? Is your model ingesting psychedelic substances? Or are you the crazy one, hallucinating a model that doesn’t actually exist? Luckily, the term describes a problem that is less exotic than it sounds. Still, model hallucinations are something every LLM user will encounter, and they can cause real problems for your AI-based systems if not properly dealt with. Read on to learn what model hallucinations are, how you can detect them, and what you can do to remediate them when they inevitably arise.


What’s in an LLM Hallucination?

No, your LLM didn’t take a trip with Alice into Wonderland, although perhaps it will tell you that it did. A model hallucination is roughly that: the tendency of a large language model to make up facts, invent unprompted fictions, and produce confident responses that mask an underlying falsehood. If you dig into the research underlying LLMs, you’ll find that they’re simply trained to predict the next word in a sentence. Given massive amounts of training data, they learn to do this exceedingly well, which can give the impression that they’re intelligent agents operating with a factual understanding of the world. While they do absorb internet-scale knowledge, their next-word-prediction objective pushes them to be fluent and conversational rather than strictly factual. LLMs do hallucinate at times, and often when you’d least expect it.

Out of curiosity, I asked ChatGPT whether it’s ever been known to hallucinate.

[Screenshot: ChatGPT’s reply when asked whether it has ever hallucinated.]

As you can see, it replied in the negative. Good news, folks! AGI is not here yet. 

That said, as you can imagine, model hallucinations are not something you want when using an LLM in a business context. Imagine an LLM that hallucinates the amount of money in a customer’s bank account, does the math wrong when filling out a customer’s tax forms, or misdiagnoses a patient presenting with a certain set of symptoms. Each of these could prove disastrous in its own way, which is why model hallucinations currently present one of the biggest barriers to using LLMs in critical business contexts.


Detecting and Mitigating AI Hallucinations

Detecting hallucinations can be tricky. Many LLM APIs return the generated text but no confidence scores for the answers, which means an LLM will sometimes respond confidently even when it doesn’t actually know the answer. There is a growing body of research on detecting model hallucinations, but the area is still nascent and the proposed methods remain experimental. Unfortunately, short of creating a brand-new LLM architecture, there is currently no reliable way to prevent hallucinations before they’re generated.
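
One experimental idea from that research is self-consistency checking: sample several answers to the same question and flag cases where the samples disagree with each other. The sketch below assumes a hypothetical ask_llm() wrapper around whichever LLM API you use, and the similarity threshold is an arbitrary starting point rather than a recommended value.

```python
from difflib import SequenceMatcher


def ask_llm(prompt: str, temperature: float = 1.0) -> str:
    """Hypothetical wrapper around whichever LLM API you use."""
    raise NotImplementedError


def looks_like_hallucination(prompt: str, n_samples: int = 5, threshold: float = 0.6) -> bool:
    """Rough self-consistency heuristic: if repeated samples of the same answer
    disagree with each other, treat the response as suspect."""
    answers = [ask_llm(prompt, temperature=1.0) for _ in range(n_samples)]
    reference = answers[0]
    # Average string similarity of each additional sample against the first one.
    similarities = [
        SequenceMatcher(None, reference, other).ratio() for other in answers[1:]
    ]
    return sum(similarities) / len(similarities) < threshold
```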

One way to limit model hallucinations is to ground the LLM in additional context via prompt engineering or retrieval techniques. For example, a bank using an LLM to facilitate interactions with customers might give the LLM access to its internal accounts API so that it can retrieve customer account data. However, this approach can backfire as well. The LLM could potentially access another customer’s data (perhaps a second customer with the same name) and return it to the wrong user, which introduces a number of privacy concerns.
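
As a rough sketch of what that grounding can look like, the example below injects account data into the prompt. The get_account_balance() function is a hypothetical stand-in for the bank’s real internal service; the key design choice is to look up data only for the already-authenticated customer so the model never sees records it could leak.

```python
def get_account_balance(customer_id: str) -> float:
    """Hypothetical internal accounts API; stand-in for the bank's real service."""
    raise NotImplementedError


def build_prompt(authenticated_customer_id: str, question: str) -> str:
    # Look up data only for the *authenticated* customer, so the model never
    # sees (and cannot leak) another customer's records, even one with the same name.
    balance = get_account_balance(authenticated_customer_id)
    return (
        f"You are a banking assistant. The customer's current balance is ${balance:.2f}.\n"
        "Answer only using the information provided above.\n"
        f"Customer question: {question}"
    )
```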

Another way to combat model hallucinations is to turn down the “temperature” parameter used during generation, which is exposed in the APIs of many LLMs such as GPT-3. A lower temperature causes the model to generate more predictable responses in line with its training data, at the expense of creativity. That trade-off is often desirable when using LLMs in a well-regulated business context.
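
Here is a minimal sketch using the OpenAI Python client; parameter names, defaults, and model identifiers vary from provider to provider, so treat it as a pattern rather than a recipe.

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY to be set in the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # substitute whichever model your provider offers
    messages=[{"role": "user", "content": "Summarize our overdraft policy in two sentences."}],
    temperature=0.0,  # lower temperature -> more predictable, less "creative" output
)
print(response.choices[0].message.content)
```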

Another sensible approach is to monitor LLM usage at a granular level. While you won’t be able to prevent hallucinations, or perhaps even detect them the moment they arise, you’ll be attuned to their downstream effects and able to trace them back to the source. Returning to our bank example, if an LLM provides a customer with an incorrect account balance, you could detect subsequent attempts by that customer to overdraw the account and remedy the issue while its impact is still small.
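
For that kind of traceability, each LLM call needs to be logged with enough metadata to connect it to later events. The sketch below is a deliberately simple, hypothetical example that writes to a local log file; in production you would more likely ship these records to a monitoring platform.

```python
import json
import logging
import time
import uuid

logging.basicConfig(filename="llm_calls.log", level=logging.INFO)


def log_llm_call(customer_id: str, prompt: str, response: str, model: str) -> str:
    """Record enough context to trace a downstream problem back to a single LLM response."""
    call_id = str(uuid.uuid4())
    logging.info(json.dumps({
        "call_id": call_id,
        "timestamp": time.time(),
        "customer_id": customer_id,
        "model": model,
        "prompt": prompt,
        "response": response,
    }))
    # Attach call_id to downstream records (support tickets, transactions, etc.)
    # so later anomalies can be traced back to the exact response that caused them.
    return call_id
```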

Finally, you’ll want to monitor for distribution shift and model drift, periodically fine-tuning or retraining the model as necessary. This keeps the model aligned with the data it actually sees in production and makes stale, off-distribution responses less likely.
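
As a toy illustration of what a drift check can look like, the sketch below applies a two-sample Kolmogorov-Smirnov test to a single made-up feature (prompt length); real drift monitoring would track many features of prompts, responses, and embeddings over time.

```python
from scipy.stats import ks_2samp


def has_drifted(baseline_values, recent_values, alpha: float = 0.01) -> bool:
    """Crude drift check: has the distribution of a single feature (e.g. prompt
    length) changed significantly between a baseline window and recent traffic?"""
    _statistic, p_value = ks_2samp(baseline_values, recent_values)
    return p_value < alpha


# Example with made-up prompt lengths: clearly separated distributions should flag drift.
print(has_drifted([42, 55, 61, 48, 50], [120, 133, 128, 140, 125]))
```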


Should Businesses Use LLMs?

Ultimately, this is a question of risk tolerance, and the answer will vary from industry to industry, business to business, and even use case to use case. For many industries, particularly those that are less regulated and more creative in nature, such as marketing, customer support, publishing, and graphic design, LLMs are more than ready to assist in day-to-day tasks and can serve as a great brainstorming or content generation tool. In more creative fields, model hallucinations can even be desirable at times, since they indicate that the model is prioritizing creativity over rigidly factual responses.

For highly regulated industries with extreme potential downsides, users should proceed with caution. When using LLMs in a medical context, for example, it’s of paramount importance to have proper monitoring procedures and safeguards in place that can detect aberrations in model performance and alert developers to the issue.

The interesting cases lie in the gray area between these two extremes. Industries such as banking, accounting, and IT have a lot to gain from the use of LLMs: they can cut down on wasted work, eliminate drudgery for employees, and save businesses boatloads of money. But they can also cause harm, leading to poorer customer service or lost revenue if not used carefully. For businesses in these sectors, it is up to each company to weigh the potential costs and benefits of incorporating LLMs into their processes. For many organizations, though, proper monitoring can tilt the scales strongly toward a net benefit from large language models and machine learning more generally.

Ultimately, if you’re using LLMs in your application, you’ll want to make sure they’re ready for the public. The best way to mitigate risk is to use Mona's free solution to track your application’s behavior and be alerted to anomalies in your LLMs. This solution gives you all the tools you need to assess downstream impacts of LLM outputs, monitor model usage at a granular level, and detect model hallucinations before they lead to huge problems. Mona’s free GPT monitoring solution allows you to have confidence that your GPT-based application is performing as expected and gives you the tools to address any issues that might arise.