How to Know if AI is Hallucinating

Hallucination is always going to be a grey area in LLM conversations. But there is, in general, a good way to gauge the degree to which the LLM is composing or hallucinating rather than analyzing and interpreting: look at the ratio of output to input.

When the input contains very little information, it is necessarily vague and invites the LLM to hallucinate. For example, if I ask it to tell me a bedtime story, the output will be very unpredictable and composed entirely of hallucination, constrained only by the single parameter of “bedtime story”.

If I instead ask for a parable in the style of Aesop, with a moral lesson concerning the abdication of personal responsibility, that is a much longer prompt with several dimensions of restriction. The output will likely follow a highly predictable pattern, but it will still include many hallucinated details and flourishes. The length of the story is unlikely to be greater than in the first case, so the ratio of output to input has already narrowed.

If I now upload a fairytale by H. C. Andersen and request an analysis of its narrative structure, the output is likely to be shorter than the input. It will also be tightly focused on its assigned task, and although it may still hallucinate (for example, by inventing a fictitious historical analogue or scholarly reference), the focus is on applying the requested processing to the input provided.

The ratio of output length to input length is not a precise measurement of the degree of LLM hallucination. But it can serve as a general rule of thumb that sharpens our perception of the conversation we are having and informs the decisions we make about how to proceed.
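
As a minimal sketch of this rule of thumb (the word-count measure and the thresholds below are arbitrary, illustrative choices rather than anything prescribed here), one could estimate the ratio for a given exchange like this:

```python
# Rough illustration of the output-to-input ratio heuristic described above.
# Word counts and thresholds are arbitrary, illustrative values, not calibrated figures.

def length_ratio(prompt: str, response: str) -> float:
    """Return the ratio of response length to prompt length, measured in words."""
    prompt_words = len(prompt.split())
    response_words = len(response.split())
    return response_words / max(prompt_words, 1)  # avoid division by zero


def ratio_hint(ratio: float) -> str:
    """Map the ratio to a loose, human-readable caution level."""
    if ratio > 10:
        return "mostly composition: treat the details as likely invented"
    if ratio > 1:
        return "mixed: expect hallucinated flourishes around the requested structure"
    return "mostly analysis: output is anchored to the supplied text, but verify any references"


if __name__ == "__main__":
    prompt = "Tell me a bedtime story."
    response = "Once upon a time, in a quiet village by the sea... " * 40
    r = length_ratio(prompt, response)
    print(f"ratio = {r:.1f} -> {ratio_hint(r)}")
```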