Media release
1. Artificial intelligence: Detecting hallucinations in large language models (N&V)
A method for detecting hallucinations in large language models (LLMs) that measures uncertainty in the meaning of generated responses is presented in Nature this week. The approach could be used to improve the reliability of LLM output.
LLMs, such as ChatGPT and Gemini, are artificial intelligence systems that can read and generate natural human language. However, such systems can be prone to hallucinations, in which the generated content is inaccurate or nonsensical. Detecting the extent to which an LLM may hallucinate is challenging, because hallucinated responses are often presented in a way that makes them seem plausible.
Sebastian Farquhar and colleagues attempt to quantify the degree to which an LLM hallucinates, and thus how faithful its generated content is likely to be to the provided source content. Their method detects a specific subclass of hallucinations called confabulations, which are inaccurate and arbitrary and often occur when the LLM lacks knowledge. The approach accounts for the nuances of language and for the fact that a response can be expressed in different ways, which may carry different meanings. The authors show that their method can detect confabulations in LLM-generated biographies and in answers to questions on topics such as trivia, general knowledge and life sciences.
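A minimal sketch of the general idea described above, measuring uncertainty over meanings rather than exact wordings, is shown below. It assumes a hypothetical same_meaning helper that judges whether two sampled answers express the same thing; the helper, names and toy example are illustrative assumptions, not the authors' implementation. Answers sampled for the same question are grouped into meaning clusters, and high entropy over those clusters signals answers that disagree in meaning.

```python
import math

def semantic_uncertainty(responses, same_meaning):
    """Estimate uncertainty over meanings rather than exact wordings.

    responses: sampled answers from an LLM to the same question.
    same_meaning: hypothetical callable(a, b) -> bool deciding whether
        two answers express the same meaning.
    """
    # Group answers into clusters of equivalent meaning.
    clusters = []
    for response in responses:
        for cluster in clusters:
            if same_meaning(response, cluster[0]):
                cluster.append(response)
                break
        else:
            clusters.append([response])

    # Entropy over meaning clusters: high entropy means the sampled
    # answers disagree in meaning, a warning sign of confabulation.
    total = len(responses)
    return -sum(
        (len(c) / total) * math.log(len(c) / total) for c in clusters
    )

# Toy usage: five sampled answers, with a naive stand-in meaning check.
answers = [
    "Paris is the capital of France.",
    "The capital of France is Paris.",
    "It is Lyon.",
    "Paris.",
    "Paris, the French capital.",
]
naive_same = lambda a, b: ("Paris" in a) == ("Paris" in b)
print(round(semantic_uncertainty(answers, naive_same), 3))  # ~0.5
```

In practice, the meaning check would need to capture the linguistic nuance the release describes, for example by asking another model whether two answers entail each other, rather than relying on a simple keyword test.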
The detection task is itself carried out with the help of a second LLM, which checks whether generated answers share the same meaning, and the method is evaluated by a third LLM, which amounts to “fighting fire with fire,” notes Karin Verspoor in an accompanying News & Views article. She adds that “using an LLM to evaluate an LLM-based method does seem circular, and might be biased.” However, the authors propose that their method may help users to understand when extra care is needed in relying on LLM responses, and may allow LLMs to be used with more confidence in a broader range of applications.