AI Hallucinations: New Detection Method Unveiled
So-called "hallucinations" by AI tools, in which large language models (LLMs) such as ChatGPT or Gemini fabricate plausible-sounding but imaginary facts, have become a major obstacle to their broader adoption. These hallucinations can make LLMs unreliable: in one notable incident, a US lawyer faced legal consequences after citing a non-existent case generated by ChatGPT. Such errors could pose even greater risks in fields like medical diagnosis.
A study published in Nature by Oxford researchers introduces a novel method to detect when an LLM is likely to hallucinate, potentially enabling safer deployment of these models in critical areas such as legal and medical question-answering. The researchers focused on hallucinations in which an LLM gives different answers each time it is asked the same question, a phenomenon known as confabulation. Dr. Sebastian Farquhar, one of the study's authors, emphasized that their method distinguishes a model's uncertainty about the content of an answer from its uncertainty about how to phrase it, an advance over previous approaches.
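To make the distinction concrete, one way to operationalize "uncertainty about content rather than phrasing" is to ask the model the same question several times, group the sampled answers by meaning, and measure how spread out the answers are across those meaning groups. The sketch below illustrates that idea only; it is not the authors' implementation. The helper `semantically_equivalent` is a hypothetical placeholder (a real system would use an entailment or paraphrase model to judge whether two answers mean the same thing), and the sampled answers are invented for the example.

```python
import math

def semantically_equivalent(a: str, b: str) -> bool:
    """Hypothetical stand-in for a semantic-equivalence check.

    A real detector would use an entailment/paraphrase model to decide
    whether two answers convey the same meaning; here we just compare
    normalized strings so the sketch stays self-contained.
    """
    return a.strip().lower().rstrip(".") == b.strip().lower().rstrip(".")

def cluster_by_meaning(answers: list[str]) -> list[list[str]]:
    """Group sampled answers into clusters of (assumed) equal meaning."""
    clusters: list[list[str]] = []
    for ans in answers:
        for cluster in clusters:
            if semantically_equivalent(ans, cluster[0]):
                cluster.append(ans)
                break
        else:
            clusters.append([ans])
    return clusters

def meaning_entropy(answers: list[str]) -> float:
    """Entropy over meaning-clusters: low when the answers agree on
    content (even if worded differently), high when the model keeps
    changing its story, i.e. a likely confabulation."""
    clusters = cluster_by_meaning(answers)
    n = len(answers)
    probs = [len(c) / n for c in clusters]
    return -sum(p * math.log(p) for p in probs)

if __name__ == "__main__":
    # Pretend these were sampled from an LLM asked the same question 5 times.
    consistent = ["Paris.", "paris", "Paris", "Paris.", "paris."]
    confabulated = ["Paris.", "Lyon.", "Marseille.", "Paris.", "Nice."]
    print(meaning_entropy(consistent))    # ~0.0 -> stable answer
    print(meaning_entropy(confabulated))  # high  -> likely confabulation
```

In this toy setup, differently worded but equivalent answers fall into the same cluster, so rephrasing alone does not raise the score; only genuine disagreement about the content does, which is the behavior the article attributes to the new method.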