← Back to Blogs
Skip to main content

Using a 7B Model + RAG to Identify and Edit Word-level Hallucinations

· 2 min read
Zain Hasan

A preview of the paper

Using a 7B Model + RAG to Identify and Edit Word-level Hallucinations in LLMs better then GPT-4:

In Short⏩:

Train a model that consists of a Retreiver and a Language Model:

>> The retriever, Mret, takes the original output you want to check to hallucination (y) and optionally input prompt (x) and retrieves top relevant documents (C). So C = Mret(x, y). This can be a vector database like Weaviate for example.

>> The detector and editor, Medit, takes in the context - (C), input - (x) and output - (y) and detects (and if possible also edits/corrects) factual errors in (y) given the retrieved context (C): y* = Medit(x, y, C).

🏋️Training:

Create a synthetic hallucination dataset of 35k C = context, y=incorrect output, y=annotated fixed output -> (C, y, y)

Magic Synthetic Dataset Creation:

>> GPT-4 is few-shot prompted to add different types of errors to a passage

>> It is also instructed to mark phrases or sentences for deletion along with their error type and insert phrases and sentences along with insertion tags

Start off with Llama2-Chat 7B to initialize Medit and then train on (C, y, y∗)

Medit takes in (C, y) as input and learns to predict the edited outputs with tags to represent error type y∗ using standard language modeling objective.

The model, once trained, can identify different types of hallucination and mark which words they come from - it also suggests edits to improve factuality.

Result 📈:

The model has a fine-grained hallucination detection accuracy 46.5% while it's binary acc.{hallucination, no hallucination} is 79%.

For comparison ChatGPT has a fine-grained hallucination detection acc. of 21.5% (59% binary acc) w/o RAG and 26%(68.5% binary hall detect) w/ RAG

💻Code

🔷Data

🏗️Model

🔗 arXiv Link

📜 Download paper

Ready to start building?

Check out the Quickstart tutorial, or build amazing apps with a free trial of Weaviate Cloud (WCD).

Don't want to miss another blog post?

Sign up for our bi-weekly newsletter to stay updated!


By submitting, I agree to the Terms of Service and Privacy Policy.