
RAG

Retrieval-Augmented Generation: improve factuality with retrieval

General-purpose language models can be fine-tuned to handle common tasks such as sentiment analysis and named entity recognition. These tasks generally don't require additional background knowledge.

But for more complex, knowledge-intensive tasks, you can build a system on top of a language model that accesses external knowledge sources. This makes outputs more factually consistent, more reliable, and helps reduce hallucinations.

Meta AI researchers introduced Retrieval Augmented Generation (RAG) for exactly these kinds of tasks. RAG combines an information retrieval component with a text generation model. It can be fine-tuned, and its internal knowledge can be updated efficiently without retraining the whole model.

RAG takes an input, retrieves a set of relevant/supporting documents (from a source such as Wikipedia), and concatenates those documents as context with the original prompt before feeding everything to the text generator. This makes RAG much better at handling facts that change over time, which matters because an LLM's parametric knowledge is static. RAG lets the model access the latest information without retraining, producing reliable outputs grounded in what it retrieved.
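
This retrieve-then-generate flow can be sketched in a few lines. Everything below is a hypothetical stand-in: `embed` would be a real embedding model and `generate` a real LLM; the toy bag-of-words versions only exist to make the sketch runnable end to end.

```python
import re

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def retrieve(query, documents, embed, k=3):
    """Rank documents by inner product of their embeddings with the query's."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: dot(q, embed(d)), reverse=True)
    return ranked[:k]

def rag_answer(query, documents, embed, generate, k=3):
    """Retrieve supporting documents, then pass them as context to the generator."""
    context = "\n\n".join(retrieve(query, documents, embed, k))
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )
    return generate(prompt)

# Toy stand-ins (hypothetical): a bag-of-words "embedding" over a tiny
# vocabulary and an identity "generator", just to exercise the pipeline.
def toy_embed(text):
    words = re.findall(r"[a-z]+", text.lower())
    vocab = ["paris", "france", "capital", "tokyo", "japan"]
    return [words.count(w) for w in vocab]

docs = ["Paris is the capital of France.", "Tokyo is the capital of Japan."]
prompt = rag_answer("What is the capital of France?", docs, toy_embed, lambda p: p)
```

The key design point is that the generator only ever sees the prompt: all of the "knowledge update" happens by swapping or extending `docs`, never by touching model weights.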

Lewis et al. (2021) proposed a general RAG fine-tuning recipe. It uses a pre-trained seq2seq model as parametric memory and a dense vector index of Wikipedia as non-parametric memory (accessed via a neural pre-trained retriever). Here's how it works:

[Figure: RAG architecture, combining a retriever over a dense document index with a seq2seq generator. Image source: Lewis et al. (2021)]
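
The non-parametric memory in this recipe is a dense vector index queried by maximum inner product search. A minimal sketch, assuming precomputed unit-norm passage vectors (the random vectors below are placeholders for real DPR-style embeddings; production systems use an approximate-nearest-neighbor index such as FAISS instead of a brute-force matrix product):

```python
import numpy as np

# Build the "index": one embedding per passage, L2-normalized.
rng = np.random.default_rng(0)
doc_vecs = rng.normal(size=(1000, 64))
doc_vecs /= np.linalg.norm(doc_vecs, axis=1, keepdims=True)

def search(query_vec, k=5):
    """Return indices of the k passages with the highest inner product."""
    scores = doc_vecs @ query_vec          # brute-force MIPS over the index
    top = np.argpartition(-scores, k)[:k]  # unordered top-k
    return top[np.argsort(-scores[top])]   # sort the top-k by score

# A query vector close to passage 42 should retrieve passage 42 first.
q = doc_vecs[42] + 0.05 * rng.normal(size=64)
hits = search(q)
```

Because the index is separate from the generator, re-embedding new or updated passages refreshes the model's accessible knowledge without any retraining.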

RAG performed strongly on benchmarks like Natural Questions, WebQuestions, and CuratedTrec. On MS-MARCO and Jeopardy questions, RAG generated answers that were more factual, more specific, and more diverse. FEVER fact verification also improved with RAG.

This shows RAG is a viable approach for boosting language model output on knowledge-intensive tasks.

Recently, retriever-based methods have grown in popularity and are often combined with LLMs like ChatGPT to improve their capabilities and factual consistency.

You can find a simple example of using a retriever and LLM for question answering with source citations in the LangChain docs.
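
Independent of any particular framework, the source-citation pattern behind that example is simple: label each retrieved document with an ID in the prompt and ask the model to cite the IDs it used. A plain-Python sketch (this is not the LangChain API; `llm` is a hypothetical callable standing in for a real chat-model client):

```python
def qa_with_sources(question, sources, llm):
    """Ask the LLM to answer from labeled sources and cite the IDs it used."""
    labeled = "\n".join(f"[{i}] {text}" for i, text in enumerate(sources))
    prompt = (
        "Answer the question from the sources below and cite the source IDs "
        "you relied on, e.g. [0].\n\n"
        f"Sources:\n{labeled}\n\n"
        f"Question: {question}\nAnswer:"
    )
    return llm(prompt)
```

Citing source IDs lets a reader (or a downstream checker) verify each claim against the retrieved passage, which is much of the point of using RAG in the first place.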