Retrieval-Augmented Generation (RAG)

A technique for improving the quality of large language model (LLM) output by grounding it with sources of knowledge retrieved after the model was trained. RAG improves the accuracy of LLM responses by providing the trained LLM with access to information retrieved from trusted knowledge bases or documents.

Common motivations to use retrieval-augmented generation include:

Increasing the factual accuracy of a model's generated responses.

Giving the model access to knowledge it was not trained on.

Changing the knowledge that the model uses.

Enabling the model to cite sources.

For example, suppose that a chemistry app uses the PaLM API to generate summaries related to user queries. When the app's backend receives a query, the backend:

Searches for ("retrieves") data that's relevant to the user's query.

Appends ("augments") the relevant chemistry data to the user's query.

Instructs the LLM to create a summary based on the appended data.
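
The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the PaLM API or any particular library: the in-memory `DOCUMENTS` store, the word-overlap scoring in `retrieve`, the prompt wording in `augment`, and the `call_llm` placeholder are all hypothetical names chosen for the example. A real app would likely replace the overlap scoring with a search index or vector store, and `call_llm` with a request to its model provider.

```python
import re
from collections import Counter

# Hypothetical in-memory knowledge base (an assumption for this sketch);
# a real app would query a search index or vector store instead.
DOCUMENTS = {
    "water": "Water (H2O) is a polar molecule with a boiling point of 100 degrees Celsius at sea level.",
    "salt": "Sodium chloride (NaCl) is an ionic compound that dissolves readily in water.",
    "methane": "Methane (CH4) is the simplest alkane and the main component of natural gas.",
}

# Small illustrative stop-word list so common words do not dominate the score.
STOPWORDS = {"a", "an", "and", "at", "in", "is", "of", "that", "the", "what", "with"}


def tokenize(text):
    """Lowercase the text and return its alphanumeric words, minus stop words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]


def retrieve(query, documents, top_k=2):
    """Step 1 ("retrieve"): rank documents by word overlap with the query."""
    query_counts = Counter(tokenize(query))
    scored = []
    for doc_id, text in documents.items():
        doc_counts = Counter(tokenize(text))
        score = sum(min(n, doc_counts[word]) for word, n in query_counts.items())
        if score > 0:
            scored.append((score, doc_id, text))
    # Highest score first; ties broken by document id for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(doc_id, text) for _, doc_id, text in scored[:top_k]]


def augment(query, passages):
    """Step 2 ("augment"): attach the retrieved passages to the user's query."""
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in passages)
    return (
        "Summarize an answer to the question using only the sources below, "
        "and cite source ids in brackets.\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}"
    )


def call_llm(prompt):
    """Step 3 placeholder: stands in for a real LLM API call (hypothetical)."""
    return f"(model output for a prompt of {len(prompt)} characters)"


def answer(query):
    """Run retrieve, augment, then generate for one user query."""
    passages = retrieve(query, DOCUMENTS)
    return call_llm(augment(query, passages))


if __name__ == "__main__":
    question = "What is the boiling point of water?"
    print(augment(question, retrieve(question, DOCUMENTS)))
```

Labeling each passage with its document id in the prompt is one simple way to support the "cite sources" motivation listed earlier, since the model can refer back to those ids in its summary.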

Real-world uses

Created for this library

1. An enterprise legal team uses retrieval-augmented generation to ground every answer in clauses from its own contract library.
2. A healthcare provider uses retrieval-augmented generation so its policy assistant answers based on the latest approved documents.
3. A consulting firm uses retrieval-augmented generation to ground its research assistant in client-specific knowledge before answering.

Definition source: Google for Developers Machine Learning Glossary | Creative Commons Attribution 4.0 License
