RAG

Retrieval-Augmented Generation: a pattern that retrieves relevant documents via embeddings and feeds them to an LLM as context for generation.

RAG combines the document retrieval of search systems with the text generation of large language models. Documents are embedded as vectors and stored in a vector database. At query time, the query is embedded the same way and the most similar documents are retrieved. The retrieved documents are added to the prompt as context, which helps the LLM produce more accurate, grounded responses.
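
The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `embed` function is a toy bag-of-words stand-in for a learned embedding model, `VectorStore` is a hypothetical in-memory substitute for a vector database, and the final LLM call is omitted, so the sketch stops at the prompt that would be sent to the model.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): term counts instead of a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class VectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self.items = []  # list of (vector, document) pairs

    def add(self, doc):
        self.items.append((embed(doc), doc))

    def search(self, query, k=2):
        """Return the k documents most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [doc for _, doc in ranked[:k]]


def build_prompt(query, context_docs):
    """Place the retrieved documents in the prompt as context for the LLM."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Index a few documents, then retrieve context for a query.
store = VectorStore()
for doc in [
    "The Eiffel Tower is in Paris.",
    "Python is a programming language.",
    "Vector databases store embeddings for similarity search.",
]:
    store.add(doc)

query = "Where is the Eiffel Tower?"
top_docs = store.search(query, k=1)
prompt = build_prompt(query, top_docs)  # this string would be sent to the LLM
```

In a real system the toy `embed` would typically be replaced by an embedding model and `VectorStore` by an actual vector database. The overall shape (index, retrieve by similarity, build the prompt) stays the same.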

Also known as

Retrieval-Augmented Generation, retrieval augmented generation