Chunking

The process of splitting documents into smaller segments for embedding and retrieval; chunk size and overlap determine what information the retriever can surface.

Chunking is a critical preprocessing step in RAG systems. Fixed-size chunking (typically 512 tokens with 50-100 tokens of overlap) provides a simple baseline but ignores document structure. Semantic chunking splits at natural language boundaries to produce coherent chunks, which often yields higher faithfulness scores than fixed-size baselines. Agentic chunking goes further, using an LLM to choose split points and avoid mid-table or mid-sentence breaks. The core tension: smaller chunks match queries precisely but lose context; larger chunks preserve relationships but dilute the relevance signal.
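The two baseline strategies above can be sketched in a few lines. This is a minimal illustration, not a production implementation: the fixed-size splitter operates on an already-tokenized list, and the "semantic" splitter uses a naive regex for sentence boundaries (real systems use a proper sentence segmenter or embedding-based boundary detection). The function names and default parameters are illustrative choices, not from the original text.

```python
import re


def fixed_size_chunks(tokens, chunk_size=512, overlap=64):
    """Split a token list into fixed-size chunks with overlap.

    Each chunk repeats the last `overlap` tokens of the previous one,
    so information near a boundary appears in two chunks.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # final chunk already covers the tail
    return chunks


def sentence_chunks(text, max_chars=300):
    """Greedily pack whole sentences into chunks, never splitting mid-sentence.

    A crude stand-in for semantic chunking: boundaries fall only at
    sentence ends, so each chunk stays locally coherent.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Note the overlap trade-off in the first function: a larger overlap reduces the chance a fact is cut in half at a boundary, but duplicates more text across chunks and increases embedding cost.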

Also known as

document chunking, text chunking, semantic chunking