Reranking
A two-stage retrieval pattern in which a fast initial search retrieves candidates, then a more expensive model such as a cross-encoder re-scores them by relevance to the query.
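Reranking improves retrieval precision by applying a computationally expensive model to a shortlist of candidates rather than to the entire index. Cross-encoders, often built on compact backbones such as MiniLM, read the query and passage together and score each pair directly, which is accurate but costs one model pass per candidate. Late interaction models like ColBERT encode the query and passage separately, so passage representations can be precomputed, and then compute token-level similarity at query time for better efficiency. Both approaches typically improve ranking metrics such as NDCG and MRR over single-stage retrieval.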
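The two-stage flow can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names (first_stage, rerank, toy_pair_scorer), the tiny corpus, and the scoring rules are all hypothetical. Term overlap stands in for the fast first stage, and a bigram-matching function stands in for the expensive pairwise model.

```python
from typing import Callable, List, Sequence


def tokenize(text: str) -> List[str]:
    # Naive whitespace tokenizer; real systems use a proper analyzer.
    return text.lower().split()


def first_stage(query: str, corpus: Sequence[str], k: int) -> List[int]:
    """Cheap candidate retrieval: rank passages by number of shared query terms."""
    q_terms = set(tokenize(query))
    scored = [(len(q_terms & set(tokenize(doc))), i) for i, doc in enumerate(corpus)]
    # Highest overlap first; ties broken by original index for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for overlap, i in scored[:k] if overlap > 0]


def toy_pair_scorer(query: str, passage: str) -> float:
    """Hypothetical stand-in for a cross-encoder: counts query bigrams that
    also occur as adjacent words in the passage, so word order matters."""
    q, p = tokenize(query), tokenize(passage)
    passage_bigrams = set(zip(p, p[1:]))
    return float(sum(1 for bigram in zip(q, q[1:]) if bigram in passage_bigrams))


def rerank(
    query: str,
    corpus: Sequence[str],
    candidates: Sequence[int],
    score_pair: Callable[[str, str], float],
    top_n: int,
) -> List[int]:
    """Second stage: apply the expensive scorer to the shortlist only."""
    scored = [(score_pair(query, corpus[i]), i) for i in candidates]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:top_n]]


corpus = [
    "reranking uses a cross encoder to score query passage pairs",
    "the query was slow and the encoder cross compiled badly",
    "bm25 is a fast first stage retriever",
    "a cross encoder reads the query and passage together",
]
query = "cross encoder query"

candidates = first_stage(query, corpus, k=3)  # shortlist from the cheap stage
final = rerank(query, corpus, candidates, toy_pair_scorer, top_n=2)
print(candidates, final)
```

Passage 1 shares all three query terms, so the first stage keeps it, but the pairwise scorer demotes it because the terms never appear in a relevant order. In practice score_pair would wrap a trained model. One option, assuming the sentence-transformers library is installed, is its CrossEncoder class, whose predict method accepts a list of (query, passage) pairs and returns one score per pair.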
Also known as
re-ranking, cross-encoder reranking, two-stage retrieval