Monday, January 19, 2026

Over-Searching in Search-Augmented Large Language Models


Search-augmented large language models (LLMs) excel at knowledge-intensive tasks by integrating external retrieval.
However, they often over-search – unnecessarily invoking the search tool even when it does not improve response quality,
which leads to computational inefficiency and to hallucinations from incorporating irrelevant context. In this work, we conduct a
systematic analysis of over-searching across multiple dimensions, including query types, model categories, retrieval
settings, and multi-turn conversations. Our findings reveal: (i) search generally improves answer accuracy on answerable
queries but harms abstention on unanswerable ones; (ii) over-searching is more pronounced in complex reasoning models
and deep research systems, is exacerbated by noisy retrieval, and compounds across turns in multi-turn conversations; and
(iii) the composition of retrieved evidence is crucial, as the presence of negative evidence improves abstention. To quantify
over-searching, we introduce Tokens Per Correctness (TPC), an evaluation metric that captures the performance-cost
trade-off for search-augmented LLMs. Finally, we investigate mitigation approaches at both the query and retrieval levels
and release the OverSearchQA benchmark to foster continued research into efficient search-augmented LLMs.
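The abstract does not spell out the TPC formula, but a natural reading of "Tokens Per Correctness" is total tokens consumed (including retrieved context) divided by the number of correct responses, so lower is better. The sketch below illustrates that reading; the function name, record fields, and example numbers are illustrative assumptions, not taken from the paper.

```python
def tokens_per_correctness(results):
    """Hedged sketch of a TPC-style metric: total tokens spent per correct answer.

    results: list of dicts with 'tokens' (int, tokens consumed for that query,
    including any retrieved context) and 'correct' (bool).
    """
    total_tokens = sum(r["tokens"] for r in results)
    num_correct = sum(1 for r in results if r["correct"])
    if num_correct == 0:
        # No correct answers: cost per correctness is unbounded.
        return float("inf")
    return total_tokens / num_correct

# Illustrative comparison: a model that searches on every query spends far
# more tokens; if accuracy does not improve, its TPC is strictly worse.
no_search = [{"tokens": 200, "correct": True}, {"tokens": 180, "correct": False}]
with_search = [{"tokens": 900, "correct": True}, {"tokens": 950, "correct": False}]
print(tokens_per_correctness(no_search))    # 380.0
print(tokens_per_correctness(with_search))  # 1850.0
```

Under this reading, over-searching shows up directly as TPC inflation: extra tool calls raise the token numerator without raising the correctness denominator.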
