What type of study is this?

This is a quantitative study.

October 2, 2025 · Open Access

KITE: Kernelized and Information Theoretic Exemplars for In-Context Learning

Key Points

The method selects in-context learning examples to minimize prediction error on a specific query, rather than targeting generalization across queries.
Experiments show significant improvements over standard retrieval methods across a suite of classification tasks.
The approach derives an approximately submodular surrogate objective from an information theory-driven formulation, which a greedy algorithm optimizes with an approximation guarantee.
This research highlights the importance of structured and diverse example selection in scenarios with limited labeled data.

Abstract

In-context learning (ICL) has emerged as a powerful paradigm for adapting large language models (LLMs) to new and data-scarce tasks using only a few carefully selected task-specific examples presented in the prompt. However, given the limited context size of LLMs, a fundamental question arises: Which examples should be selected to maximize performance on a given user query? While nearest-neighbor-based methods like KATE have been widely adopted for this purpose, they suffer from well-known drawbacks in high-dimensional embedding spaces, including poor generalization and a lack of diversity. In this work, we study this problem of example selection in ICL from a principled, information theory-driven perspective. We first model an LLM as a linear function over input embeddings and frame the example selection task as a query-specific optimization problem: selecting a subset of exemplars from a larger example bank that minimizes the prediction error on a specific query. This formulation departs from traditional generalization-focused learning theoretic approaches by targeting accurate prediction for a specific query instance. We derive a principled surrogate objective that is approximately submodular, enabling the use of a greedy algorithm with an approximation guarantee. We further enhance our method by (i) incorporating the kernel trick to operate in high-dimensional feature spaces without explicit mappings, and (ii) introducing an optimal design-based regularizer to encourage diversity in the selected examples. Empirically, we demonstrate significant improvements over standard retrieval methods across a suite of classification tasks, highlighting the benefits of structure-aware, diverse example selection for ICL in real-world, label-scarce scenarios.
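The selection procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact surrogate objective, so this example assumes a stand-in: the kernel ridge (Gaussian process) posterior variance at the query, plus a log-determinant term in the spirit of D-optimal design to reward diversity. The function names (`rbf_kernel`, `query_variance`, `greedy_select`), the RBF kernel choice, and the parameters `lam`, `beta`, and `gamma` are all hypothetical and not taken from the paper.

```python
import numpy as np


def rbf_kernel(A, B, gamma=0.5):
    """RBF kernel matrix between rows of A (n, d) and rows of B (m, d)."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def query_variance(K_SS, k_Sq, k_qq, lam):
    """Posterior variance at the query given the selected set S
    (assumed stand-in for the query-specific prediction error)."""
    if K_SS.shape[0] == 0:
        return k_qq
    A = K_SS + lam * np.eye(K_SS.shape[0])
    return k_qq - k_Sq @ np.linalg.solve(A, k_Sq)


def greedy_select(bank, query, k, lam=0.1, beta=0.1, gamma=0.5):
    """Greedily pick k exemplar indices from the bank for one query.

    Score (an assumption, not the paper's objective):
        -variance_at_query(S) + beta * logdet(I + K_SS / lam)
    """
    K = rbf_kernel(bank, bank, gamma)
    kq = rbf_kernel(bank, query[None, :], gamma)[:, 0]
    k_qq = 1.0  # an RBF kernel evaluates to 1 at zero distance
    selected = []
    for _ in range(k):
        best, best_score = None, -np.inf
        for i in range(len(bank)):
            if i in selected:
                continue
            S = selected + [i]
            K_SS = K[np.ix_(S, S)]
            var = query_variance(K_SS, kq[S], k_qq, lam)
            # Log-determinant diversity term (D-optimal-design flavor).
            _, logdet = np.linalg.slogdet(np.eye(len(S)) + K_SS / lam)
            score = -var + beta * logdet
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    return selected


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    bank = rng.normal(size=(30, 5))   # toy embeddings of the example bank
    query = rng.normal(size=5)        # toy embedding of the user query
    print(greedy_select(bank, query, k=4))
```

Under this assumed score, the first pick coincides with the nearest neighbor of the query, since the diversity term is identical for every single-element set. Later picks trade relevance to the query against redundancy with the examples already chosen, which is where the sketch departs from pure nearest-neighbor retrieval.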

Cite this study

Singh et al. (2025) studied this question.

www.synapsesocial.com/papers/68de5da783cbc991d0a20cc6 — DOI: https://doi.org/10.48550/arxiv.2509.15676

Also consider

Synapse has enriched 5 closely related papers on similar questions. Consider them for comparative context:

"In-Context Learning" or: How I learned to stop worrying and love "Applied Information Retrieval" · 2024 · 11 citations
Large Language Models Know What Makes Exemplary Contexts · 2024
Implicit In-context Learning · 2024 · 1 citation
In-Context Learning Demonstration Selection via Influence Analysis · 2024 · 3 citations
The Role of Diversity in In-Context Learning for Large Language Models · 2025

Authors

Vaibhav Singh

Soumya Suvra Ghosal

Kapu Nirmal Joshua
