April 26, 2026 | Open Access

Network Edge Inference for Large Language Models: Principles, Techniques, and Opportunities

Key points

This survey addresses the challenges of performing inference for large language models (LLMs) at the network edge.
It surveys recent advances in LLM edge inference.
It reviews system architectures, model optimization techniques, and resource management strategies.
It synthesizes state-of-the-art techniques and identifies future research directions.
It highlights the significant challenges that memory and compute demands pose for LLMs in edge environments.
It gives an overview of optimization strategies that enable more efficient LLM deployment.
It maps future directions that could make LLMs more feasible in resource-constrained settings.

Abstract

Large language models (LLMs) have advanced rapidly, emerging as versatile tools across fields thanks to their exceptional language understanding, generation, and reasoning capabilities. However, performing LLM inference at the network edge remains challenging due to their large memory and compute demands. This survey outlines the challenges specific to LLM edge inference and provides a comprehensive overview of recent progress, covering system architectures, model optimization and deployment, and resource management and scheduling. By synthesizing state-of-the-art techniques and mapping future directions, this survey aims to unlock the potential of LLMs in resource-constrained edge environments.

Authors

Zhixiong Chen

Bingjie Zhu

Jiangzhou Wang

Journals

ACM Computing Surveys

Institutions

Nanyang Technological University

Queen Mary University of London

Kyung Hee University
