What question did this study set out to answer?

The review aims to survey current large language models and their architectures, including methods of fine-tuning and applications.

April 22, 2026 | Open Access

Large Language Models: Architectures, Fine-Tuning, and Retrieval-Augmented Generation — A Comprehensive Review

Key Points

Surveys transformer-based architectures, including BERT, GPT, and T5.
Discusses fine-tuning techniques and retrieval-augmented generation (RAG) strategies.
Outlines open challenges and future research directions.
Highlights advances in model architectures and pre-training techniques.
Identifies key challenges such as hallucination and computational cost.
Examines the implications of alignment techniques and parameter-efficient fine-tuning.

Abstract

Large Language Models (LLMs) have emerged as the dominant paradigm for natural language understanding and generation, progressing from encoder-only and encoder–decoder transformers to frontier decoder-only models with tens to hundreds of billions of parameters. This review presents a comprehensive survey of modern LLMs, organised around five interrelated themes: (i) transformer-based architectures and major model families including BERT, GPT, T5, LLaMA, Mistral, Claude, and Gemini; (ii) pre-training paradigms, scaling laws, and data curation practices; (iii) fine-tuning strategies with particular emphasis on parameter-efficient methods such as LoRA, QLoRA, adapters, and prefix tuning; (iv) retrieval-augmented generation (RAG) pipelines that ground LLM outputs in external knowledge; and (v) alignment techniques including supervised fine-tuning, reinforcement learning from human feedback (RLHF), and direct preference optimisation (DPO). We additionally cover inference-time efficiency, evaluation benchmarks, and real-world applications. The review concludes with key challenges such as hallucination, safety, reasoning limits, and computational cost, and highlights future research directions including mixture-of-experts, long-context modelling, and multimodal extensions.

Authors

Allmin Fatima

Saima Aleem

Tasleem Jamal
