Autoencoder-based models have become a fundamental component of unsupervised and self-supervised learning in natural language processing (NLP), enabling models to learn compact latent representations through input reconstruction. From early denoising autoencoders to probabilistic variational autoencoders (VAEs) and transformer-based masked autoencoding, reconstruction-driven objectives have played a significant role in shaping modern approaches to text representation and generation. This review provides a comprehensive analysis of the evolution of autoencoder architectures and training objectives in NLP, and synthesizes applications of VAEs across language modeling, controllable text generation, machine translation, sentiment modeling, and multilingual representation learning. Although previous surveys have examined deep generative models or representation learning in NLP, a unified review that systematically connects classical autoencoder variants, variational formulations, and modern transformer-based masked autoencoders within a single conceptual framework is still lacking. To address this gap, this work consolidates architectural developments, training objectives, and major application domains under a reconstruction-based learning perspective, offering a structured comparison of modeling choices, datasets, and evaluation practices. Our analysis highlights the strengths and limitations of existing approaches, discusses the ongoing influence of autoencoder-style learning in NLP, and outlines future research directions focused on improving training stability, designing more structured latent spaces, and enhancing multilingual representation learning.
Moussa Redah
Wasfi G. Al-Khatib
Computers
King Fahd University of Petroleum and Minerals
Redah et al. studied this question.
www.synapsesocial.com/papers/69d895d86c1944d70ce07041 — DOI: https://doi.org/10.3390/computers15040232