What question did this study set out to answer?

The aim is to improve handwritten text recognition for the Hungarian language using advanced transformer models.

April 15, 2026

Open Access

Enhancing Transformer-Based Language Models for Hungarian Handwritten Text Recognition.

Key Points

The aim is to improve handwritten text recognition for the Hungarian language using advanced transformer models.
Developed a hybrid Hungarian and English model for OCR.
Utilized pre-trained models such as TrOCR, RoBERTa-base, and PULI-BERT for recognition tasks.
Fine-tuned models on human-annotated data with and without augmentation.
Leveraged synthetic data from a three-million-line text corpus for pre-training.
Achieved a character error rate (CER) of 3.681 with the TrOCR large-handwritten model.
Attained a word error rate (WER) of 16.091 using PULI-BERT with the DeiT model.
Fine-tuned models surpass current state-of-the-art results on Hungarian historical handwriting.

Abstract

Optical character recognition (OCR) research has yet to produce a multilingual model that covers the Hungarian language. We introduce a hybrid Hungarian and English model; one of the biggest challenges is recognizing handwritten text. In this research we investigate a set of models, including TrOCR large-handwritten and combinations of PULI-BERT and RoBERTa-base with DeiT. The digitization of documents, and the preservation of cultural heritage specifically, has long been a research problem related to text recognition. Our recognition approach builds on pre-trained vision and language transformer models. In the first phase, we pre-train both the large and base variants of TrOCR, proposed by Microsoft researchers; in the second phase, we fine-tune them on human-annotated data. We then combine other pre-trained transformers, using RoBERTa-base and PULI-BERT as decoders and DeiT, ViT, and BEiT as encoders, pre-train these combinations on generated synthetic data, and fine-tune them on a small amount of human-annotated data provided by DH-Lab researchers, both with and without augmentation. The models were developed using small-scale synthetic data generated from an open-source text corpus of around three million lines and subsequently refined on small human-labeled datasets. Experiments showed that the best CER, 3.681, was obtained with the TrOCR large-handwritten model, and the best WER, 16.091, with PULI-BERT combined with the DeiT model. These fine-tuned models outperform the existing state-of-the-art TrOCR models on historical Hungarian handwriting, according to benchmark results on the János Arany dataset.
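The abstract reports results as CER and WER without spelling out how they are computed. The following is a minimal sketch of the conventional definitions, not the authors' evaluation code: it assumes each rate is the Levenshtein edit distance between the reference and the recognized text, divided by the reference length (in characters for CER, in whitespace-separated words for WER), and expressed as a percentage. The function names `edit_distance`, `cer`, and `wer` are illustrative and do not come from the paper.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: insertions, deletions, substitutions cost 1 each."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference items yields an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate in percent (assumed definition, see lead-in)."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent, splitting on whitespace (an assumption)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


if __name__ == "__main__":
    # Hypothetical line pair: the recognizer misreads one accented character.
    reference = "árvíztűrő tükörfúrógép"
    hypothesis = "árvíztürő tükörfúrógép"
    print(f"CER: {cer(reference, hypothesis):.3f}")
    print(f"WER: {wer(reference, hypothesis):.3f}")
```

In the hypothetical pair above, a single misread character makes one of the two words wrong, so the WER is 50 while the CER stays far lower. This is why a model can lead on one metric and not the other, as with the TrOCR and PULI-BERT results reported here.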

Cite this study

Al-Hitawi et al. (Thu,) studied this question.

www.synapsesocial.com/papers/69df2ba0e4eeef8a2a6b090e — DOI: https://doi.org/10.12688/f1000research.176408.2

Authors

Mohammed A.S Al-Hitawi

Natabara Máté Gyöngyössy

Institutions

Eötvös Loránd University

University of Fallujah
