April 13, 2026 | Open Access

A 2.48% CER VLM-OCR Pipeline for a Heavily Accented 1544 Florentine Treatise

Key Points

The study evaluates a VLM-OCR pipeline on a 16th-century Florentine treatise with idiosyncratic, heavily accented typography.
The OCR engine is a LoRA fine-tune of LightOnOCR-2-1B.
Character error rate (CER) was measured on a 7-page hand-corrected gold standard.
The model was trained on 16 pages, and full-book inference ran on 151 pages.
The fine-tuned pipeline reached a CER of 2.48% on the gold standard.
For reference, a zero-shot Claude Sonnet 4.5 baseline reaches 8.12% CER and a first-pass Transkribus PyLaia custom HTR model reaches 15.66% CER on the same pages.
The fine-tuned model preserved 26% more accented characters than the Transkribus baseline corpus-wide.
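
For readers unfamiliar with the metric, CER is conventionally the character-level edit distance between the OCR output and the reference transcription, divided by the reference length. The sketch below is a minimal illustration of that convention, not the study's evaluation script. The function names and the choice to NFC-normalize both strings (so that a precomposed "à" and "a" plus a combining grave count as the same single character) are assumptions made here.

```python
import unicodedata


def levenshtein(ref: str, hyp: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from ref
                            curr[j - 1] + 1,      # insertion into ref
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance divided by reference length.

    Assumption: both strings are NFC-normalized first, so accented letters
    are compared as single code points regardless of how they were encoded.
    """
    ref = unicodedata.normalize("NFC", reference)
    hyp = unicodedata.normalize("NFC", hypothesis)
    if not ref:
        raise ValueError("reference must not be empty")
    return levenshtein(ref, hyp) / len(ref)
```

Under this definition, dropping a single accent in a five-character word (for example reading "città" as "citta") costs one substitution, which is why accent-heavy typography can move CER noticeably.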

Abstract

Pierfrancesco Giambullari’s Del sito, forma, [...] Total compute cost for all experiments, including full-book inference on 151 pages, was US$6.96. For context, a zero-shot Claude Sonnet 4.5 baseline reaches 8.12% CER on the same gold standard, and the author’s first-pass Transkribus PyLaia custom HTR model (20 manually annotated training pages) reaches 15.66% CER; both numbers are reported as reference points, not as a head-to-head ranking, and the Transkribus baseline would likely improve with more effort. The headline of this note is not that one OCR stack “wins”, but that off-the-shelf VLM-OCR with a small LoRA fine-tune, produced by a single person on an iPhone-scanned book in one day, reaches a CER suitable for critical-edition workflows on exactly the kind of idiosyncratic primary-source typography where zero-shot frontier models still degrade. A complementary finding concerns stress-accent coverage: the fine-tuned LightOn model preserves 26% more accented characters than the Transkribus baseline corpus-wide, a difference that matters for a Dortelata edition where stress accents are lexically load-bearing. We release this note ahead of the forthcoming critical edition to document the pipeline and the accent-coverage finding.
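
The accent-coverage comparison can be pictured as counting accented characters in each system's corpus-wide output and taking the relative difference. The sketch below is one plausible way to do that, not the author's procedure. Here an "accented character" is assumed to be any character whose Unicode canonical decomposition contains a combining mark, and `count_accented` and `relative_gain` are hypothetical helper names.

```python
import unicodedata


def count_accented(text: str) -> int:
    """Count characters that carry a combining mark after canonical decomposition.

    Assumption: this Unicode-based test stands in for "accented character";
    an edition-specific accent inventory might differ.
    """
    count = 0
    for ch in unicodedata.normalize("NFC", text):
        decomposed = unicodedata.normalize("NFD", ch)
        if any(unicodedata.combining(c) for c in decomposed):
            count += 1
    return count


def relative_gain(candidate_count: int, baseline_count: int) -> float:
    """Relative difference of candidate over baseline, e.g. 0.26 for 26% more."""
    if baseline_count == 0:
        raise ValueError("baseline count must be positive")
    return (candidate_count - baseline_count) / baseline_count
```

A raw count like this measures how many accents each system emits, not whether each one is correct, so it complements CER and does not replace it.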
