This letter proposes a Kuzushiji transcription framework that integrates optical character recognition (OCR) with automatic speech recognition (ASR) of read speech via hiragana-level fusion, without requiring additional model training. The framework uses the transcriber’s read speech as an additional modality to guide selection among beam-search OCR hypotheses. Each OCR candidate is scored by its phonetic similarity, at the hiragana-sequence level, to the ASR output for the corresponding read speech. Evaluation results show that the proposed framework reduces the character error rate compared with conventional OCR-only Kuzushiji transcription.
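The candidate-scoring idea can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the fusion weight, or how OCR hypotheses are converted to hiragana, so everything below is an assumption made for illustration: similarity is taken to be a normalized Levenshtein distance over hiragana strings, the fused score is a simple weighted sum, and the names `OcrCandidate`, `phonetic_similarity`, `rerank`, and `weight` are hypothetical. Each candidate is assumed to arrive with its hiragana reading already attached.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OcrCandidate:
    text: str        # OCR hypothesis from the beam
    hiragana: str    # hiragana reading of the hypothesis (assumed to be given)
    log_prob: float  # OCR beam-search score (assumed to be a log-probability)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phonetic_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1.0 means identical hiragana sequences."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def rerank(candidates: List[OcrCandidate], asr_hiragana: str,
           weight: float = 1.0) -> List[OcrCandidate]:
    """Sort candidates by OCR score plus weighted similarity to the ASR output."""
    def fused_score(c: OcrCandidate) -> float:
        return c.log_prob + weight * phonetic_similarity(c.hiragana, asr_hiragana)
    return sorted(candidates, key=fused_score, reverse=True)


# Toy example: the OCR-preferred candidate contains one misread character,
# and the ASR output of the read speech promotes the alternative.
beam = [
    OcrCandidate("いろはにはへと", "いろはにはへと", -1.00),
    OcrCandidate("いろはにほへと", "いろはにほへと", -1.05),
]
best = rerank(beam, asr_hiragana="いろはにほへと")[0]
```

Because the fusion here only reorders hypotheses that the OCR model has already produced, neither the OCR model nor the ASR model needs retraining, which is in line with the training-free claim above. The weighted sum is only one possible way to combine the two scores.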
Yutao Zhang
Shiori Totsuka
Yuting Geng
Nippon Onkyo Gakkaishi/Acoustical science and technology/Nihon Onkyo Gakkaishi
Ritsumeikan University
www.synapsesocial.com/papers/69d8930e6c1944d70ce04188 — DOI: https://doi.org/10.1250/ast.e25.110