May 14, 2026 · Open Access

Semantic convergence in culturally loaded text translation by Large Language Models: a cross-model empirical analysis of English translations of The Four Books

Abstract

This study takes The Four Books as its corpus and constructs a sentence-aligned parallel corpus from it. Using sentence embeddings and cosine similarity, it systematically analyzes the semantic convergence of English translations of culturally loaded texts generated by four Large Language Models (LLMs): ChatGPT-5, Google Translate, Deepseek-V3.2, and Ernie Bot-5. The findings reveal that: (1) the translations from different models exhibit a high degree of overall semantic consistency, with the average cosine similarity for each of twenty core Confucian concepts exceeding 0.73, indicating a significant trend of cross-model semantic convergence; (2) stability differs notably among concepts: those with clear referents and well-defined semantic boundaries are more stable, while abstract concepts with greater interpretive latitude diverge more; (3) the LLMs diverge systematically in strategy, with pairwise similarity distributions revealing differing orientations between cultural preservation and functional interpretation. Furthermore, the cosine similarity analysis shows that low-similarity outliers stem primarily from semantic divergence of polysemous words, differences in handling culture-specific items, divergent translation strategies, and local context misinterpretations, reflecting the mechanisms of semantic variation in specific contexts. Grounded in the internal structure of the text, this study proposes a multi-layered analytical framework for cross-model semantic convergence, "sentence-level alignment - vector computation - concept aggregation," providing methodological support for quantitative research on LLM translation of culturally loaded texts. It offers an empirical foundation for understanding the capability boundaries, strategic orientations, and potential risks of cultural meaning simplification in LLMs' cross-cultural semantic representations.
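The three-layer framework named in the abstract (sentence-level alignment, vector computation, concept aggregation) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the placeholder model and concept labels, and the toy three-dimensional vectors are all assumptions made for illustration. In practice the vectors would come from a sentence-embedding model, which the abstract does not name.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_similarities(embeddings_by_model):
    """For one aligned source sentence, compare every unordered pair of models.

    embeddings_by_model maps a model label to the embedding of that
    model's translation of the sentence.
    """
    return {
        (m1, m2): cosine_similarity(embeddings_by_model[m1], embeddings_by_model[m2])
        for m1, m2 in combinations(sorted(embeddings_by_model), 2)
    }


def concept_convergence(aligned_sentences):
    """Average all pairwise similarities over the sentences tagged with each concept.

    aligned_sentences is a list of (concept_label, embeddings_by_model) tuples.
    """
    scores = {}
    for concept, embeddings_by_model in aligned_sentences:
        pair_scores = pairwise_similarities(embeddings_by_model).values()
        scores.setdefault(concept, []).extend(pair_scores)
    return {concept: mean(values) for concept, values in scores.items()}


# Toy data: hypothetical labels and made-up 3-d vectors, not results from the paper.
toy_corpus = [
    ("concept_a", {"model_1": [0.9, 0.1, 0.2], "model_2": [0.8, 0.2, 0.2],
                   "model_3": [0.9, 0.0, 0.3], "model_4": [0.7, 0.2, 0.1]}),
    ("concept_b", {"model_1": [0.1, 0.9, 0.3], "model_2": [0.6, 0.4, 0.1],
                   "model_3": [0.2, 0.7, 0.6], "model_4": [0.5, 0.1, 0.8]}),
]

for concept, score in concept_convergence(toy_corpus).items():
    print(f"{concept}: mean pairwise cosine similarity = {score:.3f}")
```

With four models, each aligned sentence yields six model pairs. Sentences whose pairwise scores fall well below a concept's mean would be the low-similarity outliers the abstract describes, though the threshold used in the study is not stated here.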

Cite This Study

Wan et al. (2026) studied this question.

synapsesocial.com/papers/6a1eaf47989adebfe89a72c8 https://doi.org/10.3389/fpsyg.2026.1829488
