April 7, 2026 | Open Access

TCMI-F-6D benchmark construction and quantitative assessment of interdisciplinary foundational competencies in traditional Chinese medicine informatics using large language models

Abstract

Introduction: Traditional Chinese Medicine Informatics (TCMI) is an emerging interdisciplinary field whose talent cultivation and research practice place high demands on the assessment of foundational interdisciplinary competency. However, no suitable quantitative system tailored to the characteristics of TCMI currently exists for assessing Large Language Models (LLMs).

Methods: To address this gap, this study, grounded in Cognitive Hierarchy Theory and Disciplinary Knowledge Structure Theory, selected six core disciplines closely related to TCMI from the Massive Multitask Language Understanding (MMLU) dataset, constructed an evaluation framework for foundational interdisciplinary competency in TCMI-related scenarios, and established TCMI-F-6D (the TCMI-Foundation-6 Domain Benchmark) together with a composite metric system. Three experiments evaluated the models' baseline capability, learning gains, and performance stability. The experiments assessed 20 LLMs across 8 categories, and 6 models with weaker overall performance were selected for focused analysis of their interdisciplinary competency characteristics.

Results: Among the base models, ChatGLM3-6B performed best in interdisciplinary knowledge integration (43.97%), while among the chat models DeepSeek-V3.1 achieved the best overall application performance (80.87%). Qwen-14B-Chat also showed stable and predictable learning performance under varying example conditions, with an average learning gain of 5.60% (95% confidence interval [CI]: 5.50% to 5.70%).

Discussion: Collectively, this study clarifies the differences in foundational interdisciplinary competency among LLMs in this discipline and provides a quantifiable assessment framework, methodological support, and empirical evidence for TCMI education, research tool selection, and the implementation of a standardized interdisciplinary competency assessment system.
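As a rough illustration of how an average learning gain and its 95% CI, as reported in the Results, might be computed, the following Python sketch uses a normal-approximation interval around the sample mean. The abstract does not state which interval method the authors used, so the function name `mean_gain_with_ci`, the `z` parameter, and the sample values are all hypothetical and are not taken from the paper.

```python
import math
import statistics


def mean_gain_with_ci(gains, z=1.96):
    """Return (mean, lower, upper) for a list of learning gains.

    Assumption: a normal-approximation interval (mean +/- z * standard
    error); the paper's actual interval method is not given in the abstract.
    """
    if len(gains) < 2:
        raise ValueError("need at least two gain values")
    mean = statistics.fmean(gains)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(gains) / math.sqrt(len(gains))
    return mean, mean - z * sem, mean + z * sem


# Hypothetical per-condition gains in percentage points (not the paper's data).
example_gains = [5.4, 5.6, 5.8, 5.5, 5.7]
mean, lower, upper = mean_gain_with_ci(example_gains)
print(f"mean gain = {mean:.2f}%, 95% CI = [{lower:.2f}%, {upper:.2f}%]")
```

A narrow interval relative to the mean is one way to read "stable and predictable" gains; with few conditions, a t-based or bootstrap interval would be a more conservative choice than the normal approximation used here.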
