What question did this study set out to answer?

The study aims to assess how well open LLMs can identify taxonomic relationships between biomedical concepts using a standardized methodology.

April 1, 2026 · Open Access

Assessing Open LLMs’ Ability to Identify Biomedical Taxonomic Relationships: A SNOMED CT-Based Experimental Evaluation

Key points

The study builds a dataset to evaluate how well various open LLMs identify taxonomic relationships.
It uses SNOMED CT as the primary taxonomy source for the evaluation.
It investigates the factors that influence accuracy in identifying taxonomic relationships.
It applies chain-of-thought prompting to improve LLM performance.
Open LLMs often recognize taxonomic links based solely on their generic pre-training.
LLMs struggle with directional reasoning in challenging cases involving reversed relations.
Chain-of-thought prompting significantly improves performance in identifying hierarchical relationships.

Abstract

• This study evaluates how effectively open, general-purpose LLMs can determine whether two biomedical concepts are taxonomically related.
• Utilizing SNOMED CT as the primary taxonomy source, we present a procedure for creating a dataset specifically designed to test taxonomic relationship identification capabilities of various open LLMs.
• We investigate the factors that influence the accuracy of identification by our set of LLMs following a reproducible methodology.

Ontologies serve as semantic blueprints for knowledge management by capturing information in a coherent machine-processable format. They define concepts and relationships, commonly represented through knowledge graphs (KG) in which taxonomic, or “is-a”, relationships arrange concepts into hierarchical structures. Biomedical applications particularly benefit from these structured representations because of the domain’s inherent complexity and continual evolution. Structured representations of relationships support automated reasoning and inference, which are crucial for clinical decision-making, research hypothesis generation, and data integration tasks. Although a substantial portion of biomedical knowledge remains in natural language, Large Language Models (LLMs) offer new potential to automatically extract and interpret this information. Despite promising results in various natural language processing tasks, few studies have examined how effectively LLMs recognise taxonomic relationships.

This study evaluates the ability of general-purpose LLMs to reason about biomedical taxonomies by identifying hierarchical “is-a” relationships between concepts. To operationalise this evaluation, we use the SNOMED CT knowledge graph, one of the most comprehensive clinical terminologies, as a gold-standard reference for determining whether candidate concept pairs are taxonomically linked.

Overall, LLMs often succeed in recognising domain-specific taxonomic links based solely on their generic pre-training, yet they exhibit weaknesses in directional reasoning, particularly in challenging negative cases where true parent–child relations are intentionally reversed. Our findings reveal that chain-of-thought prompting significantly improves their performance in interpreting these relationships. Taken together, the results highlight both the benefits of chain-of-thought prompting for hierarchical judgments and the practical feasibility of integrating LLMs into algorithmic knowledge-graph workflows that require structured, machine-interpretable outputs.
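
The evaluation setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy hierarchy `IS_A`, the pair categories, the prompt wording in `cot_prompt`, and the `ANSWER:` output convention are all assumptions made for illustration, and the concept names are not taken from a SNOMED CT release. The sketch builds positive pairs, reversed (hard negative) pairs, and unrelated (easy negative) pairs from "is-a" edges, then scores a predictor per category. The `direction_blind` function is a stand-in for a model that notices two concepts are linked but ignores direction.

```python
import random

# Hypothetical toy "is-a" edges as (child, parent); NOT real SNOMED CT content.
IS_A = [
    ("viral pneumonia", "pneumonia"),
    ("pneumonia", "lung disease"),
    ("lung disease", "disease"),
    ("asthma", "lung disease"),
    ("fracture of femur", "fracture of bone"),
    ("fracture of bone", "disease"),
]


def parent_map(edges):
    """Map each child concept to the set of its direct parents."""
    parents = {}
    for child, parent in edges:
        parents.setdefault(child, set()).add(parent)
    return parents


def ancestors(concept, parents):
    """Return all transitive ancestors of a concept (assumes an acyclic hierarchy)."""
    seen, stack = set(), [concept]
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def build_pairs(edges, seed=0):
    """Create positive, reversed, and unrelated test pairs from is-a edges."""
    parents = parent_map(edges)
    concepts = sorted({c for edge in edges for c in edge})
    rng = random.Random(seed)
    pairs = []
    for child, parent in edges:
        pairs.append({"child": child, "parent": parent, "label": True, "kind": "positive"})
        # Hard negative: the same two concepts with the direction swapped.
        pairs.append({"child": parent, "parent": child, "label": False, "kind": "reversed"})
        # Easy negative: a concept that is neither an ancestor nor a descendant.
        unrelated = [
            c for c in concepts
            if c != child
            and c not in ancestors(child, parents)
            and child not in ancestors(c, parents)
        ]
        if unrelated:
            pairs.append({"child": child, "parent": rng.choice(unrelated),
                          "label": False, "kind": "unrelated"})
    return pairs


def cot_prompt(child, parent):
    """Hypothetical chain-of-thought prompt with a machine-readable final line."""
    return (f'Is "{child}" a kind of "{parent}"? Reason step by step, '
            "then finish with a line 'ANSWER: yes' or 'ANSWER: no'.")


def parse_answer(reply):
    """Read the final 'ANSWER:' line; return True, False, or None if absent."""
    for line in reversed(reply.strip().splitlines()):
        if line.strip().upper().startswith("ANSWER:"):
            return line.split(":", 1)[1].strip().lower().startswith("yes")
    return None


def accuracy_by_kind(pairs, predict):
    """Accuracy of predict(child, parent) -> bool, split by pair category."""
    totals, correct = {}, {}
    for p in pairs:
        kind = p["kind"]
        totals[kind] = totals.get(kind, 0) + 1
        correct[kind] = correct.get(kind, 0) + (predict(p["child"], p["parent"]) == p["label"])
    return {kind: correct[kind] / totals[kind] for kind in totals}


PARENTS = parent_map(IS_A)


def direction_blind(child, parent):
    """Stand-in predictor: says yes if the concepts are linked in either direction."""
    return parent in ancestors(child, PARENTS) or child in ancestors(parent, PARENTS)
```

Calling `accuracy_by_kind(build_pairs(IS_A), direction_blind)` on this toy data gives perfect accuracy on the positive and unrelated pairs and zero on the reversed pairs, which is why reporting accuracy per category, rather than one pooled figure, makes a directional weakness visible. In a real run, `predict` would wrap a model call that sends `cot_prompt(child, parent)` and passes the reply through `parse_answer`.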
