March 3, 2026 · Open Access

Semantic-Driven and Rule-Based Noise Detection and Refinement in Knowledge Graphs using Large Language Models

Key Points

Refinement using LLM_sim and LLM_rule significantly boosts knowledge graph completion performance.
LLM_sim achieves high accuracy in determining semantic plausibility of triplets, improving data integrity.
Application of LLM_rule successfully identifies and corrects erroneous triplets via induced semantic rules.
Both methods demonstrate strong performance on artificial and authentic noisy knowledge graphs, validating their effectiveness.

Abstract

A knowledge graph (KG) represents structured information about the real world as a collection of triplets, each consisting of a head entity, a relation, and a tail entity. Whether KGs are constructed automatically from text or curated manually, they often suffer from misinformation, incompleteness, and noise. This study aims to improve the quality of a noisy KG through a two-step process: (1) detecting noisy (erroneous) triplets and (2) refining these triplets by correcting the wrong entities. Leveraging recent advances in large language models (LLMs), we propose two methods for noise detection and refinement: LLM_sim and LLM_rule. LLM_sim uses LLMs to assess the semantic plausibility of a triplet by measuring its similarity to existing triplets. LLM_rule automatically induces rules from an existing KG. These rules impose semantic constraints on the head and tail entities of a given relation and are used to identify noisy triplets and correct them. Experiments on artificially constructed and authentic noisy KGs show that both LLM_sim and LLM_rule perform well at noise detection and refinement. Furthermore, we apply our methods to the downstream task of knowledge graph completion (KGC). The results demonstrate that refinement by LLM_sim and LLM_rule yields substantial improvements in KGC performance.
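To make the idea of rule-based detection concrete, the sketch below shows one simple way a per-relation constraint on head and tail entity types could flag a noisy triplet and narrow down candidate corrections. It is an illustration under stated assumptions, not the paper's method: the abstract says LLM_rule induces rules with an LLM, whereas this sketch substitutes plain frequency counting, and the entity types, triplets, and function names (`induce_rules`, `detect_noise`, `candidate_tails`) are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy data. Entity types are assumed to be known in advance;
# the abstract does not say how LLM_rule obtains or represents them.
ENTITY_TYPE = {
    "Paris": "City", "Tokyo": "City", "Berlin": "City",
    "France": "Country", "Japan": "Country", "Germany": "Country",
}

TRIPLETS = [
    ("Paris", "capital_of", "France"),
    ("Tokyo", "capital_of", "Japan"),
    ("Berlin", "capital_of", "Germany"),
    ("Berlin", "capital_of", "Tokyo"),  # injected noise: the tail is a City
]


def induce_rules(triplets, entity_type):
    """Keep the most frequent (head type, tail type) pair for each relation.

    Frequency counting is a stand-in for the LLM-based rule induction
    described in the abstract.
    """
    counts = defaultdict(Counter)
    for head, relation, tail in triplets:
        counts[relation][(entity_type[head], entity_type[tail])] += 1
    return {rel: c.most_common(1)[0][0] for rel, c in counts.items()}


def detect_noise(triplets, rules, entity_type):
    """Return the triplets whose entity types violate the relation's rule."""
    return [
        (head, relation, tail)
        for head, relation, tail in triplets
        if (entity_type[head], entity_type[tail]) != rules[relation]
    ]


def candidate_tails(triplet, rules, entity_type):
    """List entities whose type satisfies the tail constraint of the rule."""
    _, relation, _ = triplet
    _, tail_type = rules[relation]
    return sorted(e for e, t in entity_type.items() if t == tail_type)


rules = induce_rules(TRIPLETS, ENTITY_TYPE)
noisy = detect_noise(TRIPLETS, rules, ENTITY_TYPE)
for triplet in noisy:
    print(triplet, "-> candidate tails:", candidate_tails(triplet, rules, ENTITY_TYPE))
```

The constraint only narrows the set of type-compatible replacements; choosing the correct entity among them needs a further scoring step, which this sketch leaves out.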
