A knowledge graph (KG) represents structured information about the real world as a collection of triplets, each consisting of a head entity, a relation, and a tail entity. Whether KGs are automatically constructed from text or manually curated, they often suffer from misinformation, incompleteness, and noise. This study aims to improve the quality of a noisy KG through a two-step process: (1) detecting noisy (erroneous) triplets and (2) refining these triplets by correcting the wrong entities. Leveraging recent advances in large language models (LLMs), we propose two methods for noise detection and refinement: LLM_sim and LLM_rule. LLM_sim uses LLMs to assess the semantic plausibility of a triplet by measuring its similarity to existing triplets. LLM_rule automatically induces rules from an existing KG. These rules impose semantic constraints on the head and tail entities of a given relation, and are used to identify noisy triplets and correct them. Experiments on both artificially constructed and authentic noisy KGs show that LLM_sim and LLM_rule perform well in both noise detection and refinement. Furthermore, we apply our methods to the downstream task of Knowledge Graph Completion (KGC). The results demonstrate that refinement by LLM_sim and LLM_rule yields substantial improvements in KGC performance.
Dong et al. studied this question.
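To make the idea of rule-based noise detection concrete, the following is a minimal sketch and not the method proposed in this work. It assumes that each entity carries a known type label and replaces LLM-driven rule induction with simple frequency counting: for each relation, the (head type, tail type) pairs that cover a sufficient share of its triplets are kept as rules, and any triplet violating them is flagged as noisy. The toy data, the `ENTITY_TYPE` mapping, the function names, and the `min_support` threshold are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: entity types are assumed to be given in advance.
ENTITY_TYPE = {
    "Paris": "city", "Tokyo": "city", "Berlin": "city",
    "France": "country", "Japan": "country", "Germany": "country",
}

TRIPLETS = [
    ("Paris", "capital_of", "France"),
    ("Tokyo", "capital_of", "Japan"),
    ("Berlin", "capital_of", "Germany"),
    ("France", "capital_of", "Paris"),  # injected noise: head and tail swapped
]


def induce_rules(triplets, entity_type, min_support=0.5):
    """Keep, per relation, the (head type, tail type) pairs whose share
    of that relation's triplets is at least min_support.

    Assumption: frequency counting stands in for LLM-based rule induction.
    """
    counts = defaultdict(Counter)
    for head, relation, tail in triplets:
        counts[relation][(entity_type[head], entity_type[tail])] += 1

    rules = {}
    for relation, pair_counts in counts.items():
        total = sum(pair_counts.values())
        rules[relation] = {
            pair for pair, n in pair_counts.items() if n / total >= min_support
        }
    return rules


def detect_noise(triplets, entity_type, rules):
    """Return the triplets whose type pair is not allowed for their relation."""
    return [
        (head, relation, tail)
        for head, relation, tail in triplets
        if (entity_type[head], entity_type[tail]) not in rules.get(relation, set())
    ]


rules = induce_rules(TRIPLETS, ENTITY_TYPE)
noisy = detect_noise(TRIPLETS, ENTITY_TYPE, rules)
print(rules)
print(noisy)
```

The sketch covers only the detection step. Refinement, that is, replacing the wrong entity with one that satisfies the constraints, would need a way to rank candidate entities, which is left out here.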