Word sense induction (WSI) aims to automatically discover the senses of a word from its contextual usage, without relying on a predefined sense inventory. Existing distributional clustering methods, however, often suffer from dominant-sense bias and struggle to identify minority senses correctly. In this paper, we propose a definition-anchored reclassification framework for WSI that uses large language models (LLMs) to generate explicit sense descriptions and refine cluster assignments. Unlike purely distributional approaches, our method introduces these LLM-generated definitions into the induction process as semantic anchors. Because it shifts the decision process from geometric clustering to definition-based semantic matching, the method improves instance-level alignment while introducing a trade-off with global structural consistency. Experiments on the SemEval-2010 and SemEval-2013 datasets demonstrate that the proposed method consistently outperforms traditional clustering baselines and existing WSI systems on both structural metrics (NMI and V-measure) and instance-level metrics (F-B3 and Fuzzy-F-B3). In particular, our approach mitigates dominant-sense bias and improves the recovery of minority senses by preserving them as distinct clusters while correctly assigning their instances. These results suggest that explicit semantic representations generated by LLMs are a promising direction for addressing long-standing challenges in unsupervised word sense induction.
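The shift from geometric clustering to definition-based matching can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how instances and definitions are compared, so the sketch assumes that both are already embedded as vectors and that cosine similarity is the matching score. The function names (`cosine`, `reclassify`), the example word "bank", and the toy 2-D vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def reclassify(instance_vecs, definition_vecs):
    """Assign each instance to the sense whose definition vector is most similar.

    Assumption: similarity to a definition embedding stands in for the
    definition-based semantic matching described in the abstract.
    """
    labels = []
    for vec in instance_vecs:
        scores = {sense: cosine(vec, dvec) for sense, dvec in definition_vecs.items()}
        labels.append(max(scores, key=scores.get))
    return labels


# Hypothetical toy data: 2-D vectors standing in for real embeddings of
# usage contexts and of LLM-written sense definitions.
definition_vecs = {"bank/finance": [1.0, 0.0], "bank/river": [0.0, 1.0]}
instance_vecs = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.9]]

labels = reclassify(instance_vecs, definition_vecs)
print(labels)
```

In this sketch each instance is labeled independently of its original cluster, so a minority sense keeps its own label as long as a definition exists for it, which is one plausible reading of how definition anchors could counter dominant-sense bias.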
Shota Yoshikawa
Minoru Sasaki
Applied Sciences
Ibaraki University
www.synapsesocial.com/papers/69df2b04e4eeef8a2a6b0020 — DOI: https://doi.org/10.3390/app16083797