Los puntos clave no están disponibles para este artículo en este momento.
We describe and evaluate experimentally a method for clustering words according to their distribution in particular syntactic contexts. Words are represented by the relative frequency distributions of contexts in which they appear, and relative entropy between those distributions is used as the similarity measure for clustering. Clusters are represented by average context distributions derived from the given words according to their probabilities of cluster membership. In many cases, the clusters can be thought of as encoding coarse sense distinctions. Deterministic annealing is used to find lowest distortion sets of clusters: as the annealing parameter increases, existing clusters become unstable and subdivide, yielding a hierarchical "soft" clustering of the data. Clusters are used as the basis for class models of word coocurrence, and the models evaluated with respect to held-out test data.
Building similarity graph...
Analyzing shared references across papers
Loading...
Pereira et al. (Fri,) studied this question.
www.synapsesocial.com/papers/6a08e977afc616802fe4b7ce — DOI: https://doi.org/10.3115/981574.981598
Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context:
Fernando Pereira
Naftali Tishby
Lillian Lee
Cornell University
Hebrew University of Jerusalem
AT&T (United States)
Building similarity graph...
Analyzing shared references across papers
Loading...