Improved Estimation of Entropy for Evaluation of Word Sense Induction

Published:2014-09 Issue:3 Volume:40 Page:671-685
ISSN:0891-2017
Container-title:Computational Linguistics
language:en

Author:

Li Linlin¹, Titov Ivan², Sporleder Caroline³

Affiliation:

1. Microsoft Development Center Norway

2. University of Amsterdam

3. Trier University

Abstract

Information-theoretic measures are among the most standard techniques for evaluating clustering methods, including word sense induction (WSI) systems. Such measures rely on sample-based estimates of entropy. However, the standard maximum likelihood estimates of entropy are heavily biased, and the bias depends on, among other things, the number of clusters and the sample size. This makes the measures unreliable and unfair when the number of clusters produced by different systems varies and the sample size is not exceedingly large. This corresponds exactly to the setting of WSI evaluation, where a ground-truth number of sense clusters arguably does not exist and the standard evaluation scenarios use a small number of instances of each word to compute the score. We describe more accurate entropy estimators and analyze their performance both in simulations and in the evaluation of WSI systems.
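
The bias described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes a uniform distribution over k clusters, measures entropy in nats, and uses the well-known Miller-Madow correction only as a simple stand-in for a "more accurate" estimator (the abstract does not say which estimators the authors use). The function names `ml_entropy`, `miller_madow_entropy`, and `mean_estimate` are illustrative.

```python
import math
import random
from collections import Counter


def ml_entropy(labels):
    """Plug-in (maximum likelihood) entropy estimate, in nats."""
    n = len(labels)
    counts = Counter(labels).values()
    return -sum((c / n) * math.log(c / n) for c in counts)


def miller_madow_entropy(labels):
    """ML estimate plus the Miller-Madow term (m - 1) / (2n),
    where m is the number of distinct labels observed in the sample."""
    n = len(labels)
    m = len(set(labels))
    return ml_entropy(labels) + (m - 1) / (2 * n)


def mean_estimate(estimator, k, n, trials=500, seed=0):
    """Average an estimator over repeated samples of size n drawn
    uniformly from k clusters (an assumption made for this sketch)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample = [rng.randrange(k) for _ in range(n)]
        total += estimator(sample)
    return total / trials


if __name__ == "__main__":
    n = 50  # small sample, as in the evaluation setting the abstract describes
    for k in (5, 10, 20):
        true_h = math.log(k)  # entropy of a uniform distribution over k clusters
        ml = mean_estimate(ml_entropy, k, n)
        mm = mean_estimate(miller_madow_entropy, k, n)
        print(f"k={k:2d}  true={true_h:.3f}  ML={ml:.3f}  Miller-Madow={mm:.3f}")
```

With the sample size held fixed, the average ML estimate falls further below the true entropy as k grows, which is why systems that produce different numbers of clusters are not scored on equal terms.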

Publisher

MIT Press - Journals

Subject

Artificial Intelligence, Computer Science Applications, Linguistics and Language, Language and Linguistics

Link

https://www.mitpressjournals.org/doi/pdf/10.1162/COLI_a_00196

References (27 articles)

1. Evaluating and optimizing the parameters of an unsupervised graph-based WSD algorithm

2. Semeval-2007 task 02

3. A comparison of extrinsic clustering evaluation metrics based on formal constraints

4. Convergence properties of functional estimates for discrete distributions

Cited by 1 article.

1. A Comparative Analysis of Discrete Entropy Estimators for Large-Alphabet Problems;Entropy;2024-04-28