Understanding the Yarowsky Algorithm

Published: 2004-09 Issue: 3 Volume: 30 Pages: 365-395
ISSN: 0891-2017
Container-title: Computational Linguistics
Language: English

Author:

Abney Steven¹

Affiliation:

1. University of Michigan, 4080 Frieze Bldg., 105 S. State Street, Ann Arbor, MI 48109-1285.

Abstract

Many problems in computational linguistics are well suited for bootstrapping (semisupervised learning) techniques. The Yarowsky algorithm is a well-known bootstrapping algorithm, but it is not mathematically well understood. This article analyzes it as optimizing an objective function. More specifically, a number of variants of the Yarowsky algorithm (though not the original algorithm itself) are shown to optimize either likelihood or a closely related objective function K.
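For readers unfamiliar with bootstrapping, the loop below is a minimal, generic self-training sketch over a decision list. It is an illustration only: it is neither Yarowsky's original algorithm nor any of the variants analyzed in the article, and every name in it (`train_decision_list`, `apply_decision_list`, `bootstrap`, the `threshold` and `smoothing` parameters, the toy data) is hypothetical.

```python
from collections import defaultdict

def train_decision_list(examples, labels, smoothing=0.1, num_labels=2):
    """Score each feature by the smoothed precision of its majority label."""
    counts = defaultdict(lambda: defaultdict(float))
    for feats, y in zip(examples, labels):
        if y is None:          # skip examples that are still unlabeled
            continue
        for f in feats:
            counts[f][y] += 1.0
    rules = {}
    for f, by_label in counts.items():
        total = sum(by_label.values()) + smoothing * num_labels
        best = max(by_label, key=by_label.get)
        rules[f] = (best, (by_label[best] + smoothing) / total)
    return rules

def apply_decision_list(rules, feats, threshold):
    """Return the label of the most confident matching rule, or None."""
    best_label, best_score = None, threshold
    for f in feats:
        if f in rules:
            label, score = rules[f]
            if score > best_score:
                best_label, best_score = label, score
    return best_label

def bootstrap(examples, seed_rules, threshold=0.7, max_iters=10):
    """Alternate between labeling the data and retraining the rules."""
    rules = dict(seed_rules)
    labels = [None] * len(examples)
    for _ in range(max_iters):
        new_labels = [apply_decision_list(rules, x, threshold) for x in examples]
        if new_labels == labels:   # labeling stopped changing
            break
        labels = new_labels
        rules = train_decision_list(examples, labels)
    return rules, labels

# Toy data (assumed for illustration): contexts of the ambiguous word "plant".
examples = [
    {"plant", "factory"}, {"plant", "factory", "worker"}, {"plant", "worker"},
    {"plant", "leaf"}, {"plant", "leaf", "garden"}, {"plant", "garden"},
]
seed_rules = {"factory": ("industrial", 1.0), "leaf": ("living", 1.0)}
rules, labels = bootstrap(examples, seed_rules)
print(labels)
```

Starting from two seed rules, the loop picks up `worker` and `garden` as new rules and uses them to label the examples the seeds did not cover. The sketch simply stops when the labeling stops changing; it does not compute or optimize any objective function of the kind the article analyzes.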

Publisher

MIT Press - Journals

Subject

Artificial Intelligence, Computer Science Applications, Linguistics and Language, Language and Linguistics

Link

https://www.mitpressjournals.org/doi/pdf/10.1162/0891201041850876

References (5 articles)

1. Abney, Steven. 2002. Bootstrapping. In Proceedings of 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, pages 360-367.

2. Blum, Avrim and Tom Mitchell. 1998. Combining labeled and unlabeled data with co-training. In Proceedings of the 11th Annual Conference on Computational Learning Theory (COLT), pages 92-100. Morgan Kaufmann, San Francisco.

3. Collins, Michael and Yoram Singer. 1999. Unsupervised models for named entity classification. In Proceedings of Empirical Methods in Natural Language Processing (EMNLP), College Park, MD, pages 100-110.

4. Dasgupta, Sanjoy, Michael Littman, and David McAllester. 2001. PAC generalization bounds for co-training. In Proceedings of Advances in Neural Information Processing Systems 14 (NIPS), Vancouver, British Columbia, Canada.

5. Yarowsky, David. 1995. Unsupervised word sense disambiguation rivaling supervised methods. In Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics, Cambridge, MA, pages 189-196.

Cited by 50 articles.

1. PSNEA: Pseudo-Siamese Network for Entity Alignment between Multi-modal Knowledge Graphs;Proceedings of the 31st ACM International Conference on Multimedia;2023-10-26

2. Analysis of cardiac single-cell RNA-sequencing data can be improved by the use of artificial-intelligence-based tools;Scientific Reports;2023-04-26

3. Intelligent fault identification strategy of photovoltaic array based on ensemble self-training learning;Solar Energy;2023-01

4. Machine learning analysis to classify nanoparticles from noisy spICP-TOFMS data;Journal of Analytical Atomic Spectrometry;2023

5. Semi-supervised learning for quality control of high-value wood products;Wood Science and Technology;2022-09