Clustering and Diversifying Web Search Results with Graph-Based Word Sense Induction

Published:2013-09 Issue:3 Volume:39 Page:709-754
ISSN:0891-2017
Container-title:Computational Linguistics
language:en
Short-container-title:Computational Linguistics

Author:

Di Marco Antonio¹, Navigli Roberto¹

Affiliation:

1. Sapienza University of Rome

Abstract

Web search result clustering aims to facilitate information search on the Web. Rather than the results of a query being presented as a flat list, they are grouped on the basis of their similarity and subsequently shown to the user as a list of clusters. Each cluster is intended to represent a different meaning of the input query, thus taking into account the lexical ambiguity (i.e., polysemy) issue. Existing Web clustering methods typically rely on some shallow notion of textual similarity between search result snippets, however. As a result, text snippets with no word in common tend to be clustered separately even if they share the same meaning, whereas snippets with words in common may be grouped together even if they refer to different meanings of the input query.

In this article we present a novel approach to Web search result clustering based on the automatic discovery of word senses from raw text, a task referred to as Word Sense Induction. Key to our approach is to first acquire the various senses (i.e., meanings) of an ambiguous query and then cluster the search results based on their semantic similarity to the word senses induced. Our experiments, conducted on data sets of ambiguous queries, show that our approach outperforms both Web clustering and search engines.
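The second step described in the abstract, grouping search results by their similarity to already-induced senses, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the hand-written sense word sets (standing in for the output of a Word Sense Induction step), the sample snippets, and the plain word-overlap score. It is not the article's graph-based method or its similarity measure.

```python
# Illustrative sketch only: senses are hard-coded stand-ins for induced senses,
# and similarity is simple word overlap (an assumption, not the paper's measure).

def tokenize(text):
    """Lowercase a snippet and strip surrounding punctuation from each word."""
    stripped = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return [w for w in stripped if w]

def assign_to_senses(snippets, senses):
    """Put each snippet in the cluster of the sense it shares the most words with.

    Snippets that share no word with any sense go to the "other" cluster.
    """
    clusters = {name: [] for name in senses}
    clusters["other"] = []
    for snippet in snippets:
        words = set(tokenize(snippet))
        best, best_score = "other", 0
        for name, sense_words in senses.items():
            score = len(words & sense_words)
            if score > best_score:
                best, best_score = name, score
        clusters[best].append(snippet)
    return clusters

# Hypothetical senses for the ambiguous query "jaguar".
senses = {
    "animal": {"cat", "wild", "species", "prey"},
    "car": {"car", "engine", "luxury", "vehicle"},
}
snippets = [
    "The jaguar is a wild cat species.",
    "Jaguar unveils a new luxury car.",
    "Jaguar news and updates",
]
clusters = assign_to_senses(snippets, senses)
```

In this toy setting the first snippet lands in the "animal" cluster, the second in "car", and the third, which matches neither sense, in "other". A real system would replace the hard-coded sense sets with senses induced from raw text.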

Publisher

MIT Press - Journals

Subject

Artificial Intelligence, Computer Science Applications, Linguistics and Language, Language and Linguistics

Link

https://www.mitpressjournals.org/doi/pdf/10.1162/COLI_a_00148

References: 106 articles.

1. Evaluating and optimizing the parameters of an unsupervised graph-based WSD algorithm

2. Two graph-based algorithms for state-of-the-art WSD

3. UBC-AS

4. Diversifying search results

5. Distributional Memory: A General Framework for Corpus-Based Semantics

Cited by 79 articles.

1. A comprehensive survey on community detection methods and applications in complex information networks;Social Network Analysis and Mining;2024-04-18

2. Search Result Presentation for Non-Native Language Documents;Companion Proceedings of the 29th International Conference on Intelligent User Interfaces;2024-03-18

3. Reflective action selection based on positive-unlabeled learning and causality detection model;Computer Speech & Language;2023-03

4. DATA CLUSTERING BASED ON INDUCTIVE LEARNING OF NEURO-FUZZY NETWORK WITH DISTANCE HASHING;Radio Electronics, Computer Science, Control;2022-12-09

5. Word Sense Disambiguation Using Clustered Sense Labels;Applied Sciences;2022-02-11