Discovering tasks from search engine query logs

Published:2013-07 Issue:3 Volume:31 Page:1-43
ISSN:1046-8188
Container-title:ACM Transactions on Information Systems
language:en
Short-container-title:ACM Trans. Inf. Syst.

Author:

Lucchese Claudio¹, Orlando Salvatore², Perego Raffaele¹, Silvestri Fabrizio¹, Tolomei Gabriele²

Affiliation:

1. ISTI-CNR, Pisa, Italy

2. Università Ca' Foscari Venezia, Italy

Abstract

Although Web search engines still answer user queries with lists of ten blue links to webpages, people are increasingly issuing queries to accomplish their daily tasks (e.g., finding a recipe, booking a flight, or reading online news). In this work, we propose a two-step methodology for discovering tasks that users try to perform through search engines. First, we identify user tasks from individual user sessions stored in search engine query logs. In our vision, a user task is a set of possibly noncontiguous queries (within a user search session) that refer to the same need. Second, we discover collective tasks by aggregating similar user tasks, possibly performed by distinct users. To discover user tasks, we propose query similarity functions based on unsupervised and supervised learning approaches. We present a set of query clustering methods that exploit these functions in order to detect user tasks. All the proposed solutions were evaluated on a manually built ground truth, and two of them performed better than state-of-the-art approaches. To detect collective tasks, we propose four methods that cluster previously discovered user tasks, each of which is represented by the bag-of-words extracted from its constituent queries. These solutions were also evaluated on another manually built ground truth.
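
The first step described in the abstract (grouping possibly noncontiguous queries of one session into user tasks) can be illustrated with a minimal sketch. This record does not specify the paper's similarity functions or clustering methods, so everything below is an assumption chosen for illustration: the character-trigram Jaccard similarity, the 0.3 threshold, the connected-components grouping, and the names `trigrams`, `jaccard`, and `cluster_session` are all hypothetical stand-ins, not the authors' method.

```python
from itertools import combinations


def trigrams(query):
    """Return the set of character 3-grams of a whitespace-normalized, lowercased query."""
    text = " ".join(query.lower().split())
    if len(text) < 3:
        return {text}
    return {text[i:i + 3] for i in range(len(text) - 2)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_session(queries, threshold=0.3):
    """Group the queries of one session into tasks.

    Assumption: two queries belong to the same task when their trigram
    Jaccard similarity reaches `threshold`; tasks are the connected
    components of the resulting graph, found with a small union-find.
    """
    parent = list(range(len(queries)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    grams = [trigrams(q) for q in queries]
    for i, j in combinations(range(len(queries)), 2):
        if jaccard(grams[i], grams[j]) >= threshold:
            parent[find(j)] = find(i)

    # Collect queries by root, preserving their order within the session.
    tasks = {}
    for i, q in enumerate(queries):
        tasks.setdefault(find(i), []).append(q)
    return list(tasks.values())


# Hypothetical session in which two tasks are interleaved.
session = [
    "cheap flights rome",
    "lasagna recipe",
    "flights to rome",
    "easy lasagna recipe",
]
for task in cluster_session(session):
    print(task)
```

On this made-up session the two flight queries and the two recipe queries end up in separate groups even though they are interleaved, which is the "possibly noncontiguous" property the abstract emphasizes. A purely lexical similarity like this one would miss queries that share a need but no wording, which is presumably why the abstract mentions both unsupervised and supervised similarity functions.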

Funder

European Commission

Ministero dell'Istruzione, dell'Università e della Ricerca

Seventh Framework Programme

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Science Applications; General Business, Management and Accounting; Information Systems

Link

https://dl.acm.org/doi/pdf/10.1145/2493175.2493179

Reference: 53 articles.

1. Design trade-offs for search engine caching

2. Baeza-Yates R. and Ribeiro-Neto B. 1999. Modern Information Retrieval. Addison-Wesley Longman Publishing Co., Inc., Boston, MA.

3. Agglomerative clustering of a search engine query log

4. The query-flow graph

5. Broder A. 2002. A taxonomy of Web search. SIGIR Forum 36, 2, 3-10. doi:10.1145/792550.792552

Cited by 46 articles.

1. Graph-SeTES: A graph based search task extraction using Siamese network;Information Sciences;2024-04

2. Session-Based Time-Window Identification in Virtual Learning Environments;Journal of Learning Analytics;2023-12-15

3. Query sampler: generating query sets for analyzing search engines using keyword research tools;PeerJ Computer Science;2023-06-07

4. Recommending tasks based on search queries and missions;Natural Language Engineering;2023-05-17

5. Representing Tasks with a Graph-Based Method for Supporting Users in Complex Search Tasks;Proceedings of the 2023 Conference on Human Information Interaction and Retrieval;2023-03-19