Affiliation:
1. Massachusetts Institute of Technology, USA
2. Google, USA
Abstract
When searching on the web or in an app, results are often returned as lists of hundreds to thousands of items, making it difficult for users to understand or navigate the space of results. Research has demonstrated that using clustering to partition search results into coherent, topical clusters can aid in both exploration and discovery. Yet clusters generated by an algorithm for this purpose are often of poor quality and do not satisfy users. To achieve acceptable clustered search results, experts must manually evaluate and refine the clusters for each search query, a process that does not scale to large numbers of queries. In this article, we investigate using crowd-based human evaluation to inspect, evaluate, and improve clusters to create high-quality clustered search results at scale. We introduce a workflow that begins by using a collection of well-known clustering algorithms to produce a set of clustered search results for a given query. Then, we use crowd workers to holistically assess the quality of each clustered search result to find the best one. Finally, the workflow has the crowd spot and fix problems in the best result to produce the final output. We evaluate this workflow on 120 top search queries from the Google Play Store, some of which have clustered search results that were previously evaluated and refined by experts. Our evaluations demonstrate that the workflow is effective at reproducing the judgments of expert evaluators and improves clusters in ways that both experts and crowds endorse.
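The abstract describes a three-stage workflow: generate candidate clusterings with standard algorithms, have the crowd pick the best candidate, and have the crowd fix remaining problems. The sketch below is not the authors' code; it is a minimal illustration of that structure in Python, where the crowd_* functions are hypothetical placeholders for crowdsourcing tasks and the item embeddings are random stand-ins.

```python
# Minimal sketch of the three-stage workflow described in the abstract.
# Stage 1: candidate clusterings from well-known algorithms.
# Stage 2: crowd picks the best candidate (placeholder here).
# Stage 3: crowd spots and fixes problems (placeholder here).

import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering


def generate_candidates(item_vectors, n_clusters=8):
    """Stage 1: produce several candidate clusterings of the search results."""
    candidates = []
    for algo in (KMeans(n_clusters=n_clusters, n_init=10),
                 AgglomerativeClustering(n_clusters=n_clusters)):
        candidates.append(algo.fit_predict(item_vectors))
    return candidates


def crowd_pick_best(candidates):
    """Stage 2 (hypothetical): crowd workers holistically rate each candidate
    clustering; this placeholder simply returns the first one."""
    return candidates[0]


def crowd_refine(labels):
    """Stage 3 (hypothetical): crowd workers move misplaced items between
    clusters; this placeholder returns the labels unchanged."""
    return labels


if __name__ == "__main__":
    vectors = np.random.rand(100, 16)          # stand-in for result embeddings
    candidates = generate_candidates(vectors)  # Stage 1
    best = crowd_pick_best(candidates)         # Stage 2
    final = crowd_refine(best)                 # Stage 3
    print(final[:10])
```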
Publisher
Association for Computing Machinery (ACM)
Subject
Artificial Intelligence, Human-Computer Interaction
Cited by
4 articles.
1. DynamicLabels: Supporting Informed Construction of Machine Learning Label Sets with Crowd Feedback;Proceedings of the 29th International Conference on Intelligent User Interfaces;2024-03-18
2. Relatedly: Scaffolding Literature Reviews with Existing Related Work Sections;Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems;2023-04-19
3. Goldilocks: Consistent Crowdsourced Scalar Annotations with Relative Uncertainty;Proceedings of the ACM on Human-Computer Interaction;2021-10-13
4. Sifter: A Hybrid Workflow for Theme-based Video Curation at Scale;ACM International Conference on Interactive Media Experiences;2020-06-17