Crowdsourcing Human Annotation on Web Page Structure

Published: 2016-07-14 Issue: 4 Volume: 7 Page: 1-25
ISSN: 2157-6904
Container-title: ACM Transactions on Intelligent Systems and Technology
Language: en
Short-container-title: ACM Trans. Intell. Syst. Technol.

Author:

Han Shuguang¹, Dai Peng², Paritosh Praveen², Huynh David²

Affiliation:

1. University of Pittsburgh, Pittsburgh, PA

2. Google

Abstract

Parsing the semantic structure of a web page is a key component of web information extraction. Successful extraction algorithms usually require large-scale training and evaluation datasets, which are difficult to acquire. Recently, crowdsourcing has proven to be an effective method of collecting large-scale training data in domains that do not require much domain knowledge. For more complex domains, researchers have proposed sophisticated quality control mechanisms to replicate tasks in parallel or sequential ways and then aggregate responses from multiple workers. Conventional annotation integration methods often put more trust in the workers with high historical performance; thus, they are called performance-based methods. Recently, Rzeszotarski and Kittur have demonstrated that behavioral features are also highly correlated with annotation quality in several crowdsourcing applications. In this article, we present a new crowdsourcing system, called Wernicke, to provide annotations for web information extraction. Wernicke collects a wide set of behavioral features and, based on these features, predicts annotation quality for a challenging task domain: annotating web page structure. We evaluate the effectiveness of quality control using behavioral features through a case study where 32 workers annotate 200 Q&A web pages from five popular websites. In doing so, we discover several things: (1) Many behavioral features are significant predictors for crowdsourcing quality. (2) The behavioral-feature-based method outperforms performance-based methods in recall prediction, while performing equally with precision prediction. In addition, using behavioral features is less vulnerable to the cold-start problem, and the corresponding prediction model is more generalizable for predicting recall than precision for cross-website quality analysis. (3) One can effectively combine workers’ behavioral information and historical performance information to further reduce prediction errors.

Publisher

Association for Computing Machinery (ACM)

Subject

Artificial Intelligence, Theoretical Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/2870649

References: 31 articles.

1. Soylent

2. A Survey of Web Information Extraction Systems

3. POMDP-based control of workflows for crowdsourcing

4. And Now for Something Completely Different

Cited by 12 articles.

1. Application Research of Dynamic Load Test Based on Sensor Technology in Bridge Detection and Evaluation;2023 Third International Conference on Digital Data Processing (DDP);2023-11-27

2. A Model for Cognitive Personalization of Microtask Design;Sensors;2023-03-29

3. A Misreport- and Collusion-Proof Crowdsourcing Mechanism Without Quality Verification;IEEE Transactions on Mobile Computing;2022-09-01

4. A Survey on Task Assignment in Crowdsourcing;ACM Computing Surveys;2022-02-03

5. Quality Control in Crowdsourcing based on Fine-Grained Behavioral Features;Proceedings of the ACM on Human-Computer Interaction;2021-10-13