Improving Crowdsourcing-Based Image Classification Through Expanded Input Elicitation and Machine Learning

Published: 2022-06-29 Volume: 5
ISSN: 2624-8212
Container-title: Frontiers in Artificial Intelligence
Short-container-title: Front. Artif. Intell.

Author:

Yasmin Romena, Hassan Md Mahmudulla, Grassel Joshua T., Bhogaraju Harika, Escobedo Adolfo R., Fuentes Olac

Abstract

This work investigates how different forms of input elicitation obtained from crowdsourcing can be utilized to improve the quality of inferred labels for image classification tasks, where an image must be labeled as either positive or negative depending on the presence/absence of a specified object. Five types of input elicitation methods are tested: binary classification (positive or negative); the (x, y)-coordinates of the position where participants believe a target object is located; level of confidence in the binary response (on a scale from 0 to 100%); what participants believe the majority of the other participants' binary classification is; and participants' perceived difficulty level of the task (on a discrete scale). We design two crowdsourcing studies to test the performance of a variety of input elicitation methods and utilize data from over 300 participants. Various existing voting and machine learning (ML) methods are applied to make the best use of these inputs. In an effort to assess their performance on classification tasks of varying difficulty, a systematic synthetic image generation process is developed. Each generated image combines items from the MPEG-7 Core Experiment CE-Shape-1 Test Set into a single image using multiple parameters (e.g., density, transparency) and may or may not contain a target object. The difficulty of these images is validated by the performance of an automated image classification method. Experimental results suggest that more accurate results can be achieved with smaller training datasets when both the crowdsourced binary classification labels and the average of the self-reported confidence values in these labels are used as features for the ML classifiers. Moreover, when a relatively larger properly annotated dataset is available, in some cases augmenting these ML algorithms with the results (i.e., probability of outcome) from an automated classifier can achieve even higher performance than what can be obtained by using any one of the individual classifiers. Lastly, supplementary analysis of the collected data demonstrates that other performance metrics of interest, namely reduced false-negative rates, can be prioritized through special modifications of the proposed aggregation methods.
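
The aggregation ideas mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the response data, the function names (`image_features`, `majority_vote`, `confidence_weighted_vote`), and the thresholds are all hypothetical, and a simple confidence-weighted vote stands in for the ML classifiers used in the paper. It shows how binary labels and self-reported confidence might be turned into per-image features, and how lowering the decision threshold favors positive labels, which is one plausible way to trade precision for fewer false negatives.

```python
from statistics import mean

# Hypothetical crowd responses: image id -> list of (binary_label, confidence in [0, 1]).
responses = {
    "img_a": [(1, 0.9), (1, 0.8), (0, 0.6)],
    "img_b": [(0, 0.7), (0, 0.9), (1, 0.55)],
    "img_c": [(1, 0.95), (0, 0.5), (0, 0.55)],
}

def image_features(votes):
    """Return (share of positive votes, mean self-reported confidence)."""
    labels = [label for label, _ in votes]
    confidences = [conf for _, conf in votes]
    return mean(labels), mean(confidences)

def majority_vote(votes, threshold=0.5):
    """Label positive when the share of positive votes exceeds the threshold."""
    positive_share, _ = image_features(votes)
    return 1 if positive_share > threshold else 0

def confidence_weighted_vote(votes, threshold=0.5):
    """Weight each vote by its confidence; a lower threshold favors positives."""
    total = sum(conf for _, conf in votes)
    if total == 0:
        return 0  # no usable confidence information; default to negative
    positive = sum(conf for label, conf in votes if label == 1)
    return 1 if positive / total > threshold else 0

for image_id, votes in responses.items():
    print(
        image_id,
        image_features(votes),
        majority_vote(votes),
        confidence_weighted_vote(votes),
        confidence_weighted_vote(votes, threshold=0.4),
    )
```

In this toy data, `img_c` has one highly confident positive vote against two hesitant negative votes, so it is labeled negative at the default threshold but positive once the threshold is lowered to 0.4. In a fuller pipeline, the tuples returned by `image_features` could serve as input features to a trained classifier instead of being thresholded directly.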

Funder

U.S. Department of Homeland Security

National Science Foundation

Publisher

Frontiers Media SA

Subject

Artificial Intelligence

References: 76 articles.

1. Stochastic optimization of plain convolutional neural networks with simple methods; Assiri; arXiv [Preprint] arXiv:2001.08856, 2020

2. Crowdsourcing earthquake damage assessment using remote sensing imagery; Barrington; Ann. Geophys., 2012

3. Revisiting resnets: improved training and scaling strategies; Bello; arXiv [Preprint] arXiv:2103.07579, 2021

4. Handbook of Computational Social Choice

5. Revolt: collaborative crowdsourcing for labeling machine learning datasets; Chang, 2017

Cited by 3 articles.

1. Intersection of machine learning and mobile crowdsourcing: a systematic topic-driven review; Personal and Ubiquitous Computing; 2024-06-10

2. From pixels to insights: Machine learning and deep learning for bioimage analysis; BioEssays; 2023-12-06

3. Assessing the Effects of Expanded Input Elicitation and Machine Learning-Based Priming on Crowd Stock Prediction; Advances in Computational Collective Intelligence; 2023