Minimizing Efforts in Validating Crowd Answers

File version

Accepted Manuscript (AM)

Author(s)
Nguyen, Quoc Viet Hung
Duong, Chi Thang
Weidlich, Matthias
Aberer, Karl
Date
2015
Location

Melbourne, AUSTRALIA

Abstract

In recent years, crowdsourcing has become essential in a wide range of Web applications. One of the biggest challenges of crowdsourcing is the quality of crowd answers, as workers have wide-ranging levels of expertise and the worker community may contain faulty workers. Although various techniques for quality control have been proposed, a post-processing phase in which crowd answers are validated is still required. Validation is typically conducted by experts, whose availability is limited and who incur high costs. Therefore, we develop a probabilistic model that helps to identify the most beneficial validation questions in terms of both improvement of result correctness and detection of faulty workers. Our approach allows us to guide the expert's work by collecting input on the most problematic cases, thereby achieving a set of high-quality answers even if the expert does not validate the complete answer set. Our comprehensive evaluation using both real-world and synthetic datasets demonstrates that our techniques save up to 50% of expert effort compared to baseline methods when striving for perfect result correctness. In absolute terms, for most cases, we achieve close to perfect correctness after expert input has been sought for only 20% of the questions.
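
To make the selection strategy concrete, the following Python sketch illustrates one way such an expert-guided validation loop can work. It is an illustration only, not the paper's actual probabilistic model: it assumes binary answers, a single accuracy parameter per worker, and an entropy-based rule for choosing the next question to validate; all names (posteriors, pick_question, update_reliability) are hypothetical.

    import math
    from collections import defaultdict

    def posteriors(answers, reliability, labels=(0, 1)):
        # Per-question posterior over candidate labels, weighting each vote
        # by the current reliability estimate of the worker who cast it.
        post = {}
        for q, votes in answers.items():
            scores = {}
            for label in labels:
                s = 0.0
                for worker, ans in votes.items():
                    r = reliability[worker]
                    s += math.log(r if ans == label else 1.0 - r)
                scores[label] = s
            m = max(scores.values())
            z = sum(math.exp(v - m) for v in scores.values())
            post[q] = {l: math.exp(v - m) / z for l, v in scores.items()}
        return post

    def entropy(dist):
        return -sum(p * math.log(p) for p in dist.values() if p > 0)

    def pick_question(post, validated):
        # The expert is asked about the most uncertain remaining question.
        return max((q for q in post if q not in validated),
                   key=lambda q: entropy(post[q]))

    def update_reliability(answers, truth, smoothing=1.0):
        # Re-estimate each worker's accuracy from expert-validated questions;
        # smoothing keeps estimates strictly between 0 and 1.
        correct, total = defaultdict(float), defaultdict(float)
        for q, label in truth.items():
            for worker, ans in answers[q].items():
                total[worker] += 1.0
                correct[worker] += float(ans == label)
        rel = defaultdict(lambda: 0.5)
        for w in total:
            rel[w] = (correct[w] + smoothing) / (total[w] + 2.0 * smoothing)
        return rel

    # Toy run: three workers, four binary questions, worker "w3" is faulty.
    answers = {
        "q1": {"w1": 1, "w2": 1, "w3": 0},
        "q2": {"w1": 0, "w2": 0, "w3": 1},
        "q3": {"w1": 1, "w2": 0, "w3": 0},
        "q4": {"w1": 1, "w2": 1, "w3": 1},
    }
    ground_truth = {"q1": 1, "q2": 0, "q3": 1, "q4": 1}  # what the expert would say
    reliability = defaultdict(lambda: 0.5)
    validated = {}
    for _ in range(2):  # expert validates only 2 of the 4 questions
        post = posteriors(answers, reliability)
        q = pick_question(post, validated)
        validated[q] = ground_truth[q]  # expert input on the chosen question
        reliability = update_reliability(answers, validated)
    print(validated, dict(reliability))

In each round, the expert validates the question whose aggregated answer is least certain, and the validated ground truth is fed back to re-estimate worker reliability, so that both result correctness and the detection of faulty workers improve with every expert answer.
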

Conference Title

SIGMOD'15: PROCEEDINGS OF THE 2015 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA

Rights Statement

© ACM, 2015. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in SIGMOD'15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data.

Subject

Database systems
