Statistical quality control framework for crowd-worker in ER-in-house crowdsourcing system

Author(s)
Saberi, M
Hussain, OK
Chang, E
Date
2015
Location

Massachusetts, USA

Abstract

Poor data quality is prevalent in organizations today, and it degrades the accuracy of organizational decision making. The problem of dirty data is especially severe for an organization's customer relationship management (CRM), where it prevents effective performance. One type of dirty data is duplicate records that correspond to the same entity. Duplicate profiles in an organization's database prevent it from having a clear picture of its customers. Developing an efficient entity resolution (ER) technique for a given organization is therefore essential. Recently, crowdsourcing has been used to improve the accuracy of the entity resolution process: it uses human intelligence to label the data and prepare it for further processing by ER algorithms. However, labeling by humans is error-prone, which affects the entity resolution process and ultimately the overall performance of the crowd. Controlling the quality of the labeling task is thus essential for crowdsourcing systems, and it becomes more challenging when ground-truth data is unavailable. In this study, we focus on contact centers and employ Customer Service Representatives (CSRs) as crowd-workers for an ER crowdsourcing system. A statistical quality control (SQC) framework is proposed to control the quality of CSRs' labeling. The framework must be able to estimate the true error of CSRs in order to monitor their labeling accuracy and performance. To this end, a Hybrid Gold-Plurality (HGP) algorithm is proposed that estimates a CSR's true error. The HGP algorithm is capable of appropriate accuracy in error estimation because it combines both Masking and Detection crowd-worker quality control mechanisms. A synthetic dataset is used to demonstrate the applicability of the SQC framework.
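The abstract does not specify how the HGP algorithm works, so the following is only an illustrative sketch of the general idea of combining gold questions (items with a known answer) with plurality voting to estimate a worker's error rate. It should not be read as the authors' method. The function name `estimate_worker_error`, the `gold_weight` parameter, the simple weighted average, and the sample labels are all assumptions made for illustration.

```python
from collections import Counter


def estimate_worker_error(labels, gold, gold_weight=0.5):
    """Hypothetical hybrid estimate of each worker's error rate.

    labels: {worker: {item: label}} as submitted by the workers
    gold:   {item: true_label} for the items whose answer is known
    gold_weight: assumed mixing weight between the two error signals
    """
    # Plurality label per non-gold item, counted across all workers.
    votes = {}
    for answers in labels.values():
        for item, label in answers.items():
            if item not in gold:
                votes.setdefault(item, Counter())[label] += 1
    plurality = {item: c.most_common(1)[0][0] for item, c in votes.items()}

    estimates = {}
    for worker, answers in labels.items():
        gold_items = [i for i in answers if i in gold]
        other_items = [i for i in answers if i not in gold]

        # Error against known answers (None if the worker saw no gold items).
        gold_err = (
            sum(answers[i] != gold[i] for i in gold_items) / len(gold_items)
            if gold_items else None
        )
        # Disagreement with the plurality label on the remaining items.
        plur_err = (
            sum(answers[i] != plurality[i] for i in other_items) / len(other_items)
            if other_items else None
        )

        if gold_err is None:
            estimates[worker] = plur_err
        elif plur_err is None:
            estimates[worker] = gold_err
        else:
            estimates[worker] = gold_weight * gold_err + (1 - gold_weight) * plur_err
    return estimates


# Made-up labels for three CSRs on two gold pairs (g1, g2) and two unlabeled pairs (p1, p2).
gold = {"g1": "dup", "g2": "distinct"}
labels = {
    "csr_a": {"g1": "dup", "g2": "distinct", "p1": "dup", "p2": "distinct"},
    "csr_b": {"g1": "dup", "g2": "dup", "p1": "dup", "p2": "distinct"},
    "csr_c": {"g1": "distinct", "g2": "dup", "p1": "distinct", "p2": "distinct"},
}
estimates = estimate_worker_error(labels, gold)
```

With these made-up labels, the worker who matches every gold answer and every plurality label gets an estimate of 0, while workers who miss gold items or disagree with the plurality get proportionally higher estimates. A real framework would also need a principled way to choose the mixing weight and to handle tied votes, which this sketch does not address.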

Conference Title

Proceedings of the 2015 MIT ICIQ

Citation

Saberi, M; Hussain, OK; Chang, E, Statistical quality control framework for crowd-worker in ER-in-house crowdsourcing system, Proceedings of the 2015 MIT ICIQ, 2015, pp. 88-104