Pay-as-you-go reconciliation in schema matching networks

File version

Accepted Manuscript (AM)

Author(s)
Nguyen, Quoc Viet Hung
Nguyen, Thanh Tam
Miklos, Zoltan
Aberer, Karl
Gal, Avigdor
Weidlich, Matthias
Editor(s)

Gabriel Ghinita, Ali Inan

Date
2014
Location

Chicago, IL, United States

Abstract

Schema matching is the process of establishing correspondences between the attributes of database schemas for data integration purposes. Although several automatic schema matching tools have been developed, their results are often incomplete or erroneous. To obtain a correct set of correspondences, a human expert is usually required to validate the generated correspondences. We analyze this reconciliation process in a setting where a number of schemas need to be matched, in the presence of consistency expectations about the network of attribute correspondences. We develop a probabilistic model that helps to identify the most uncertain correspondences, thus allowing us to guide the expert's work and collect the expert's input about the most problematic cases. As the availability of such experts is often limited, we develop techniques that can construct a set of good-quality correspondences with high probability, even if the expert does not validate all the necessary correspondences. We demonstrate the efficiency of our techniques through extensive experimentation using real-world datasets.

Conference Title

Proceedings of the 2014 IEEE 30th International Conference on Data Engineering

Rights Statement

© 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Subject

Database systems
