A Novel Integrated Classifier for Handling Data Warehouse Anomalies

Author(s)
Darcy, Peter
Stantic, Bela
Sattar, Abdul
Editor(s)

Eder, J

Bielikova, M

Tjoa, AM

Date
2011
Size

118814 bytes

File type(s)

application/pdf

Abstract

Anomalies continue to persist in databases employed across various commercial sectors and hinder the overall integrity of data. Typically, Duplicate, Wrong and Missed observations of spatial-temporal data prevent the user from accurately utilising recorded information. In the literature, different methods have been described to clean data, which fall into the category of either deterministic or probabilistic approaches. However, we believe that to ensure maximum integrity, a data cleaning methodology must have properties of both of these categories to effectively eliminate the anomalies. To realise this, we have proposed a method which integrates deterministic and probabilistic classifiers using fusion techniques. We have empirically evaluated the proposed concept against state-of-the-art techniques and found that our approach improves the integrity of the resulting data set.

Journal Title

Lecture Notes in Computer Science

Volume

6909

Rights Statement

© 2011 Springer Berlin / Heidelberg. This is the author-manuscript version of this paper. Reproduced in accordance with the copyright policy of the publisher. The original publication is available at www.springerlink.com

Subject

Coding, information theory and compression

Information and computing sciences
