On discovery of functional dependencies from data
Author(s)
Liu, Jixue
Ye, Feiyue
Li, Jiuyong
Wang, Junhu
Year published
2013
Abstract
Discovering functional dependencies (FDs) from existing databases is important to knowledge discovery, machine learning and data quality assessment. A number of algorithms have been proposed in the literature. In this paper, we review and compare these algorithms to identify their advantages and differences. We then propose a simple but time and space efficient hash-based algorithm for FD discovery. We conduct a performance comparison of three recently published algorithms and compare their performance with that of our hash-based algorithm. We show that the hash-based algorithm performs best among the four algorithms and analyze the reasons.
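For readers unfamiliar with the idea, the following is a minimal illustrative sketch of how a hash table can be used to test whether an FD holds in a relation. It is not the algorithm proposed in the paper, whose details are not given in this record. The names `fd_holds` and `discover_fds`, the `max_lhs` parameter, and the sample rows are all assumptions made for illustration.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in rows (a list of dicts).

    Each left-hand-side value combination is hashed to the first
    right-hand-side value seen with it; a later mismatch violates the FD.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True


def discover_fds(rows, attributes, max_lhs=2):
    """Naively enumerate minimal FDs with 1..max_lhs left-hand attributes.

    Illustrative brute force only; real discovery algorithms prune the
    candidate space far more aggressively.
    """
    found = []
    for rhs in attributes:
        others = [a for a in attributes if a != rhs]
        for size in range(1, max_lhs + 1):
            for lhs in combinations(others, size):
                # Skip candidates whose left side contains an already-found FD.
                if any(r == rhs and set(l) <= set(lhs) for l, r in found):
                    continue
                if fd_holds(rows, lhs, rhs):
                    found.append((lhs, rhs))
    return found


# Hypothetical sample relation.
rows = [
    {"id": 1, "city": "Brisbane", "state": "QLD"},
    {"id": 2, "city": "Brisbane", "state": "QLD"},
    {"id": 3, "city": "Sydney", "state": "NSW"},
    {"id": 4, "city": "Perth", "state": "WA"},
]

for lhs, rhs in discover_fds(rows, ["id", "city", "state"], max_lhs=1):
    print(", ".join(lhs), "->", rhs)
```

In this sketch each candidate FD is checked in a single pass over the rows, and the extra memory grows with the number of distinct left-hand-side value combinations.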
Journal Title
Data and Knowledge Engineering
Volume
86
Subject
Data management and data science
Information systems
Database systems