Noisy values detection and correction of traffic accident data
File version
Accepted Manuscript (AM)
Author(s)
Deb, Rupam
Liew, Alan Wee-Chung
Griffith University Author(s)
Year published
2019
Abstract
Death, injury, and disability from road traffic crashes continue to be a major global public health problem. Therefore, methods to reduce accident severity are of significant interest to traffic agencies and the public at large. Noisy data in traffic accident datasets obscure the discovery of important factors and lead to misleading conclusions. Identifying and correcting noisy values is an important goal of data cleansing and preprocessing. This paper proposes a new algorithm called NoiseCleaner to identify and correct noisy categorical attribute values in large traffic accident datasets. We evaluate our algorithm using four publicly available traffic accident datasets from Australia and the United States, namely, two road crash datasets from the Queensland Government data depository (data.qld.gov.au) and two datasets from New York's open data portal (data.ny.gov). We compare our technique with several existing state-of-the-art methods and show that our algorithm performs significantly better than the existing algorithms.
Journal Title
Information Sciences
Volume
476
Copyright Statement
© 2019 Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Licence which permits unrestricted, non-commercial use, distribution and reproduction in any medium, providing that the work is properly cited.
Subject
Mathematical sciences
Engineering