Minimum geocoding match rates: an international study of the impact of data and areal unit sizes

View/ Open
File version
Version of Record (VoR)
Author(s)
Andresen, Martin A
Malleson, Nick
Steenbeek, Wouter
Townsley, Michael
Vandeviver, Christophe
Griffith University Author(s)
Year published
2020
Metadata
Show full item recordAbstract
The analysis of geographically referenced data, specifically point data, is predicated on the accurate geocoding of those data. Geocoding refers to the process in which geographically referenced data (addresses, for example) are placed on a map. This process may lead to issues with positional accuracy or the inability to geocode an address. In this paper, we conduct an international investigation into the impact of the (in)ability to geocode an address on the resulting spatial pattern. We use a variety of point data sets of crime events (varying numbers of events and types of crime), a variety of areal units of analysis ...
View more >The analysis of geographically referenced data, specifically point data, is predicated on the accurate geocoding of those data. Geocoding refers to the process in which geographically referenced data (addresses, for example) are placed on a map. This process may lead to issues with positional accuracy or the inability to geocode an address. In this paper, we conduct an international investigation into the impact of the (in)ability to geocode an address on the resulting spatial pattern. We use a variety of point data sets of crime events (varying numbers of events and types of crime), a variety of areal units of analysis (varying the number and size of areal units), from a variety of countries (varying underlying administrative systems), and a locally-based spatial point pattern test to find the levels of geocoding match rates to maintain the spatial patterns of the original data when addresses are missing at random. We find that the level of geocoding success depends on the number of points and the number of areal units under analysis, but generally show that the necessary levels of geocoding success are lower than found in previous research. This finding is consistent across different national contexts.
View less >
View more >The analysis of geographically referenced data, specifically point data, is predicated on the accurate geocoding of those data. Geocoding refers to the process in which geographically referenced data (addresses, for example) are placed on a map. This process may lead to issues with positional accuracy or the inability to geocode an address. In this paper, we conduct an international investigation into the impact of the (in)ability to geocode an address on the resulting spatial pattern. We use a variety of point data sets of crime events (varying numbers of events and types of crime), a variety of areal units of analysis (varying the number and size of areal units), from a variety of countries (varying underlying administrative systems), and a locally-based spatial point pattern test to find the levels of geocoding match rates to maintain the spatial patterns of the original data when addresses are missing at random. We find that the level of geocoding success depends on the number of points and the number of areal units under analysis, but generally show that the necessary levels of geocoding success are lower than found in previous research. This finding is consistent across different national contexts.
View less >
Journal Title
International Journal of Geographical Information Science
Volume
34
Issue
7
Copyright Statement
This is an Author's Accepted Manuscript of an article published in the International Journal of Geographical Information Science, 34 (7), pp. 1306-1322, 11 Feb 2020, copyright Taylor & Francis, available online at: https://doi.org/10.1080/13658816.2020.1725015
Subject
Physical geography and environmental geoscience
Human geography
Science & Technology
Social Sciences
Physical Sciences
Computer Science