Automatic record linkage of individuals and households in historical census data
Linking historical census data is an important task for the study of the social, economic, and demographic aspects of families and society in the past. Although various (semi-)automatic linking methods have been proposed, state-of-the-art methods have targeted only the linking of records that correspond to individuals. In this paper, we introduce an automatic method aimed at linking both individuals and households across several historical census datasets. The proposed method comprises several steps, including data quality analysis and enhancement, household identity detection, and individual and household record linking. We have applied this method to a set of six census datasets collected from the district of Rawtenstall in North-East Lancashire in the United Kingdom between 1851 and 1901. Experimental results show that the proposed method can greatly reduce the ambiguity arising from individual record linkage and facilitate the accurate matching of households across several decades.
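The idea that household context can resolve ambiguous individual links can be illustrated with a minimal sketch. This is not the method from the paper: the similarity measure (Python's standard `difflib.SequenceMatcher`), the thresholds, the field names, the voting rule, and the toy records are all assumptions made up for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Approximate string similarity in [0, 1] (assumed measure, not the paper's)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_links(old, new, years_apart=10, name_threshold=0.85, age_tolerance=2):
    """Pair records whose names are similar and whose ages are consistent
    with the gap between the two censuses. Thresholds are illustrative."""
    links = []
    for i, r in enumerate(old):
        for j, s in enumerate(new):
            names_match = name_similarity(r["name"], s["name"]) >= name_threshold
            ages_match = abs((s["age"] - r["age"]) - years_apart) <= age_tolerance
            if names_match and ages_match:
                links.append((i, j))
    return links


def household_links(old, new, links):
    """For each earlier household, pick the later household that shares
    the most candidate individual links (a simple voting rule)."""
    votes = Counter((old[i]["household"], new[j]["household"]) for i, j in links)
    best = {}
    for (h_old, h_new), n in votes.items():
        if h_old not in best or n > best[h_old][1]:
            best[h_old] = (h_new, n)
    return {h_old: h_new for h_old, (h_new, _) in best.items()}


# Invented toy records (not taken from the Rawtenstall data).
census_1851 = [
    {"name": "John Smith", "age": 30, "household": "A"},
    {"name": "Mary Smith", "age": 28, "household": "A"},
    {"name": "Thomas Smith", "age": 2, "household": "A"},
    {"name": "John Smith", "age": 31, "household": "B"},
    {"name": "Ann Smith", "age": 29, "household": "B"},
]
census_1861 = [
    {"name": "John Smith", "age": 40, "household": "X"},
    {"name": "Mary Smith", "age": 38, "household": "X"},
    {"name": "Thomas Smith", "age": 12, "household": "X"},
    {"name": "John Smith", "age": 41, "household": "Y"},
    {"name": "Anne Smith", "age": 39, "household": "Y"},
]

links = candidate_links(census_1851, census_1861)
households = household_links(census_1851, census_1861, links)

# Keep only individual links that agree with the chosen household link.
resolved = [
    (i, j) for i, j in links
    if households.get(census_1851[i]["household"]) == census_1861[j]["household"]
]
```

In this toy data each "John Smith" from 1851 has two plausible counterparts in 1861 when records are compared individually. Counting how many members two households share selects one household pairing, and keeping only the links consistent with that pairing leaves a single counterpart for each person.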
International Journal of Humanities and Arts Computing
© 2014 Edinburgh University Press. This is the author-manuscript version of this paper. Reproduced in accordance with the copyright policy of the publisher. Please refer to the journal's website for access to the definitive, published version.
Pattern Recognition and Data Mining