Show simple item record

dc.contributor.author: Alves Peixoto, Douglas
dc.contributor.author: Quoc Viet Nguyen, H
dc.contributor.author: Zheng, Bolong
dc.contributor.author: Zhou, Xiaofang
dc.date.accessioned: 2020-01-07T04:40:53Z
dc.date.available: 2020-01-07T04:40:53Z
dc.date.issued: 2019
dc.identifier.issn: 0926-8782
dc.identifier.doi: 10.1007/s10619-018-7254-0
dc.identifier.uri: http://hdl.handle.net/10072/390088
dc.description.abstract: Map-matching is the problem of matching recorded GPS trajectories to a digital representation of the road network. GPS data may be inaccurate and heterogeneous, due to limitations or errors in electronic sensors, as well as legal restrictions. Accurately matching trajectories to the road map is an important preprocessing step for many real-world applications, such as trajectory data mining, traffic analysis, and route prediction. However, the high availability of GPS trajectories and map data challenges the scalability of current map-matching algorithms, which are limited to small datasets since they focus only on the accuracy of the matching rather than scalability. Therefore, we propose a distributed parallel framework for efficient and scalable offline map-matching on top of the Spark framework. Spark uses distributed in-memory data storage and the MapReduce paradigm to achieve horizontal scaling and fast computation on large datasets. Spark, however, is still limited for dynamic map-matching, and memory consumption in Spark can be an issue for very large datasets. We develop a framework that allows map-matching on top of Spark while achieving horizontal scalability and memory-efficient usage, and maintaining the accuracy of state-of-the-art matching algorithms, as follows: (1) We combine a sampling-based Quadtree spatial partitioning construction and batch-based computation to achieve horizontal scalability of map-matching, as well as to reduce cluster memory usage. (2) We employ a safe spatial-boundary approach to preserve the matching accuracy of boundary objects. (3) We provide a cost function for the distributed map-matching workload in order to tune the framework parameters. Our extensive experiments demonstrate that our framework is efficient and scalable for map-matching on large-scale data, while maintaining matching accuracy and low memory usage.
dc.description.peerreviewed: Yes
dc.language: English
dc.language.iso: eng
dc.publisher: Springer Science
dc.publisher.place: United States
dc.relation.ispartofpagefrom: 697
dc.relation.ispartofpageto: 720
dc.relation.ispartofissue: 4
dc.relation.ispartofjournal: Distributed and Parallel Databases
dc.relation.ispartofvolume: 37
dc.subject.fieldofresearch: Data Format
dc.subject.fieldofresearch: Distributed Computing
dc.subject.fieldofresearchcode: 0804
dc.subject.fieldofresearchcode: 0805
dc.title: A framework for parallel map-matching at scale using Spark
dc.type: Journal article
dc.type.description: C2 - Articles (Other)
dcterms.bibliographicCitation: Alves Peixoto, D; Quoc Viet Nguyen, H; Zheng, B; Zhou, X, A framework for parallel map-matching at scale using Spark, Distributed and Parallel Databases, 2019, 37 (4), pp. 697-720
dc.date.updated: 2020-01-07T04:09:27Z
gro.hasfulltext: No Full Text
gro.griffith.author: Nguyen, Henry


This item appears in the following Collection(s)

  • Journal articles
    Contains articles published by Griffith authors in scholarly journals.
