dc.contributor.author: Miao, Y
dc.contributor.author: Han, J
dc.contributor.author: Gao, Y
dc.contributor.author: Zhang, B
dc.date.accessioned: 2020-02-24T00:10:15Z
dc.date.available: 2020-02-24T00:10:15Z
dc.date.issued: 2019
dc.identifier.issn: 0167-8655
dc.identifier.doi: 10.1016/j.patrec.2019.04.012
dc.identifier.uri: http://hdl.handle.net/10072/391797
dc.description.abstract: The task of crowd counting and density map estimation from videos is challenging due to severe occlusions, scene perspective distortions and diverse crowd distributions. Conventional deep learning methods for crowd counting process each video frame independently, with no consideration of the intrinsic temporal correlation among neighboring frames, so their performance falls below the level required by real-world applications. To overcome this shortcoming, a new end-to-end deep architecture named Spatial-Temporal Convolutional Neural Network (ST-CNN) is proposed, which unifies a 2D convolutional neural network (C2D) and a 3D convolutional neural network (C3D) to learn spatial-temporal features in the same framework. On top of that, a merging scheme is applied to the resulting density maps, taking advantage of the spatial and temporal information simultaneously for the crowd counting task. Experimental results on two benchmark data sets, the Mall dataset and the WorldExpo'10 dataset, show that our ST-CNN outperforms the state-of-the-art models in terms of mean absolute error (MAE) and mean squared error (MSE).
dc.description.peerreviewed: Yes
dc.language: English
dc.language.iso: eng
dc.publisher: Elsevier
dc.relation.ispartofpagefrom: 113
dc.relation.ispartofpageto: 118
dc.relation.ispartofjournal: Pattern Recognition Letters
dc.relation.ispartofvolume: 125
dc.subject.fieldofresearch: Artificial Intelligence and Image Processing
dc.subject.fieldofresearch: Electrical and Electronic Engineering
dc.subject.fieldofresearch: Cognitive Sciences
dc.subject.fieldofresearchcode: 0801
dc.subject.fieldofresearchcode: 0906
dc.subject.fieldofresearchcode: 1702
dc.title: ST-CNN: Spatial-Temporal Convolutional Neural Network for crowd counting in videos
dc.type: Journal article
dc.type.description: C1 - Articles
dcterms.bibliographicCitation: Miao, Y; Han, J; Gao, Y; Zhang, B, ST-CNN: Spatial-Temporal Convolutional Neural Network for crowd counting in videos, Pattern Recognition Letters, 2019, 125, pp. 113-118
dc.date.updated: 2020-02-24T00:09:22Z
gro.hasfulltext: No Full Text
gro.griffith.author: Gao, Yongsheng


Files in this item

There are no files associated with this item.

This item appears in the following Collection(s)

  • Journal articles
    Contains articles published by Griffith authors in scholarly journals.
