It's not just the site, it's the contents: Intra-domain fingerprinting social media websites through CDN bursts

File version

Version of Record (VoR)

Author(s)
Wang, K
Zhang, J
Bai, G
Ko, R
Dong, JS
Date
2021
Location

Ljubljana, Slovenia

Abstract

Website fingerprinting (inter-domain WSF), enhanced by various machine learning techniques, has proven effective at identifying the websites a user has visited. To the best of our knowledge, the finer-grained problem of web page fingerprinting (intra-domain WPF) has not been systematically studied by the research community. WPF attackers, such as government agencies enforcing Internet censorship, are keen to identify the particular web pages (e.g., a political dissident's social media page) visited by a target user. In this work, we investigate intra-domain WPF among social media websites under a realistic on-path passive attack scenario. We reveal that delivering large data such as images and videos via Content Delivery Networks (CDNs), a common practice among social media websites, makes intra-domain WPF highly feasible. The network traffic generated while rendering a social media page exhibits temporal and volumetric patterns that machine learning algorithms can readily recognize. We characterize such patterns as CDN bursts and use features extracted from them to enable classification algorithms to achieve high classification accuracy (96%) and a low false positive rate (0.02%).

Conference Title

WWW '21: Proceedings of the World Wide Web Conference

Rights Statement

© 2021 Association for Computing Machinery. This is an Open Access article posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in WWW '21: Proceedings of the Web Conference 2021, ISBN: 978-1-4503-8312-7, https://doi.org/10.1145/3442381.3450008

Subject

Information systems

Citation

Wang, K; Zhang, J; Bai, G; Ko, R; Dong, JS, It's not just the site, it's the contents: Intra-domain fingerprinting social media websites through CDN bursts, WWW '21: Proceedings of the World Wide Web Conference, 2021, pp. 2142-2153