Inferring Phishing Intention via Webpage Appearance and Dynamics: A Deep Vision Based Approach

File version

Version of Record (VoR)

Author(s)
Liu, R
Lin, Y
Yang, X
Ng, SH
Divakaran, DM
Dong, JS
Date
2022
Location

Boston, USA

Abstract

Explainable phishing detection approaches are usually reference-based, i.e., they compare a suspicious webpage against a reference list of webpages belonging to commonly targeted legitimate brands. If a webpage is detected as similar to a referenced website but its domain does not match that website's domain, a phishing alert is raised with an explanation naming the targeted brand. Compared to other techniques, such explainable reference-based solutions are more robust to ever-changing phishing webpages. However, webpage similarity is still measured by representations that convey only partial intentions (e.g., screenshot and logo), which (i) incurs considerable false positives and (ii) gives an adversary opportunities to compromise user confidence in the approaches. In this work, we propose PhishIntention, which extracts the precise phishing intention of a webpage by (i) visually extracting its brand intention and credential-taking intention, and (ii) interacting with the webpage to confirm the credential-taking intention. We design PhishIntention as a heterogeneous system of deep learning vision models, overcoming various technical challenges. The models "look at" and "interact with" the webpage to infer its intention, which makes them robust to potential HTML obfuscation. We compare PhishIntention with four state-of-the-art reference-based approaches on the largest phishing identification dataset consisting of 50K phishing and benign webpages. At a similar level of recall, PhishIntention achieves significantly higher precision than the baselines. Moreover, we conduct a continuous two-month field study on the Internet to discover emerging phishing webpages. PhishIntention detects 1,942 new phishing webpages (1,368 not reported by VirusTotal). Compared to the best baseline, PhishIntention generates 86.5% fewer false alerts (139 vs. 1,033 false positives) while detecting a similar number of real phishing webpages.
Our models and code are available at https://github.com/lindsey98/PhishIntention.git.
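
The reference-based decision rule described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the reference list, the `PageObservation` fields, and the function names are hypothetical, and the two visual signals (brand intention and credential-taking intention) are assumed to have already been produced by some vision models. The sketch only shows how an alert could follow from a brand match, a confirmed credential-taking intention, and a domain mismatch.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse

# Hypothetical reference list: brand name -> domains that brand legitimately uses.
REFERENCE_DOMAINS = {
    "ExampleBank": {"examplebank.com"},
    "ExamplePay": {"examplepay.com", "examplepay.net"},
}


@dataclass
class PageObservation:
    """Signals assumed to come from upstream vision models (not modelled here)."""
    url: str
    matched_brand: Optional[str]  # brand recognised on the page, or None
    takes_credentials: bool       # whether the page asks for credentials


def domain_aligned(url: str, brand: str) -> bool:
    """True if the URL's host is a reference domain of the brand or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(
        host == domain or host.endswith("." + domain)
        for domain in REFERENCE_DOMAINS.get(brand, set())
    )


def phishing_alert(obs: PageObservation) -> Optional[str]:
    """Return an explanation string if the page looks like phishing, else None."""
    if obs.matched_brand is None:
        return None  # no brand intention: nothing to impersonate
    if not obs.takes_credentials:
        return None  # no credential-taking intention
    if domain_aligned(obs.url, obs.matched_brand):
        return None  # the brand's own domain: treated as benign
    return f"Page imitates {obs.matched_brand} but is hosted on an unrelated domain"


suspicious = PageObservation(
    "http://examplebank.com.login-check.example/signin", "ExampleBank", True
)
legitimate = PageObservation("https://www.examplebank.com/login", "ExampleBank", True)
print(phishing_alert(suspicious))
print(phishing_alert(legitimate))
```

Requiring both a brand match and a credential-taking intention before raising an alert reflects the abstract's argument that partial signals alone (e.g., a logo) incur false positives. The suffix check used for domain alignment is a simplification chosen for this sketch.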

Conference Title

Proceedings of the 31st USENIX Security Symposium, Security 2022

Rights Statement

© 2022 The Author(s). The attached file is reproduced here in accordance with the copyright policy of the publisher. Please refer to the conference's website for access to the definitive, published version.

Subject

Information and computing sciences

Software engineering

Cybersecurity and privacy

Computer vision

Citation

Liu, R; Lin, Y; Yang, X; Ng, SH; Divakaran, DM; Dong, JS, Inferring Phishing Intention via Webpage Appearance and Dynamics: A Deep Vision Based Approach, Proceedings of the 31st USENIX Security Symposium, Security 2022, 2022, pp. 1633-1650