SCEVD: Semantic-enhanced Code Embedding for Vulnerability Discovery
Author(s)
Xu, Yue
Foo, Ernest
Gauravaram, Praveen
Jadidi, Zahra
Simpson, Leonie
Location
Wuhan, China
Abstract
Source code vulnerability detection is a major goal in security research. In recent years, deep learning methods have been applied to this end; however, the task of embedding code into vector representations as input for deep learning models has yet to be definitively solved. The use of graphs, specifically Abstract Syntax Trees and Code Property Graphs, is a promising research direction for this task, but learning from graphs becomes prohibitively computationally expensive as the graphs grow large. No close examination of intelligent ways to prune this input to only vulnerability-relevant information has yet been performed. Additionally, most existing works focus largely on structural information from graphs, often neglecting information contained within the nodes themselves. We address these gaps in prior research by proposing SCEVD, a deep learning model for vulnerability discovery which utilises semantic information to intelligently select features in source code graphs for learning. It uses information contained within code graph nodes, as well as information about their relationships with one another, to select the code graph features which are most relevant to code vulnerability. We implement SCEVD and conduct experiments using the SARD Juliet test suite, finding that this process of semantic-enhanced code graph feature selection improves vulnerability discovery results.
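The general idea of selecting graph nodes by their own content plus that of their neighbours can be illustrated with a toy sketch. This is not the SCEVD implementation: it assumes Python's standard `ast` module as a stand-in for the code graphs used in the paper, a hand-written keyword set (`RISKY_NAMES`, hypothetical) in place of learned semantic information, and an arbitrary weighting and threshold.

```python
import ast

# Hypothetical identifiers treated as vulnerability-relevant (an assumption,
# standing in for semantic information a model would learn).
RISKY_NAMES = {"eval", "exec", "system", "popen"}

def node_score(node):
    """Score a node from its own content: 1.0 if it names a risky identifier."""
    name = getattr(node, "id", None) or getattr(node, "attr", None)
    return 1.0 if name in RISKY_NAMES else 0.0

def select_nodes(source, threshold=0.5):
    """Return the type names of AST nodes judged relevant.

    A node's relevance is its own score plus half the best score among its
    direct children, so the relationship to a relevant neighbour also counts.
    """
    tree = ast.parse(source)
    own = {id(n): node_score(n) for n in ast.walk(tree)}
    selected = []
    for node in ast.walk(tree):
        children = ast.iter_child_nodes(node)
        neighbour = max((own[id(c)] for c in children), default=0.0)
        if own[id(node)] + 0.5 * neighbour >= threshold:
            selected.append(type(node).__name__)
    return selected

if __name__ == "__main__":
    sample = "import os\nx = len(data)\nos.system(cmd)\n"
    print(select_nodes(sample))
```

In this sketch only the `os.system` attribute access and the call that wraps it survive pruning; the rest of the tree is discarded before any learning would take place.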
Conference Title
2022 IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)
Subject
Data security and protection
code representation
deep learning
source code semantics
vulnerability discovery
Citation
Gear, J; Xu, Y; Foo, E; Gauravaram, P; Jadidi, Z; Simpson, L, SCEVD: Semantic-enhanced Code Embedding for Vulnerability Discovery, 2022 IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), 2022, pp. 1522-1527