SCEVD: Semantic-enhanced Code Embedding for Vulnerability Discovery

Author(s)
Gear, Joseph
Xu, Yue
Foo, Ernest
Gauravaram, Praveen
Jadidi, Zahra
Simpson, Leonie
Date
2022
Location

Wuhan, China

Abstract

Source code vulnerability detection is a major goal in security research. In recent years, deep learning methods have been applied to this end; however, the task of embedding code into vector representations as input for deep learning models has yet to be definitively solved. The use of graphs, specifically Abstract Syntax Trees and Code Property Graphs, is a promising research direction for this task, but learning from graphs becomes prohibitively computationally expensive as the graphs grow large. No close examination of intelligent ways to prune this input to only vulnerability-relevant information has yet been performed. Additionally, most existing works focus largely on structural information from graphs, often neglecting information contained within the nodes themselves. We address these gaps in prior research by proposing SCEVD, a deep learning model for vulnerability discovery which utilises semantic information to intelligently select features in source code graphs for learning. It uses information contained within code graph nodes, as well as information about their relationships with one another, to select the code graph features which are most relevant to code vulnerability. We implement SCEVD and conduct experiments using the SARD Juliet test suite, finding that this process of semantic-enhanced code graph feature selection improves vulnerability discovery results.
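
As a rough illustration of the pruning idea described in the abstract, the following Python sketch keeps only those code graph nodes whose own text looks vulnerability-relevant, together with their direct neighbours. It is not the SCEVD implementation: the CodeGraph structure, the SENSITIVE_TOKENS list, the token-counting score and the one-hop neighbourhood rule are all assumptions made for this example, and the actual model may select features in a very different way.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

# Hypothetical list of vulnerability-relevant tokens (an assumption, not from the paper).
SENSITIVE_TOKENS = {"strcpy", "memcpy", "malloc", "free", "gets"}


@dataclass
class CodeGraph:
    """Toy code graph: node id -> node text, plus directed edges."""
    labels: Dict[int, str]
    edges: Set[Tuple[int, int]] = field(default_factory=set)

    def neighbours(self, node: int) -> Set[int]:
        """Nodes connected to `node` by an edge in either direction."""
        found = set()
        for src, dst in self.edges:
            if src == node:
                found.add(dst)
            elif dst == node:
                found.add(src)
        return found


def node_score(label: str) -> int:
    """Node-level information: count sensitive tokens in the node text."""
    for ch in "(),;":
        label = label.replace(ch, " ")
    return sum(1 for token in label.split() if token in SENSITIVE_TOKENS)


def prune(graph: CodeGraph) -> CodeGraph:
    """Keep nodes with a positive score and their one-hop neighbours."""
    seeds = {n for n, text in graph.labels.items() if node_score(text) > 0}
    keep = set(seeds)
    for n in seeds:
        # Relationship information: pull in nodes linked to a relevant node.
        keep |= graph.neighbours(n)
    return CodeGraph(
        labels={n: graph.labels[n] for n in keep},
        edges={(s, d) for s, d in graph.edges if s in keep and d in keep},
    )


# Hypothetical graph for a small C function; edge (1, 2) stands for a data dependency.
example = CodeGraph(
    labels={
        0: "func copy",
        1: "char buf[8]",
        2: "strcpy(buf, src)",
        3: "printf(done)",
        4: "return 0",
    },
    edges={(0, 1), (0, 2), (0, 3), (0, 4), (1, 2)},
)
pruned = prune(example)
print(sorted(pruned.labels))
```

In this toy case the strcpy node is the only seed, so the pruned graph retains it, the buffer declaration it depends on, and the enclosing function node, while the printf and return nodes are dropped. A learned relevance score could replace the fixed token list without changing the overall shape of the procedure.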

Conference Title

2022 IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)

Subject

Data security and protection

code representation

deep learning

source code semantics

vulnerability discovery

Citation

Gear, J; Xu, Y; Foo, E; Gauravaram, P; Jadidi, Z; Simpson, L, SCEVD: Semantic-enhanced Code Embedding for Vulnerability Discovery, 2022 IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), 2022, pp. 1522-1527