Deep Learning for Coverage-Guided Fuzzing: How Far are We?

File version

Accepted Manuscript (AM)

Author(s)
Li, S
Xie, X
Lin, Y
Li, Y
Feng, R
Li, X
Ge, W
Dong, JS
Date
2022
Abstract

Fuzzing is a widely used technique for discovering software vulnerabilities, and many fuzzers are optimized using coverage feedback. Recently, some techniques have proposed training deep learning (DL) models to predict the branch coverage of an arbitrary input and using the models' always-available gradients, among other signals, as a guide. Those techniques have proved successful in improving coverage and discovering bugs under different experimental settings. However, DL models, usually treated as black boxes, notoriously lack explainability. Moreover, their performance can be sensitive to the runtime coverage information collected for training, which indicates potentially unstable performance. In this work, we conduct a systematic empirical study on 4 types of DL models across 6 projects to (1) revisit the performance of DL models on predicting branch coverage, (2) demystify what specific knowledge the models actually learn, (3) study the scenarios where DL models can outperform and underperform traditional fuzzers, and (4) gain insight into the challenges of applying DL models to fuzzing. Our empirical results reveal that existing DL-based fuzzers do not perform as well as expected, and that their performance is largely affected by the dependencies between branches, the unbalanced sample distribution, and the limited model expressiveness. In addition, the estimated gradient information tends to be less helpful in our experiments. Finally, we pinpoint research directions based on the challenges we summarize.

Journal Title

IEEE Transactions on Dependable and Secure Computing

Rights Statement

© 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Note

This publication has been entered in Griffith Research Online as an advanced online version.

Subject

Applied computing

Cybersecurity and privacy

Distributed computing and systems software

Citation

Li, S; Xie, X; Lin, Y; Li, Y; Feng, R; Li, X; Ge, W; Dong, JS, Deep Learning for Coverage-Guided Fuzzing: How Far are We?, IEEE Transactions on Dependable and Secure Computing, 2022
