A First Look at the Effect of Deep Learning in Coverage-guided Fuzzing
Author(s)
Li, S
Lin, Y
Xie, X
Li, Y
Li, X
Ge, W
Liu, Y
Dong, J
Year published
2021
Abstract
Fuzzing has been a widely used technique for discovering software vulnerabilities. Many existing fuzzers leverage coverage feedback to evolve seeds so as to maximize program branch coverage. Recently, some techniques have proposed training deep learning models to predict the branch coverage of an arbitrary input. These techniques have demonstrated success in improving coverage and discovering bugs under different experimental settings. However, deep learning models, usually treated as black boxes, are notoriously lacking in explainability. Moreover, their performance can be sensitive to the runtime coverage information collected for training, indicating potentially unstable performance. To this end, in this work we conduct a systematic and extensive empirical study on 4 types of deep learning models across 6 projects to reproduce the actual performance of deep-learning-based fuzzers, analyze the advantages and disadvantages of applying deep learning to fuzzing, and explore future directions for combining the two. Our empirical results reveal that the deep learning models are effective only in very limited scenarios, largely restrained by training data imbalance, dependent labels, model over-generalization, and the insufficient expressiveness of state-of-the-art models. Consequently, the gradients estimated by the models for covering a branch are less helpful in many scenarios.
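The abstract refers to fuzzers that train a model to predict branch coverage and then use the model's gradients to decide which input bytes to mutate. The sketch below illustrates that general idea only; it is not the authors' or any studied fuzzer's implementation, and the input length, branch count, network shape, and mutation rule are illustrative assumptions.

```python
# Minimal sketch (assumptions throughout): a feed-forward model maps a
# fixed-length byte input to per-branch coverage probabilities, and the
# gradient of a target branch's prediction w.r.t. the input bytes suggests
# which positions to mutate.
import torch
import torch.nn as nn

INPUT_LEN = 512      # fixed-length byte representation of a seed (assumption)
NUM_BRANCHES = 4096  # number of instrumented branches (assumption)

class CoveragePredictor(nn.Module):
    """Normalized input bytes -> predicted probability of covering each branch."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(INPUT_LEN, 2048), nn.ReLU(),
            nn.Linear(2048, NUM_BRANCHES), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)

def gradient_guided_mutation(model, seed_bytes, target_branch, k=8):
    """Mutate the k byte positions with the largest gradient magnitude
    for the target branch's predicted coverage (a common heuristic)."""
    model.eval()
    x = torch.tensor(list(seed_bytes), dtype=torch.float32).unsqueeze(0) / 255.0
    x.requires_grad_(True)
    branch_prob = model(x)[0, target_branch]
    branch_prob.backward()
    grad = x.grad[0]
    top = torch.topk(grad.abs(), k).indices
    mutated = bytearray(seed_bytes)
    for i in top.tolist():
        # Nudge the byte in the direction that increases the predicted probability.
        step = 32 if grad[i] > 0 else -32
        mutated[i] = max(0, min(255, mutated[i] + step))
    return bytes(mutated)

# Usage with an untrained model, purely for illustration:
model = CoveragePredictor()
seed = bytes(INPUT_LEN)
new_input = gradient_guided_mutation(model, seed, target_branch=42)
```

The study's findings concern exactly this kind of pipeline: when training data is imbalanced or labels are dependent, the predicted probabilities and hence the estimated gradients above become unreliable guides for reaching a target branch.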
Conference Title
Proceedings - 2021 36th IEEE/ACM International Conference on Automated Software Engineering, ASE 2021
Subject
Software engineering