A two-pass approach for minimising error in synthetically generated network traffic data sets

File version

Accepted Manuscript (AM)

Author(s)
Soper, J
Xu, Y
Nguyen, K
Foo, E
Jadidi, Z
Date
2023
Location

Melbourne, Australia

Abstract

Network security research requires network traffic data sets of sufficient size, variety, and completeness in order to perform tasks such as training intrusion detection systems. While the standard is to use testbeds to create data sets or capture data sets from real systems, Generative Adversarial Networks have proven successful in generating new packet samples for protocols such as ICMP, DNS, HTTP, and SIP. However, existing approaches have problems with quality evaluation due to insufficient sampling, or they require non-generalised criteria to be created specifically for the data set being trained on. This paper proposes a new and generalised two-pass approach to evaluating the quality of samples produced by the generator to produce a filtered, higher-quality output data set. Compared against SIP-GAN, which is a Generative Adversarial Network model targeting Session Initiation Protocol samples, we reduced the ratio of malformed SIP samples from between 9.6% and 19.8% down to 1.2%.

Conference Title

ACSW '23: Proceedings of the 2023 Australasian Computer Science Week

Rights Statement

© 2023. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACSW '23: Proceedings of the 2023 Australasian Computer Science Week, 979-8-4007-0005-7, http://doi.org/10.1145/3579375.3579378

Subject

Data management and data science

Information and computing sciences

Citation

Soper, J; Xu, Y; Nguyen, K; Foo, E; Jadidi, Z, A two-pass approach for minimising error in synthetically generated network traffic data sets, ACSW '23: Proceedings of the 2023 Australasian Computer Science Week, 2023, pp. 18-27