Automated feature engineering for HTTP tunnel detection

Loading...
Thumbnail Image
File version

Accepted Manuscript (AM)

Author(s)
Davis, JJ
Foo, E
Griffith University Author(s)
Primary Supervisor
Other Supervisors
Editor(s)
Date
2016
Size
File type(s)
Location
Abstract

Generating discriminative input features is a key requirement for achieving highly accurate classifiers. The process of generating features from raw data is known as feature engineering and it can take significant manual effort. In this paper we propose automated feature engineering to derive a suite of additional features from a given set of basic features with the aim of both improving classifier accuracy through discriminative features, and to assist data scientists through automation. Our implementation is specific to HTTP computer network traffic. To measure the effectiveness of our proposal, we compare the performance of a supervised machine learning classifier built with automated feature engineering versus one using human-guided features. The classifier addresses a problem in computer network security, namely the detection of HTTP tunnels. We use Bro to process network traffic into base features and then apply automated feature engineering to calculate a larger set of derived features. The derived features are calculated without favour to any base feature and include entropy, length and N-grams for all string features, and counts and averages over time for all numeric features. Feature selection is then used to find the most relevant subset of these features. Testing showed that both classifiers achieved a detection rate above 99.93% at a false positive rate below 0.01%. For our datasets, we conclude that automated feature engineering can provide the advantages of increasing classifier development speed and reducing development technical difficulties through the removal of manual feature engineering. These are achieved while also maintaining classification accuracy.

Journal Title

Computers and Security

Conference Title
Book Title
Edition
Volume

59

Issue
Thesis Type
Degree Program
School
Publisher link
Patent number
Funder(s)
Grant identifier(s)
Rights Statement
Rights Statement

© 2016 Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Licence (http://creativecommons.org/licenses/by-nc-nd/4.0/) which permits unrestricted, non-commercial use, distribution and reproduction in any medium, providing that the work is properly cited.

Item Access Status
Note
Access the data
Related item(s)
Subject

Information and computing sciences

Persistent link to this record
Citation

Davis, JJ; Foo, E, Automated feature engineering for HTTP tunnel detection, Computers and Security, 2016, 59, pp. 166-185

Collections