Data representation for CNN based internet traffic classification: a comparative study
Ola Salman1 · Imad H. Elhajj1 · Ayman Kayssi1 · Ali Chehab1
Received: 1 January 2020 / Revised: 6 July 2020 / Accepted: 28 July 2020 / © Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract
It has been well established that the Internet of Things will bring an expansion in traffic volume and types. This will bring new challenges in terms of Quality of Service (QoS) and security, requiring innovative traffic management techniques. Traffic classification is a key network function that helps in managing both QoS and security. Different machine learning based methods have been applied to this end. However, traditional machine learning methods rely on hand-crafted features, which limits the model's ability to learn. Deep Learning (DL), a branch of machine learning, is characterized by its representation learning ability. In this paper, we analyse two methods of data representation for DL-based classification: a raw packet-based representation and a quasi-raw flow-based representation. Different tests are performed to evaluate the robustness of these data representation methods. The tests include feature importance, model robustness, and anonymization tests. The results show that the raw data representation suffers under traffic anonymization and from the fact that many packet fields are data-dependent. On the other hand, the flow-based representation is sensitive to the number of packets used for classification and to traffic obfuscation.
Keywords Deep learning · Internet of things · Traffic classification · Data representation
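To make the two kinds of input concrete, the following is a minimal Python sketch of how a raw packet-based input and a quasi-raw flow-based input might be built. It is not the authors' implementation. The function names, the 64-byte truncation length, the choice of per-packet features (size, inter-arrival time, direction), and the address-masking step are all assumptions made for illustration; this excerpt does not specify them.

```python
from typing import List, Tuple


def packet_to_vector(raw: bytes, length: int = 64) -> List[float]:
    """Raw packet-based input (assumed form): keep the first `length`
    bytes, zero-pad shorter packets, and scale each byte to [0, 1]."""
    clipped = raw[:length]
    padded = clipped + bytes(length - len(clipped))
    return [b / 255.0 for b in padded]


def mask_ipv4_addresses(ip_packet: bytes) -> bytes:
    """Hypothetical anonymization step: zero the source and destination
    addresses, which occupy bytes 12..19 of an IPv4 header."""
    if len(ip_packet) < 20:
        return ip_packet  # too short to hold a full IPv4 header
    return ip_packet[:12] + bytes(8) + ip_packet[20:]


# Assumed per-packet summary: (size in bytes, arrival time in seconds,
# direction as +1 or -1).
Packet = Tuple[int, float, int]


def flow_to_matrix(packets: List[Packet], n_packets: int = 4) -> List[List[float]]:
    """Quasi-raw flow-based input (assumed form): one row per packet for
    the first `n_packets` packets, zero-padded for shorter flows.
    Each row is [size, inter-arrival time, direction]."""
    rows: List[List[float]] = []
    prev_time = None
    for size, arrival, direction in packets[:n_packets]:
        iat = 0.0 if prev_time is None else arrival - prev_time
        rows.append([float(size), iat, float(direction)])
        prev_time = arrival
    while len(rows) < n_packets:
        rows.append([0.0, 0.0, 0.0])
    return rows
```

In this sketch, `n_packets` is the knob that corresponds to the number of packets used for classification, and `mask_ipv4_addresses` shows one way anonymization can remove fields that a model trained on raw bytes may have relied on.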
Ola Salman [email protected] · Imad H. Elhajj [email protected] · Ayman Kayssi [email protected] · Ali Chehab [email protected]
1 American University of Beirut, Beirut 1107 2020, Lebanon
Multimedia Tools and Applications
1 Introduction
The Internet of Things includes a heterogeneous set of connected devices that run different types of applications. These devices and applications will generate different types of traffic, having different requirements in terms of Quality of Service (QoS) and security. Managing both QoS and security in this large-scale network calls for innovative network management techniques [59]. In this context, traffic classification is considered an essential element for traffic engineering, security management, traffic trend analysis, and so on [15]. The ability to classify traffic based on its requirements in terms of bandwidth, latency, throughput, etc., enables the allocation of the corresponding resources to each type of traffic and thus helps guarantee good QoS [4, 5, 7, 24]. On the other hand, traffic classification techniques can be used to detect abnormal traffic [7, 39, 67]. Furthermore, Intrusion Detection Systems (IDSs) use machine learning to identify the attack name or type [28]. Internet traffic consists of the flow of data between the different communication parties. The Internet Protocol (IP) network traffic dominates other Internet traffic typ