Intrusion Detection in Computer Systems Using Multiple Classifier Systems

Multiple Classifier Systems (MCS) have been applied successfully in many different research fields, among them the detection of intrusions in computer systems. As an example, in the intrusion detection field, MCS may be motivated by the presence of differ


I. Corona et al.: Intrusion Detection in Computer Systems Using Multiple Classifier Systems, Studies in Computational Intelligence (SCI) 126, 91–113 (2008) © Springer-Verlag Berlin Heidelberg 2008 www.springerlink.com

Intrusion Detection Systems (IDS) are employed to detect patterns related to computer system attacks. Attacks (respectively, legitimate actions) may be defined as actions, performed by users accessing computer system services and resources, that deviate from (respectively, conform to) the use for which those services and resources have been deployed. Such actions are evaluated through their measurable features. Depending on the type of input data, two types of IDS are currently used: network-based IDS analyze the traffic of a computer network, whereas host-based IDS analyze audit data recorded by networked hosts. Traditionally, the design of IDS relies on expert, hand-written definitions of models describing either legitimate or attack patterns in computer systems [24, 30]. Depending on the intrusion detection model employed, an alert is produced if a pattern is not included in the model of legitimate action patterns (anomaly-based IDS), or if it is included in the models of attacks (misuse-based or signature-based IDS). In addition to requiring human expertise to model legitimate or attack patterns, this approach is costly in time and effort if its effectiveness, that is, high detection rates and low false alarm rates, is to be guaranteed. Finally, hand-written models offer poor detection of novel (zero-day) attacks. This is especially true for attack pattern models, which, by definition, describe only known attacks. Conversely, anomaly-based IDS are, in principle, able to detect new attacks. The problem in this case is that it is difficult to produce effective models of legitimate patterns by hand.

To cope with these problems, the intrusion detection task has been formulated as a pattern recognition task based on machine learning algorithms. However, a large number of patterns related to computer system events, at different places and abstraction levels, have to be analyzed. Consequently, many features have to be considered during the intrusion detection task. For example, sequences of operating system calls, running applications, open IP ports, web server logs, logged-in users, file property changes, and database logs are host-side features. Furthermore, network traffic, i.e., the packets exchanged between different hosts, must be analyzed. In particular, many protocols, at different abstraction levels and with different semantics, have to be considered: ICMP, IP, TCP, UDP, HTTP, FTP, SMTP, and IMAP are some examples. Patterns extracted from this heterogeneous domain are very difficult to characterize. In fact, it is very difficult to take the domain knowledge as a whole into account in the design of the classifier. In general, as the number of features increases, the pattern recognition task becomes more complex and less tractable.
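The two alert rules described above, anomaly-based and misuse-based, can be illustrated with a minimal Python sketch. It is not taken from the chapter: the request strings, the names `LEGITIMATE_MODEL` and `ATTACK_SIGNATURES`, and the reduction of a "pattern" to a plain string are all assumptions made for illustration. Real hand-written models are far richer than set membership.

```python
from typing import AbstractSet

# Hypothetical hand-written models (illustrative assumptions, not real data).
# Each pattern is simplified to a single request string.
LEGITIMATE_MODEL: AbstractSet[str] = frozenset(
    {"GET /index.html", "GET /about.html"}
)
ATTACK_SIGNATURES: AbstractSet[str] = frozenset(
    {"GET /../../etc/passwd"}
)


def anomaly_alert(pattern: str, legitimate: AbstractSet[str]) -> bool:
    """Anomaly-based rule: alert if the pattern is NOT in the legitimate model."""
    return pattern not in legitimate


def misuse_alert(pattern: str, signatures: AbstractSet[str]) -> bool:
    """Misuse (signature) based rule: alert if the pattern IS in the attack model."""
    return pattern in signatures


if __name__ == "__main__":
    observed = [
        "GET /index.html",           # legitimate and modeled
        "GET /../../etc/passwd",     # known attack
        "GET /cgi-bin/new-exploit",  # hypothetical novel attack, no signature
        "GET /contact.html",         # legitimate but missing from the model
    ]
    for pattern in observed:
        print(
            pattern,
            "| anomaly:", anomaly_alert(pattern, LEGITIMATE_MODEL),
            "| misuse:", misuse_alert(pattern, ATTACK_SIGNATURES),
        )
```

In this toy setting the misuse rule stays silent on the novel attack, while the anomaly rule flags it but also raises a false alarm on the legitimate request that the hand-written model happens to omit, which mirrors the trade-off discussed in the text.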
Furthermore, the larger the number of features, the larger the number of trainin