Malware Detection Using Machine Learning


1 NIT Patna, Patna, India
{ajayk.phd18.cs,kumar.abhishek.cse}@nitp.ac.in
2 Veermata Jijabai Technological Institute, Matunga, Mumbai, India
{kshah_b18,panerurkar_p17}@ce.vjti.ac.in, {dspatel_b17,ysjain_b17,hkchheda_b17}@it.vjti.ac.in
3 NMIMS Mumbai, Mumbai, India

Abstract. Decision making using machine learning can be applied efficiently to security. Malware has become a major risk today. To provide protection against it, we present a machine-learning-based technique for classifying Windows PE files as benign or malicious based on fifty-seven of their attributes. We used the Brazilian Malware dataset, which contains around 100,000 samples and 57 attributes. We built seven models and achieved 99.7% accuracy with the Random Forest model, which is very high compared to other existing systems. Thus, using the Random Forest model, one can decide whether a particular file is malware or benign.

Keywords: Security · Malware · Machine learning

1 Introduction

Decision making is an important process in almost all domains today. Cyber security is one field in which we need to decide whether a file is malware or benign. Malware, short for ‘malicious software’, is any software, program, or script that can harm a computing system, typically by carrying out illegal tasks without the owner’s permission. These malicious programs can perform many tasks, including stealing, encoding, or outright deleting sensitive data, modifying or hijacking basic system functionality, and spying on the actions performed by humans or human-driven software on a computer [7,26]. In the last decade, the number of newly discovered malware samples has risen exponentially. Left uncontrolled, malware can have disastrous effects on a system. Therefore, we should always have capable protection against such malicious programs.

© Springer Nature Switzerland AG 2020. B. Villazón-Terrazas et al. (Eds.): KGSWC 2020, CCIS 1232, pp. 61–71, 2020. https://doi.org/10.1007/978-3-030-65384-2_5

Currently, prevalent anti-malware mechanisms can be grouped into three kinds: signature-based, behaviour-based, and heuristics-based techniques. However, they cannot handle malware whose definitions are not yet known, or the latest malware, which is frequently developed after the discovery of a zero-day exploit. The number and types of malware are multiplying daily, so a dynamic system is required to classify files as “malware” or “benign”. According to statistics published by Computer Economics¹, the financial damage caused by malware attacks rose from 3.3 billion dollars in 1997 to 13.3 billion dollars in 2006, a roughly fourfold increase. The definition of the “Year of Mega Breach” must be revised every few years to account for the attacks carried out in that particular year [2]. The static features of malware when it is n
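The limitation of signature-based detection mentioned above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a signature is the SHA-256 hash of the whole file; real anti-malware engines use richer signatures such as byte patterns. The helper names and the sample byte strings are hypothetical and do not come from the paper.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_signature_db(known_samples):
    """Build a set of signatures (here: whole-file hashes) from known samples."""
    return {sha256_hex(sample) for sample in known_samples}


def is_known_malware(data: bytes, signature_db) -> bool:
    """Flag a file only if its hash exactly matches a stored signature."""
    return sha256_hex(data) in signature_db


# Hypothetical byte strings standing in for file contents (assumption).
known_sample = b"MZ\x90\x00 hypothetical malicious payload"
# A variant of the same sample that differs by a single appended byte.
variant = known_sample + b"\x00"

signature_db = build_signature_db([known_sample])

known_detected = is_known_malware(known_sample, signature_db)
variant_detected = is_known_malware(variant, signature_db)

print("known sample detected:", known_detected)
print("modified variant detected:", variant_detected)
```

Under this simplification, the known sample is matched, while the variant, which has no stored definition, is missed. This is the gap that a classifier trained on file attributes is meant to address.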
