A Method for Windows Malware Detection Based on Deep Learning



Xiang Huang 1 & Li Ma 1 & Wenyin Yang 1 & Yong Zhong 1

Received: 29 June 2020 / Revised: 27 July 2020 / Accepted: 5 August 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract As the Internet develops rapidly, malware continues to grow in both variety and volume, and its techniques for evading security software are becoming increasingly advanced. This paper proposes a malware detection method based on deep learning that combines malware visualization with a convolutional neural network whose structure is based on the VGG16 network. We also propose a hybrid visualization of malware that combines static and dynamic analysis. In hybrid visualization, we use the Cuckoo Sandbox to carry out dynamic analysis on the samples, convert the dynamic analysis results into a visualization image according to a designed algorithm, and train the neural network on static and hybrid visualization images. Finally, we test the performance of the proposed malware detection method, evaluating its effectiveness at detecting unknown malware.

Keywords Cybersecurity · Malware detection · Malware image · Convolutional neural network
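As a rough illustration of what static malware visualization can look like, the sketch below maps each byte of a file to one grayscale pixel and arranges the pixels row by row. This byte-to-pixel scheme is a common convention and is only an assumption here; the abstract does not specify the paper's own conversion algorithm. The function name `bytes_to_grayscale`, the `width` parameter, and the sample bytes are hypothetical.

```python
from typing import List


def bytes_to_grayscale(data: bytes, width: int = 16) -> List[List[int]]:
    """Map each byte (0-255) to one grayscale pixel, row by row.

    The last row is zero-padded so every row has exactly `width` pixels.
    This is an illustrative scheme, not the paper's algorithm.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    padding = (-len(data)) % width  # bytes needed to complete the final row
    padded = data + bytes(padding)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


if __name__ == "__main__":
    # Hypothetical input: a few bytes resembling the start of a PE file,
    # not a real malware sample.
    sample = b"MZ\x90\x00\x03\x00\x00\x00\x04\x00"
    for row in bytes_to_grayscale(sample, width=4):
        print(row)
```

An image produced this way would typically be resized to the fixed input size a convolutional network expects before training; that step is omitted here.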

* Li Ma
1 School of Electronic and Information Engineering, Foshan University, Foshan, China

1 Introduction

In recent years, the Internet has continued to develop rapidly, and it plays an increasingly important role in the general public’s work and life. Meanwhile, malicious software (malware) on the Internet is also proliferating, posing a great threat to the information security of Internet users.

The earliest (and still widely used) malware detection technology is based on “signatures”, which are simply code sequences in a malware binary that uniquely identify it [1]. To detect malware, signatures are added to an antivirus program, which then matches them against the files it is scanning. If a match is found in a file, the file is determined to be malicious. Signature-based detection has the advantage of rarely producing false positives, because signatures are derived manually by professional malware analysts. However, it has a major drawback: it is only capable of detecting known malware [2]. When a new piece of malware appears, it takes time for its signatures to be extracted and added to antivirus programs, leaving users unprotected for a certain period.

To address this problem, researchers proposed heuristic detection [3]. Heuristic detection was designed to detect unknown malware by spotting suspicious characteristics in a program. For example, if a program is protected by a rarely seen “packer”, it is determined to be suspicious. This technique can identify some new malware, at the cost of higher false positive rates.

In light of the shortcomings of signature-based and heuristic detection, researchers and cybersec