
ORIGINAL ARTICLE

SemiDroid: a behavioral malware detector based on unsupervised machine learning techniques using feature selection approaches

Arvind Mahindru1,2 · A. L. Sangal1

Received: 22 October 2019 / Accepted: 5 November 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract With the exponential growth in Android apps, Android-based devices are increasingly targeted by attackers in the “silent battle” of cybernetics. Protecting Android-based devices from malware has become more complex and crucial for academicians and researchers. The main vulnerability lies in the underlying permission model of Android apps: apps demand a permission or a set of permissions at the time of their installation. In this study, we consider permissions and API calls as features that help in developing a model for malware detection. To select appropriate features or feature sets from thirty different categories of Android apps, we implemented ten distinct feature selection approaches. Using the selected feature sets, we developed distinct models with five different unsupervised machine learning algorithms. We conducted an experiment on 500,000 distinct Android apps belonging to thirty distinct categories. Empirical results reveal that the model built with rough set analysis as the feature selection approach and farthest first as the machine learning algorithm achieved the highest detection rate, 98.8%, for malware in real-world apps.

Keywords Android apps · Permissions model · API calls · Unsupervised · Feature selection · Intrusion detection · Cyber security · Smartphone
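The farthest-first algorithm named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not say which toolkit or distance measure was used. The sketch assumes that each app is a binary vector marking whether a permission is requested, and that Hamming distance compares two apps. The function names, the choice of the first center, and the toy data are all hypothetical.

```python
from typing import List, Sequence, Tuple


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions where two binary feature vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def farthest_first(
    points: List[Sequence[int]], k: int, first: int = 0
) -> Tuple[List[int], List[int]]:
    """Choose k centers by farthest-first traversal, then label every point
    with the position (0..k-1) of its nearest center.

    Returns (indices of the chosen centers, cluster label per point).
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")

    centers = [first]  # assumption: start from a fixed point, not a random one
    # dist[i] holds the distance from point i to its nearest chosen center.
    dist = [hamming(p, points[first]) for p in points]

    while len(centers) < k:
        # The next center is the point farthest from all current centers.
        nxt = max(range(len(points)), key=dist.__getitem__)
        centers.append(nxt)
        dist = [min(d, hamming(p, points[nxt])) for d, p in zip(dist, points)]

    labels = [
        min(range(k), key=lambda c: hamming(p, points[centers[c]]))
        for p in points
    ]
    return centers, labels


# Hypothetical toy data; columns: SEND_SMS, READ_CONTACTS, INTERNET, CAMERA
apps = [
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
]
centers, labels = farthest_first(apps, k=2)
print(centers, labels)
```

The cluster labels carry no meaning by themselves: deciding which cluster corresponds to malware needs an extra step, such as comparing clusters against a small set of apps with known labels.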

* Arvind Mahindru [email protected]

1 Department of Computer Science and Engineering, Dr. B.R. Ambedkar National Institute of Technology, Jalandhar 144011, India

2 Department of Computer Science and Applications, D.A.V. University, Sarmastpur, Jalandhar 144012, India

1 Introduction

Detecting malware on smartphones has become a major concern for the research community. At the end of 2019, the number of Android users will be 3.3 billion worldwide.1 Android is based on the Linux kernel and provides useful services such as security configuration, process management, and others. The primary reason for the growth of the Android operating system is its open nature and freely available apps. At the end of July 2019,2 Android had 2.7 billion free and paid apps in its Play Store. App downloads from the Google Play Store have increased by 13%3 with respect to previous years. The Android operating system is based on the principle of privilege separation, in which each app has its own distinct system identity, i.e., a group-ID and a Linux user-ID. Each app runs in a process sandbox and requests permission to use resources that are not available within its sandbox. Depending on the sensitivity of the permission, the system either grants it automatically or prompts the user to approve or reject the request. Permissions granted by users include access to the calendar, camera, body s