Detecting Code Reuse in Android Applications Using Component-Based Control Flow Graph




Abstract. Smartphones and mobile devices have recently gained enormous popularity thanks to their feature-rich applications (or apps). Because Android apps are easy to repackage, software plagiarism has become a serious problem. In this paper, we present DroidSim, an accurate and robust system for detecting code reuse. DroidSim computes its similarity score solely from the component-based control flow graph (CB-CFG), a graph whose nodes are Android APIs and whose edges represent control-flow precedence order within each Android component. Our system can be applied to detect repackaged apps and malware variants. We evaluate DroidSim on 121 apps and 706 malware variants. The results show that our system has no false negatives and a false positive rate of 0.83% for repackaged apps, and a detection ratio of 96.60% for malware variants. In addition, we use ADAM to obfuscate apps, and the results show that this obfuscation has no influence on our system.

Keywords: Mobile Applications, Code Reuse, Repackaging, Malware Variants.
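To make the CB-CFG idea concrete, the following is a minimal sketch, not DroidSim's actual algorithm. It assumes each component is given as a list of API-call sequences (one per control-flow path), derives precedence edges from consecutive calls, and compares apps with a Jaccard index over edge sets. The Jaccard measure, the function names, and the sample API strings are all illustrative assumptions; the abstract does not specify how the similarity score is computed.

```python
# Hypothetical sketch: API-level graphs per component, compared by edge overlap.
# Jaccard similarity is a stand-in here, not the scoring method of DroidSim.

def cb_cfg_edges(api_paths):
    """Return the set of directed edges (api_a, api_b) for one component.

    `api_paths` is a list of API-call sequences, one per control-flow path;
    an edge means api_a is called immediately before api_b on some path.
    """
    edges = set()
    for path in api_paths:
        edges.update(zip(path, path[1:]))
    return edges


def jaccard(a, b):
    """Jaccard index of two edge sets; two empty graphs count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def app_similarity(app_a, app_b):
    """Average, over components of app_a, of the best-matching component in app_b.

    Each app is a dict mapping a component name to its list of API paths.
    Component names are ignored, so renaming classes does not change the score.
    """
    if not app_a or not app_b:
        return 0.0
    graphs_b = [cb_cfg_edges(paths) for paths in app_b.values()]
    best = [
        max(jaccard(cb_cfg_edges(paths), g) for g in graphs_b)
        for paths in app_a.values()
    ]
    return sum(best) / len(best)


# Illustrative data: the "repackaged" app renames the component and adds an ad path.
original = {
    "MainActivity": [
        ["getDeviceId", "openConnection", "write"],
        ["getDeviceId", "sendTextMessage"],
    ]
}
repackaged = {
    "a.b": [
        ["getDeviceId", "openConnection", "write"],
        ["getDeviceId", "sendTextMessage"],
        ["loadAd", "show"],
    ]
}

score = app_similarity(original, repackaged)
```

In this toy input the two apps share three of four distinct edges, so the score stays high even though the component was renamed and an extra path was added, which is the kind of robustness an API-level graph is meant to provide.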

1 Introduction

Smartphones play an increasingly important role in people's lives thanks to the abundant, feature-rich smartphone applications (or apps) that people can download from app repositories such as Apple's App Store [2] and Google's Google Play [3]. Recent statistics show that, as of the second quarter of 2013, Android dominated the mobile device market with a 79.3% share, while the next closest platform, iOS, accounted for 13.2% [1]. Google Play has officially passed 1 million apps and has outgrown the App Store [4]. Since users are no longer satisfied with a few functions such as making phone calls or sending messages, they are willing to browse and download apps that meet their other demands. Code reuse occurs when different apps share the same code. It is often found in repackaged apps and malware variants. Users browse and download apps from markets. Developers submit apps to markets to make them available to users and, in turn, profit from them. A healthy ecosystem thus comes into being.

N. Cuppens-Boulahia et al. (Eds.): SEC 2014, IFIP AICT 428, pp. 142–155, 2014. © IFIP International Federation for Information Processing 2014

Unfortunately, this ecosystem is threatened mainly by repackaged apps. A repackaged app emerges when a plagiarist unpacks a legitimate app, modifies some of its code, and redistributes it, violating the original developer's intellectual property. Developers can charge for their apps directly, but many instead offer free apps and earn revenue from in-app billing or third-party ad libraries. Apps are repackaged for two reasons. First, a plagiarist can modify the ad library's client ID or embed new ad libraries to steal or re-route ad revenue [10]. Second, malicious payloads or exploits may be injected into popular apps to increase propagation. Once installed, such apps can leak private information, send messages to premium numbers