A Static Birthmark of Binary Executables Based on API Call Structure
Abstract. A software birthmark is a unique characteristic of a program that can be used for software theft detection. In this paper we propose and empirically evaluate a static birthmark of binary executables based on API call structure. The program properties employed in this birthmark are functions and the standard API calls made when those functions are executed. The API calls of a function include those found explicitly in the function and in its descendants within a limited depth in the call graph. To statically identify functions, call graphs and API calls, we use the IDA Pro disassembler and its plug-ins. We define the similarity between two functions as the ratio of the number of common API calls to the number of all API calls. The similarity between two programs is obtained by maximum weight bipartite matching between the two programs using the function similarity matrix. To show the credibility of the proposed technique, we compare different versions of the same applications as well as various types of applications, including text editors, picture viewers, multimedia players, P2P applications and FTP clients. To show its resilience, we compare binary executables compiled with various compilers. The empirical results show that the similarities obtained using our birthmark sufficiently indicate the functional and structural similarities among programs.
Keywords: software piracy, software birthmark, binary analysis.
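The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: API calls are modeled as sets (so "all API calls" becomes the set union), an empty pair of functions gets similarity 0, the program score is normalized by the size of the larger program, and the matching is found by brute force over permutations, which is only feasible for a handful of functions. The names `collect_api_calls`, `function_similarity` and `program_similarity` are hypothetical, as are the example call graph and function names.

```python
from itertools import permutations


def collect_api_calls(call_graph, direct_calls, func, max_depth):
    """Return the API calls of func and of its descendants up to max_depth.

    call_graph maps a function name to the functions it calls;
    direct_calls maps a function name to the API calls found in its body.
    """
    seen = {func}
    frontier = [func]
    apis = set(direct_calls.get(func, ()))
    for _ in range(max_depth):
        next_frontier = []
        for caller in frontier:
            for callee in call_graph.get(caller, ()):
                if callee not in seen:
                    seen.add(callee)
                    next_frontier.append(callee)
                    apis |= set(direct_calls.get(callee, ()))
        frontier = next_frontier
    return apis


def function_similarity(apis_a, apis_b):
    """Common API calls divided by all API calls (assumed: set union)."""
    union = apis_a | apis_b
    if not union:
        return 0.0  # assumption: two functions without API calls score 0
    return len(apis_a & apis_b) / len(union)


def program_similarity(funcs_a, funcs_b):
    """Maximum weight bipartite matching over the function similarity matrix.

    funcs_a and funcs_b are lists of API-call sets, one set per function.
    Brute force over permutations: exact, but only usable for tiny inputs.
    Assumption: the total weight is normalized by the larger program size.
    """
    if not funcs_a or not funcs_b:
        return 0.0
    small, large = sorted((funcs_a, funcs_b), key=len)
    matrix = [[function_similarity(a, b) for b in large] for a in small]
    best = max(
        sum(matrix[i][j] for i, j in enumerate(cols))
        for cols in permutations(range(len(large)), len(small))
    )
    return best / len(large)


# Hypothetical example: two tiny "programs" with made-up function names.
call_graph = {"open_doc": ["read_all"], "save_doc": []}
direct_calls = {
    "open_doc": ["CreateFileW"],
    "read_all": ["ReadFile", "CloseHandle"],
    "save_doc": ["CreateFileW", "WriteFile"],
}
program_a = [
    collect_api_calls(call_graph, direct_calls, name, max_depth=1)
    for name in ("open_doc", "save_doc")
]
program_b = [{"CreateFileW", "WriteFile"}, {"CreateFileW", "ReadFile"}]
print(program_similarity(program_a, program_b))
```

For programs with realistic numbers of functions, the brute-force search would have to be replaced by a polynomial-time assignment algorithm such as the Hungarian method, for example `scipy.optimize.linear_sum_assignment` with `maximize=True`.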
1 Introduction
Recently, a large amount of software has been developed in the form of open source projects. Most open source projects are distributed under a software license. A widely used license for open source software is the GNU General Public License (GPL). The GPL allows developers to use software freely, but requires new projects that use the original work to be licensed under the GPL as well. There are also more permissive software licenses, such as the MIT license and the BSD licenses, which allow the original source code to be combined into commercial software. The permissive licenses, however, require the copyright notice of the original software to be included.
This work was supported by the Korea Science and Engineering Foundation (KOSEF) through the Advanced Information Technology Research Center (AITrc).
I. Cervesato (Ed.): ASIAN 2007, LNCS 4846, pp. 2–16, 2007. © Springer-Verlag Berlin Heidelberg 2007
It has been reported that many companies use open source software for commercial purposes without permission. When source code is available, code theft can be detected with well-known plagiarism detection tools such as MOSS, JPlag and YAP [1,2,3]. Suppose, however, that source code under the GPL is included in commercial software that is distributed as compiled binaries without the copyright notice of the original software. In this case, we need to determine from the binary executables whether the open source code was used. Software birthmarking is one technique for addressing such software theft problems. A software birthmark is unique characteristics