Combining RDR-Based Machine Learning Approach and Human Expert Knowledge for Phishing Prediction
School of Engineering and ICT, Hobart, TAS 7005, Australia {David.Chung,renjiec,Soyeon.Han,Byeong.Kang}@utas.edu.au
Abstract. Detecting phishing websites has been noted as a complex and dynamic problem area because of the subjective considerations and ambiguities of the detection mechanism. We propose a novel approach that uses Ripple-down Rule (RDR) to acquire knowledge from human experts, together with a modified RDR model-generating algorithm (Induct RDR) that applies a machine-learning approach. The modified algorithm handles two different data types (numeric and nominal) and also applies information theory from decision tree learning algorithms. Our experimental results show that the proposed approach can help reduce the cost of solving the over-generalization and over-fitting problems of the machine-learning approach. Three models were compared: RDR with machine learning and human knowledge, RDR with machine learning only, and J48 machine learning only. The results show an improvement in the prediction accuracy of the knowledge acquired by machine learning.

Keywords: Phishing prediction · RDR · Knowledge-based system · Machine learning · Decision tree
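As an illustration of what applying decision-tree information theory to numeric and nominal attributes can look like, the following minimal sketch computes the standard entropy-based information gain for both data types. It is not the Induct RDR algorithm itself, whose details are not given in this excerpt. The function names, the toy attributes (`has_ip_in_url`, `url_length`), and the midpoint-threshold strategy for numeric values are assumptions made purely for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gain_nominal(values, labels):
    """Information gain from splitting on every distinct nominal value."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_numeric(values, labels):
    """Best (gain, threshold) over binary splits of the form value <= threshold.

    Candidate thresholds are midpoints between consecutive distinct values
    (an assumption of this sketch, in the style of C4.5-like learners).
    """
    total = len(labels)
    base = entropy(labels)
    distinct = sorted(set(values))
    best_gain, best_threshold = 0.0, None
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        remainder = (len(left) / total) * entropy(left) \
            + (len(right) / total) * entropy(right)
        gain = base - remainder
        if gain > best_gain:
            best_gain, best_threshold = gain, threshold
    return best_gain, best_threshold


# Hypothetical toy data, invented for illustration only.
labels = ["phishing", "phishing", "legitimate", "legitimate"]
has_ip_in_url = ["yes", "yes", "no", "no"]   # nominal attribute
url_length = [90, 75, 20, 30]                # numeric attribute

nominal_gain = gain_nominal(has_ip_in_url, labels)
numeric_gain, numeric_threshold = gain_numeric(url_length, labels)
```

In this toy data both attributes separate the two classes perfectly, so each reaches the maximum gain of one bit. A rule learner could use such scores to choose which condition to add to a rule next.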
1 Introduction
The accelerating growth of Internet-based finance has increased online fraudulent activity in which malicious people try to obtain sensitive information from Internet users, a practice known as phishing. Phishing detection has received great attention, but research has had limited overall success because of the nature of the problem. Detecting phishing websites is complex and hard to analyze because technical and social problems are intertwined [1]. Both machine learning techniques and human expert systems have been applied to acquire and maintain knowledge for phishing website detection and prediction, but the results have not been significant. A large number of knowledge-based systems have been built for acquiring and maintaining the knowledge for detecting and predicting phishing websites. Phishing website detection knowledge was originally acquired from domain experts. However, acquiring knowledge from an expert at a slow pace cannot meet the demand of expanding systems, since a sophisticated expert system may require an extremely large number of rules. This has led to machine learning based approaches as a solution for managing knowledge-based systems. Although machine learning techniques can acquire knowledge from data without the help of a domain expert, an abundance of classifier models exists, and decision tree based algorithms provide the best performance, over-generalization and over-fitting are still significant problems when sufficient training data are not available, so there are not enough patterns for machine learning to find. Therefore, large effort usually has to be undertaken to cover t

© Springer International Publishing Switzerland 2016. R. Booth and M.-L. Zhang (Eds.): PRICAI 2016, LNAI 9810, pp. 80–92, 2016. DOI: 10.1007/978-3-319-42911-3_7