Combining RDR-Based Machine Learning Approach and Human Expert Knowledge for Phishing Prediction



School of Engineering and ICT, Hobart, TAS 7005, Australia {David.Chung,renjiec,Soyeon.Han,Byeong.Kang}@utas.edu.au

Abstract. Detecting phishing websites has been noted as a complex and dynamic problem area because of the subjective considerations and ambiguities of the detection mechanism. We propose a novel approach that uses Ripple-down Rules (RDR) to acquire knowledge from human experts together with a modified RDR model-generating algorithm (Induct RDR), which applies a machine learning approach. The modified algorithm handles two different data types (numeric and nominal) and also applies information theory from decision tree learning algorithms. Our experimental results show that the proposed approach can help reduce the cost of solving the over-generalization and over-fitting problems of the machine learning approach. Three models were compared: RDR with machine learning and human knowledge, RDR with machine learning only, and J48 machine learning only. The results show improvements in the prediction accuracy of the knowledge acquired by machine learning.

Keywords: Phishing prediction · RDR · Knowledge-based system · Machine learning · Decision tree
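To make the information-theoretic criterion for the two data types more concrete, the following is a minimal sketch, assuming the plain entropy-based information gain that underlies decision tree learners. It is not the Induct RDR algorithm itself: the function names, the feature names (`has_ip`, `url_length`) and the toy values are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_nominal(values, labels):
    """Information gain of a multi-way split on each distinct nominal value."""
    total = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [y for x, y in zip(values, labels) if x == v]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def gain_numeric(values, labels):
    """Best (gain, threshold) over binary splits of the form value <= threshold.

    Candidate thresholds are midpoints between consecutive distinct values.
    """
    total = len(labels)
    base = entropy(labels)
    best = (0.0, None)
    distinct = sorted(set(values))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        gain = base - (len(left) / total * entropy(left)
                       + len(right) / total * entropy(right))
        if gain > best[0]:
            best = (gain, t)
    return best


# Hypothetical toy data: four websites with one nominal and one numeric feature.
labels = ["phish", "phish", "legit", "legit"]
has_ip = ["yes", "yes", "no", "no"]      # nominal attribute (assumed)
url_length = [90, 75, 20, 30]            # numeric attribute (assumed)

print(gain_nominal(has_ip, labels))
print(gain_numeric(url_length, labels))
```

In this sketch a nominal attribute is split once per distinct value, whereas a numeric attribute is reduced to the single threshold with the highest gain, so both kinds of attribute can be compared on the same scale when a rule condition is chosen.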

1 Introduction

The accelerating growth of Internet-based finance has increased online fraud in which malicious people try to obtain sensitive information from Internet users, a practice known as phishing. Phishing detection has received great attention, but research has so far yielded limited overall success because of the nature of the problem. Detecting phishing websites is complex and hard to analyze because technical and social problems are intertwined [1]. Both machine learning techniques and human expert systems have been applied to acquire and maintain knowledge for phishing website detection and prediction, but the results have not been significant. A large number of knowledge-based systems have been built for acquiring and maintaining the knowledge for detecting and predicting phishing websites. Phishing website detection knowledge was originally acquired from domain experts. However, acquiring knowledge from an expert at a slow pace cannot meet the demand of expanding systems, since a sophisticated expert system may require an extremely large number of rules. This has led to machine learning based approaches as a solution for managing knowledge-based systems. Although machine learning techniques can acquire knowledge from data without the help of a domain expert, an abundance of classifier models exists, and decision tree based algorithms provide the best performance, over-generalization and over-fitting are still significant problems when sufficient training data are not available, so that there are not enough patterns for machine learning to find. Therefore, large effort usually has to be undertaken to cover t

© Springer International Publishing Switzerland 2016. R. Booth and M.-L. Zhang (Eds.): PRICAI 2016, LNAI 9810, pp. 80–92, 2016. DOI: 10.1007/978-3-319-42911-3_7
