A heuristic technique to detect phishing websites using TWSVM classifier


ORIGINAL ARTICLE

Routhu Srinivasa Rao1 • Alwyn Roshan Pais2 • Pritam Anand3

Received: 22 January 2019 / Accepted: 8 September 2020
© Springer-Verlag London Ltd., part of Springer Nature 2020

Abstract
Phishing websites are on the rise, and many are hosted on compromised domains so that legitimate behavior is embedded in the phishing site to evade detection. Traditional heuristic techniques that rely on HTTPS, search engine results, page ranking and WHOIS information may fail to detect phishing sites hosted on a compromised domain. Moreover, list-based techniques fail to detect phishing sites when the target website is not in the whitelist. In this paper, we propose a novel heuristic technique using TWSVM that detects both maliciously registered phishing sites and phishing sites hosted on compromised servers, and thereby overcomes the aforementioned limitations. Our technique detects phishing websites hosted on compromised domains by comparing the log-in page and the home page of the visited website. Hyperlink and URL-based features are used to detect phishing sites that are maliciously registered. We used different versions of support vector machines (SVMs) to classify phishing websites and found that the twin support vector machine (TWSVM) classifier outperformed the other versions, with an accuracy of 98.05% and a recall of 98.33%.

Keywords Phishing • Anti-phishing • Compromised server • Heuristic • SVM • TWSVM
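The comparison of a log-in page with the home page of the same site, mentioned in the abstract, can be illustrated with a minimal sketch. The abstract does not say how the two pages are compared, so everything here is an assumption made for illustration: the choice of Jaccard similarity over the host names that each page links to, and the hypothetical helper names `LinkCollector`, `link_hosts`, `jaccard` and `login_home_similarity`. It is not the authors' feature set.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def link_hosts(html, base_url):
    """Return the set of host names that the page's hyperlinks point to."""
    collector = LinkCollector()
    collector.feed(html)
    hosts = set()
    for href in collector.hrefs:
        # Resolve relative links against the page URL before reading the host.
        host = urlparse(urljoin(base_url, href)).hostname
        if host:
            hosts.add(host.lower())
    return hosts


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def login_home_similarity(login_html, login_url, home_html, home_url):
    """Assumed illustrative measure: overlap of linked hosts on the two pages."""
    return jaccard(link_hosts(login_html, login_url),
                   link_hosts(home_html, home_url))
```

Under this assumed measure, a log-in page planted on a compromised server would tend to link to hosts that the genuine home page does not, giving a low score. A score like this could serve as one input feature to a classifier; the features actually used are defined later in the paper.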

1 Routhu Srinivasa Rao: Department of CSE, GMR Institute of Technology, Rajam, Andhra Pradesh 532127, India
2 Alwyn Roshan Pais: Information Security Research Lab, Department of Computer Science and Engineering, National Institute of Technology, Surathkal, Karnataka 575025, India
3 Pritam Anand: Faculty of Mathematics and Computer Science, South Asian University, New Delhi 110021, India

1 Introduction
Phishing is an attack that deceives online users with a fake website imitating a trusted one. Once the fake site is visited, a script embedded in it may steal sensitive information such as the user name, password, credit card number or bank account number. Payment services, financial transactions, webmail and similar services are now mostly conducted online and are therefore the main targets of phishers. Phishing has been on the rise in recent years and is very hard to defend against because it exploits weaknesses of the human mind. According to the APWG 2016 fourth-quarter report [3], 1,220,523 phishing attacks were recorded in 2016, the highest number in any year since the APWG began monitoring in 2004. The RSA 2013 online fraud report [43] estimates losses of over USD 5.9 billion from 450,000 phishing attacks. Another report from Kaspersky Lab revealed that their anti-phishing system was triggered 154,957,897 times on th