A Comparative Analysis of Various Spam Classifications



Abstract Bandwidth, time, and storage space are the three major assets in the computational world. Spam emails affect all three and thus degrade the overall efficiency of the system. Spammers keep using new tricks and traps to land these frivolous mails in our inboxes. To make mailboxes more intelligent, our effort is to devise a new algorithm that classifies emails in a smarter and more efficient way. This paper analyzes various spam classification techniques and thereby puts forward a new way of classifying spam emails. It thoroughly compares the results that various authors obtained while simulating their architectures. Our classification approach works efficiently and more accurately on datasets of varied length and type during the training and testing phases. We tried to minimize the error ratio and increase classifier efficiency by applying the genetic algorithm concept.

Keywords Spam classification ⋅ Spam email ⋅ Unsolicited ⋅ Logistic regression ⋅ Genetic algorithm ⋅ Machine learning ⋅ Feature set

1 Introduction

Unsolicited bulk emails, or junk emails, are frivolous mails sent in bulk to advertise [1], proliferate viruses, hack mailboxes [1], cheat somebody, or play a prank. Because emails can be sent to millions of recipients without incurring any cost, the spam traffic between MTAs delays the delivery of legitimate mails [2]. Spam occupies nearly two-thirds of our mailboxes [1], thereby causing inefficient utilization of storage space, bandwidth, and time [1].

N.F. Shah (✉) ⋅ P. Kumar
Department of Computer Science & Engineering, Birla Institute of Technology, Mesra, Ranchi 835215, India
e-mail: [email protected]

P. Kumar
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2018
P.K. Sa et al. (eds.), Progress in Intelligent Computing Techniques: Theory, Practice, and Applications, Advances in Intelligent Systems and Computing 519, DOI 10.1007/978-981-10-3376-6_29


To keep spammers at bay, there are many spam filtering techniques that are robust enough to detect a spam mail. Some of them use a knowledge engineering (KE) based approach, while the majority follow the machine learning (ML) approach [3]. The latter is a more robust and intelligent way of classifying emails.

The former uses stored procedures or rules to classify emails. It may keep a stored dictionary of words such as BUY, SPAM, Lottery, Offer, Prize, and Reward, and it periodically updates this dictionary to adapt to new trends [4]. This practice is not very efficient, however, because once the dictionary or repository of words is set, it is not feasible to update it constantly at every end-user site.

In comparison to KE, the machine learning (ML) approach is an intelligent way of filtering spam. ML does not have predefined rules or procedures. It can adapt itself to user needs, so ML is based on user adaptability.

Our research will be based on the analytical approaches put forth by various researchers. We will thoroughly analyze their approaches and results, thereby devise
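The knowledge-engineering (KE) approach described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the tokenization rule, and the threshold of two matches are assumptions made for illustration, and only the dictionary words are taken from the examples in the text.

```python
import re

# Stored dictionary of suspicious words, taken from the examples in the text.
# A real KE filter would hold many more entries and update them periodically.
SPAM_WORDS = {"buy", "spam", "lottery", "offer", "prize", "reward"}


def count_spam_words(message: str) -> int:
    """Count how many tokens of the message appear in the stored dictionary."""
    # Assumed tokenization: lowercase the text and keep runs of ASCII letters.
    tokens = re.findall(r"[a-z]+", message.lower())
    return sum(1 for token in tokens if token in SPAM_WORDS)


def is_spam_by_rules(message: str, threshold: int = 2) -> bool:
    """Flag a message as spam when enough dictionary words occur in it.

    The threshold is a hypothetical parameter, not a value from the paper.
    """
    return count_spam_words(message) >= threshold


if __name__ == "__main__":
    print(is_spam_by_rules("Claim your PRIZE: a lottery reward awaits"))
    print(is_spam_by_rules("Meeting moved to 3 pm, see agenda attached"))
    # An obfuscated spelling slips past the fixed dictionary.
    print(count_spam_words("Special 0ffer inside"))
```

The last call hints at the weakness noted in the text: a fixed dictionary misses spellings it has never seen, so it has to be updated by hand, whereas an ML classifier is retrained on new examples.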
