A Social Spam Detection Framework via Semi-supervised Learning

Abstract. With the increasing popularity of social networking websites such as Twitter, Facebook, Sina Weibo and MySpace, spammers on these platforms are becoming more and more rampant. Social spammers typically create large numbers of compromised or fake accounts to deceive users and lead them to malicious websites that contain illegal, pornographic or dangerous information. Most studies on social spam detection are based on supervised machine learning, which requires large annotated datasets. Unfortunately, labeling large datasets manually is a complex, error-prone and tedious task that costs considerable human effort and time. In this paper, we propose a novel semi-supervised classification framework for social spam detection, which combines co-training with k-medoids. First, we use the k-medoids clustering algorithm to select informative and representative samples for labeling as our initial seed set. Then we use the content features and behavior features of users in our co-training classification framework. To illustrate the effectiveness of k-medoids, we compare its performance with a random selection strategy. Finally, we evaluate the effectiveness of the proposed detection framework against several classical supervised algorithms.
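The two steps outlined in the abstract (seed selection by k-medoids, then co-training over content and behavior features) can be illustrated with a small sketch. This is not the authors' implementation: the toy feature values, the use of the medoids themselves as the seed set, the nearest-centroid base classifier, and every function and parameter name below are assumptions made only for illustration.

```python
import random

def dist(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def k_medoids(points, k, iters=20, seed=0):
    """Plain alternating k-medoids; returns the indices of the k medoids."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)
    for _ in range(iters):
        # Assign every point to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: dist(p, points[m]))
            clusters[nearest].append(i)
        # Re-pick each medoid as the member with the smallest total distance.
        new = []
        for m, members in clusters.items():
            members = members or [m]  # guard against an empty cluster
            new.append(min(members, key=lambda c: sum(
                dist(points[c], points[j]) for j in members)))
        if set(new) == set(medoids):
            break
        medoids = new
    return sorted(medoids)

def centroid_fit(X, y):
    """Hypothetical base learner: one mean vector per class label."""
    cents = {}
    for label in set(y):
        rows = [x for x, l in zip(X, y) if l == label]
        cents[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return cents

def centroid_predict(cents, x):
    """Return (label, margin); a larger margin means a more confident guess."""
    ranked = sorted(cents, key=lambda l: dist(x, cents[l]))
    if len(ranked) == 1:
        return ranked[0], 0.0
    return ranked[0], dist(x, cents[ranked[1]]) - dist(x, cents[ranked[0]])

def co_train(content, behavior, seed_labels, rounds=5, per_round=2):
    """Each view labels its most confident unlabeled samples for the shared pool."""
    labels = dict(seed_labels)
    for _ in range(rounds):
        if len(labels) == len(content):
            break
        for view in (content, behavior):
            idx = list(labels)
            model = centroid_fit([view[i] for i in idx], [labels[i] for i in idx])
            scored = [(centroid_predict(model, view[i]), i)
                      for i in range(len(view)) if i not in labels]
            scored.sort(key=lambda s: s[0][1], reverse=True)
            for (label, _), i in scored[:per_round]:
                labels[i] = label
    return labels

# Toy data (invented): accounts 0-3 behave like spammers, 4-7 like normal users.
content = [[0.9, 0.8], [0.85, 0.9], [0.95, 0.85], [0.8, 0.75],
           [0.1, 0.05], [0.15, 0.1], [0.05, 0.1], [0.2, 0.15]]
behavior = [[9.0, 50.0], [8.5, 60.0], [9.5, 55.0], [8.0, 45.0],
            [1.0, 5.0], [1.2, 4.0], [0.8, 6.0], [1.1, 3.0]]
truth = [1, 1, 1, 1, 0, 0, 0, 0]  # stands in for the manual annotator

# Step 1: cluster on the concatenated features and label only the medoids.
seeds = k_medoids([c + b for c, b in zip(content, behavior)], k=2)
seed_labels = {i: truth[i] for i in seeds}

# Step 2: grow the labeled set with co-training over the two views.
final = co_train(content, behavior, seed_labels)
print(seeds, final)
```

In this sketch only the two medoid accounts are ever labeled by hand; every other label comes from whichever view is most confident in a given round. Real features would need scaling before clustering, and a practical system would use a stronger base classifier than nearest centroid.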

Keywords: Semi-supervised learning · Social spam · Co-training · k-medoids

1 Introduction

Social networking websites such as Twitter, Facebook, Sina Weibo and MySpace have gained more and more popularity and attention around the world in recent years. Twitter, a microblogging site, is growing faster than any other social networking site and allows users to post their latest updates and share messages of no more than 140 characters, known as tweets. Users can communicate and stay in touch with their friends by exchanging tweets. Accompanying the popularity of social networks, spam, as a persistent byproduct, threatens users and social networking websites in different forms and definitions. For example, spammers often use Twitter as a tool to post malicious links, send unsolicited messages to legitimate users, and hijack trending topics. Social spammers typically create large numbers of compromised or fake accounts to deceive users and lead them to malicious websites that contain illegal, pornographic or dangerous information. Sometimes spammers disguise themselves as normal users and imitate the behavior of legitimate users. Spammers of this kind are hard to discover and very harmful to legitimate users and social networks. As a countermeasure, Twitter has its own detection methods and rules against spam and abuse. Users who

This work was supported by National Science Foundation of China (No. 61272374, 61300190) and 863 Project (No. 2015AA015463).

© Springer International Publishing Switzerland 2016. H. Cao et al. (Eds.): PAKDD 2016 Workshops, LNAI 9794, pp. 214–226, 2016. DOI: 10.1007/978-3-319-42996-0_18
