Discovering Opinion Spammer Groups by Network Footprints


Abstract. Online reviews are an important source for consumers to evaluate products/services on the Internet (e.g. Amazon, Yelp, etc.). However, more and more fraudulent reviewers write fake reviews to mislead users. To maximize their impact and share effort, many spam attacks are organized as campaigns, by a group of spammers. In this paper, we propose a new two-step method to discover spammer groups and their targeted products. First, we introduce NFS (Network Footprint Score), a new measure that quantifies the likelihood of products being spam campaign targets. Second, we carefully devise GroupStrainer to cluster spammers on a 2-hop subgraph induced by top ranking products. We demonstrate the efficiency and effectiveness of our approach on both synthetic and real-world datasets from two different domains with millions of products and reviewers. Moreover, we discover interesting strategies that spammers employ through case studies of our detected groups.

Keywords: Opinion spam · Spammer groups · Spam detection · Graph anomaly detection · Efficient hierarchical clustering · Network footprints
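To make the second step concrete, the following is a minimal sketch of how a 2-hop subgraph could be extracted from a reviewer-product bipartite graph once a set of top-ranking products is available. It is an illustration only, not the authors' implementation: the function name `two_hop_subgraph`, the `(reviewer, product)` pair representation, and the reading of "2-hop" as seed products, then their reviewers, then every product those reviewers reviewed, are all assumptions. The NFS ranking that would select the seed products is not shown.

```python
from collections import defaultdict


def two_hop_subgraph(reviews, top_products):
    """Return {reviewer: set of products} for the 2-hop neighbourhood of the seeds.

    reviews      -- iterable of (reviewer, product) pairs (hypothetical input format)
    top_products -- seed products, e.g. those ranked highest by some suspiciousness score
    """
    reviews = list(reviews)  # materialise so the data can be scanned twice
    seeds = set(top_products)

    # Hop 1: reviewers who reviewed at least one seed product.
    reviewers = {reviewer for reviewer, product in reviews if product in seeds}

    # Hop 2: every product reviewed by those reviewers (the seeds included).
    adjacency = defaultdict(set)
    for reviewer, product in reviews:
        if reviewer in reviewers:
            adjacency[reviewer].add(product)
    return dict(adjacency)


# Toy data: u1 and u2 reviewed the seed p1; u3 never touched it.
toy_reviews = [("u1", "p1"), ("u1", "p2"), ("u2", "p1"), ("u3", "p3")]
subgraph = two_hop_subgraph(toy_reviews, {"p1"})
```

On the toy data, `u3` is excluded because it reviewed no seed product, while `p2` is included because `u1`, a reviewer of the seed, also reviewed it. A clustering step would then operate on this reduced graph rather than on the full review network.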

1 Introduction

Online reviews of products and services are an increasingly important source of information for consumers. They are valuable since, unlike advertisements, they reflect the testimonials of other, “real” consumers. While many positive reviews can increase the revenue of a business, negative reviews can cause substantial loss. As a result of such financial incentives, opinion spam has become a critical issue [17], where fraudulent reviewers fabricate spam reviews to unjustly promote or demote (e.g., under competition) certain products and businesses. Opinion spam is surprisingly prevalent; one-third of consumer reviews on the Internet¹ and more than 20% of reviews on Yelp² are estimated to be fake. Despite being widespread, opinion spam remains a mostly open and challenging problem for at least two main reasons: (1) humans are incapable of distinguishing fake reviews based on text [25], which renders manual labeling extremely difficult and hence supervised methods inapplicable, and (2) fraudulent reviewers are often professionals, paid by businesses to write detailed and genuine-looking reviews. Since the seminal work by Jindal and Liu [17], opinion spam has been the focus of research for the last 7-8 years (Section 5). Most existing work aims to detect individual spam reviews [12,17,20,21,25,26] or spammers [1,11,13,18,22,26]. However, fraud/spam is often a collective act, where the involved individuals cooperate in groups to execute spam campaigns. This way, they can increase total impact (i.e., dominate the sentiments towards target products via floodi

¹ http://www.nytimes.com/2012/08/26/business/book-reviewers-for-hire-meet-a-demand-for-online-raves.html
² http://www.businessinsider.com/20-percent-of-yelp-reviews-fake-2013-9

J. Ye and L. Akoglu
© Springer International Publishing Switzerland 2015. A. Appice et al. (Eds.): ECML PKDD 2015, Part I, LNAI 9284, pp. 267–282, 2015. DOI: 10.1007/978-3-319-23528-8_17
