A semi-supervised model for Persian rumor verification based on content information


Zoleikha Jahanbakhsh-Nagadeh 1 & Mohammad-Reza Feizi-Derakhshi 2 & Arash Sharifi 1

Received: 20 February 2020 / Revised: 2 September 2020 / Accepted: 13 October 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

A rumor is a collective attempt to interpret a vague but attractive situation through the power of words. In social networks, false rumors may differ significantly from true rumors in their contextual characteristics at the lexical, syntactic, and semantic levels. This study therefore presents BERT-SAWS, a semi-supervised learning model for early verification of Persian rumors that examines content-based and context features from three views: Contextual Word Embeddings (CWE), speech act, and Writing Style (WS). The model is built by loading pre-trained Bidirectional Encoder Representations from Transformers (BERT) as an unsupervised language representation, fine-tuning it on a small Persian rumor dataset, and combining it with a supervised learning model to provide an enriched text representation of the rumor content. This representation gives the model a better understanding of rumor language and lets it verify rumors better than baseline models for two reasons: (i) it enables early rumor verification by focusing on content-based and context-based features of the source rumor; (ii) it overcomes the shortage of training data for deep neural networks by loading pre-trained BERT, fine-tuning it on the Persian rumor dataset, and combining it with speech act and WS-based features. Empirical results on Twitter and Telegram datasets show that BERT-SAWS improves classifier performance by 2% to 18%, which indicates that speech act and WS features are helpful alongside semantic contextual vectors in the rumor verification task.

Keywords Rumor verification · BERT · Speech act · Writing style · Persian rumor classification · Contextual features · Neural language model · Natural language processing
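The three-view representation described in the abstract can be illustrated with a minimal sketch. It only shows the general idea of concatenating a contextual embedding with speech-act and writing-style features; it is not the authors' implementation. The function names, the speech-act label set (Searle's five classes, used here as a stand-in), and the toy writing-style features are all assumptions, and the short `cwe` vector stands in for a sentence vector that a fine-tuned BERT would produce.

```python
from typing import List, Sequence

# Assumed label set (Searle's five speech-act classes); the paper's own
# inventory may differ.
SPEECH_ACTS = ["assertive", "directive", "commissive", "expressive", "declarative"]


def one_hot(label: str, labels: Sequence[str]) -> List[float]:
    """Encode a categorical label as a one-hot vector."""
    if label not in labels:
        raise ValueError(f"unknown label: {label!r}")
    return [1.0 if candidate == label else 0.0 for candidate in labels]


def writing_style_features(text: str) -> List[float]:
    """Toy writing-style features (hypothetical, for illustration only):
    token count, exclamation marks, question marks, mean token length."""
    tokens = text.split()
    n_tokens = len(tokens)
    n_exclaim = text.count("!")
    # Count both the Latin and the Persian/Arabic question mark.
    n_question = text.count("?") + text.count("؟")
    mean_len = sum(len(t) for t in tokens) / n_tokens if n_tokens else 0.0
    return [float(n_tokens), float(n_exclaim), float(n_question), mean_len]


def fuse_views(cwe: Sequence[float], speech_act: str, text: str) -> List[float]:
    """Concatenate the three views into one feature vector for a classifier."""
    return list(cwe) + one_hot(speech_act, SPEECH_ACTS) + writing_style_features(text)


# Stand-in for a BERT sentence embedding (a real one is much longer).
cwe = [0.1, -0.3, 0.7]
fused = fuse_views(cwe, "assertive", "Is this true? Share now!")
print(len(fused))  # 12 = 3 embedding dims + 5 speech-act dims + 4 style dims
```

The fused vector would then be passed to any supervised classifier; simple concatenation is only one possible way to combine the views.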

* Mohammad-Reza Feizi-Derakhshi [email protected]

1 Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran

2 ComInSys Laboratory, Department of Computer Engineering, University of Tabriz, Tabriz, Iran

Multimedia Tools and Applications

1 Introduction

Nowadays, social networks have become the main source of news for people. Anyone with a smartphone can broadcast the events around them in real time as text, pictures, or videos, so news spreads in cyberspace before television cameras reach the scene. These posts are called rumors. The term rumor is as attractive as it is negative, so that any listener involuntarily joins the stream spreading it. An attractive or hot topic with a deceptive title is enough; in that case, a huge number of tweets, Viber messages, and Telegram posts will be pu