Analyzing temporal patterns of topic diversity using graph clustering
Takako Hashimoto1 · David Lawrence Shepard2 · Tetsuji Kuboyama3 · Kilho Shin3 · Ryota Kobayashi4,5 · Takeaki Uno6
Accepted: 11 September 2020 © The Author(s) 2020
Abstract
During a disaster, social media can be a source of both help and danger: it can spread rumors, and officials involved in disaster mitigation must react quickly when rumors spread. In this paper, we investigate how topic diversity (i.e., how homogeneous the opinions within a topic are) depends on the truthfulness of a topic (whether it is a rumor or a non-rumor) and how topic diversity changes over time after a disaster. To do so, we develop a method for quantifying the topic diversity of tweet data based on text content. The proposed method clusters a tweet graph using Data polishing, which automatically determines the number of subtopics. We perform a case study of tweets posted after the East Japan Great Earthquake on March 11, 2011. We find that, during diffusion, rumor topics exhibit more homogeneous opinions than non-rumor topics. Furthermore, we evaluate the performance of our method and demonstrate that it improves data-processing runtime over existing methods.
Keywords Social media analysis · Topic extraction · Graph clustering · Community detection · Data polishing
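To make the idea of counting subtopics concrete, the following is a minimal sketch and not the authors' implementation. It assumes tweets are compared by Jaccard similarity of their word sets, and it substitutes plain connected components of the thresholded similarity graph for Data polishing, which is not reproduced here. The function names, the threshold value, and the sample tweets are all hypothetical and invented for illustration.

```python
from itertools import combinations

# Invented sample tweets, for illustration only (not data from the paper).
SAMPLE_TWEETS = [
    "cosmo oil tank explosion releases harmful rain",
    "harmful rain after cosmo oil tank explosion",
    "cosmo oil says the fire is under control",
    "the fire at cosmo oil is under control",
    "train service resumes tomorrow",
]


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two word sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def topic_diversity(tweets, keyword, threshold=0.3):
    """Count subtopics among the tweets that mention `keyword`.

    Assumption: a subtopic is a connected component of the graph whose
    edges join tweets with Jaccard similarity >= threshold. This is a
    stand-in for the Data polishing clustering named in the abstract.
    """
    # Topic extraction: keep only tweets that mention the keyword.
    docs = [set(t.lower().split()) for t in tweets if keyword.lower() in t.lower()]

    # Union-find over tweet indices to track connected components.
    parent = list(range(len(docs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Add an edge (merge components) for every sufficiently similar pair.
    for i, j in combinations(range(len(docs)), 2):
        if jaccard(docs[i], docs[j]) >= threshold:
            parent[find(i)] = find(j)

    # Topic diversity = number of distinct components (subtopics).
    return len({find(i) for i in range(len(docs))})


if __name__ == "__main__":
    print(topic_diversity(SAMPLE_TWEETS, "cosmo"))
```

In this toy input, the four tweets mentioning the keyword fall into two groups (an explosion story and an under-control story), so a lower count would indicate more homogeneous content within the topic. The pairwise comparison is quadratic in the number of tweets and is meant only to illustrate the definition, not to scale.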
1 Introduction
After the East Japan Great Earthquake on 11 March 2011, Twitter users reacted quickly and discussed a variety of topics, both real and imaginary. One example is the rumor about an explosion at a petrochemical complex owned by Cosmo Oil Co., Ltd. Stories of oil tanks exploding and releasing harmful substances into the air caused widespread panic until an official government announcement was released the following day. Social media has the potential to be a source of both help and trouble during
* Ryota Kobayashi. Extended author information available on the last page of the article
disasters. Constructing strategies for disaster mitigation therefore requires addressing the issues that arise from social media as well.
Analysis and modeling of the popularity dynamics of online content has been an active area of research [1–11]. A popular method for extracting a topic is to collect all the tweets that mention a specific word (keyword) or hashtag and analyze their temporal patterns [1, 2, 7, 10]. While this approach makes it easy to detect the emergence of topics, one can often intuitively identify various “subtopics” within a single topic, and the diversity of the content may vary greatly from topic to topic. We focus on subtopics obtained by clustering the extracted tweet data related to a keyword (i.e., a topic). We study topic diversity, defined as the number of subtopics in a topic (discussed in more detail in Sect. 3.3). Topic models, including Latent Dirichlet Allocation (LDA) [12], are popular methods for discovering an abstract “topic” from a documents and s