Extracting representative user subset of social networks towards user characteristics and topological features
Yiming Zhou1 · Yuehui Han1 · An Liu1 · Zhixu Li1 · Hongzhi Yin2 · Wei Chen1 · Lei Zhao1
Received: 22 March 2019 / Revised: 28 May 2020 / Accepted: 1 June 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract
Extracting a subset of representative users from the original user set of a social network plays a critical role in social network analysis. Among existing studies, some focus on preserving users’ characteristics when sampling representative users, while others focus on preserving the topological structure. However, both users’ characteristics and the network topology carry abundant information about users, so it is critical to preserve both when extracting the representative user subset. To this end, we formulate the task as the RUS (Representative User Subset) problem and prove that it is NP-hard. To solve the RUS problem, we propose two approaches, KS (K-Selected) and an optimized method, ACS, each of which consists of a clustering algorithm and a sampling model; the sampling model is solved with a greedy heuristic algorithm. In addition, we propose a pruning strategy that takes advantage of the MaxHeap structure. To validate the performance of the proposed approaches, extensive experiments are conducted on two real-world datasets. The results demonstrate that our methods outperform state-of-the-art approaches.
Keywords Representative · Social networks · Characteristics · Topological features
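The abstract mentions a greedy heuristic with MaxHeap-based pruning, but this excerpt does not define the objective being optimized. As a rough illustration only, the Python sketch below shows a generic lazy greedy selection over a simple coverage objective, in which a heap of stale gains lets the loop skip re-evaluating most candidates. The function name lazy_greedy_select, the covers input, and the coverage objective are all assumptions made for illustration; this is not the KS or ACS method of the paper.

```python
import heapq


def lazy_greedy_select(covers, k):
    """Greedily pick up to k users by marginal coverage gain (illustrative only).

    covers: dict mapping each candidate user to the set of users it is
            assumed to "represent" (a hypothetical stand-in for the
            representativeness measure, which this excerpt does not define).
    Returns the selected users in pick order and the set they cover.
    """
    covered = set()
    selected = []
    # heapq is a min-heap, so gains are negated to obtain max-heap behaviour.
    heap = [(-len(reach), user) for user, reach in covers.items()]
    heapq.heapify(heap)

    while heap and len(selected) < k:
        _, user = heapq.heappop(heap)
        # The stored gain may be stale; recompute it against the current cover.
        fresh_gain = len(covers[user] - covered)
        if fresh_gain == 0:
            continue  # this candidate adds nothing new, so drop it
        # Pruning step: stored gains are upper bounds for a coverage objective,
        # so if the refreshed gain still matches or beats the best remaining
        # bound, no other candidate needs to be re-evaluated this round.
        if not heap or fresh_gain >= -heap[0][0]:
            selected.append(user)
            covered |= covers[user]
        else:
            heapq.heappush(heap, (-fresh_gain, user))

    return selected, covered
```

Calling lazy_greedy_select with a dictionary of candidate users and k = 2 returns the two candidates with the largest successive marginal gains. The pruning relies on gains never increasing as the covered set grows, which holds for set coverage but would have to be checked for any other objective.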
World Wide Web
This article belongs to the Topical Collection: Special Issue on Web Information Systems Engineering 2018. Guest Editors: Hakim Hacid, Wojciech Cellary, Hua Wang and Yanchun Zhang
Wei Chen [email protected]
Lei Zhao [email protected]
Extended author information available on the last page of the article.

1 Introduction
In social networks such as Twitter, Facebook, and Sina Weibo, a large number of users post tweets, spread viewpoints, and share ideas [35]. Due to the diversity of users’ characteristics and the sheer size of the networks, it is intractable and impractical to analyze the users and the network in their entirety [17]. Therefore, if a representative user subset retaining as much of the original dataset’s information as possible could be selected, the original dataset would become more “human-readable” and easier to analyze. For example, when conducting surveys and collecting feedback in Human-Computer Interaction, it is important to select representative users, as they have a high representative degree and are far fewer than the users in the original set [27]. Compared with directly analyzing all users, it is more effective and efficient to study a small subset of representative users that reflects the behaviors and preferences of the original dataset [8, 37]. Much effort has been devoted to extracting a small subset of users from social net