A Scalable Data Mining Model for Social Media Influencer Identification




1 Ramrao Adik Institute of Technology, Mumbai, India
2 Department of Computer Engineering, Lokmanya Tilak College of Engineering (Affiliated to University of Mumbai), Navi Mumbai, India [email protected]
3 Department of Computer Engineering, Pillai’s HOC College of Engineering (Affiliated to University of Mumbai), Rasayani, India [email protected]

Abstract. Social network mining is a growing research area that combines fields such as machine learning, graph theory, parallel algorithms, data mining, and optimization to address problems such as behavior analysis, detection of interacting groups, influencer identification, and information diffusion in a social network. This paper addresses one of these problems: influencer identification in social networks. We present a data mining modelling approach for a Twitter network that predicts which of a given pair of users is the more influential; the approach can be scaled to the entire network. A data mining model is used to score the test data and predict the more influential user in each pair. This modelling approach can potentially support marketing and sales strategies in which the influencer is motivated to diffuse information or new ideas.

Keywords: Data mining · Influencers · Social network mining · Decision tree · Logistic regression

1 Introduction

Data mining is the process of studying data with hidden patterns of behavior, analyzing those patterns, and using them to produce significant information. This information can then be used for predictive and descriptive analytics. To construct a data mining model, we need to uncover the characteristics of the dataset, create a model, and deploy it [1].

Social network analysis (SNA) is the process of analyzing the behavior of, and the interactions between, different entities in social networks. SNA has great potential for evaluating questions such as the likelihood that a particular community will grow, the probability that a node is influenced by another node, and the probability that a node acts as an influencer [1, 2]. The main aim of SNA is to explain the dependencies between the attributes of related nodes and to predict attributes such as link probabilities and node behavior in a given social network [1].

© Springer Nature Singapore Pte Ltd. 2016 A. Unal et al. (Eds.): SmartCom 2016, CCIS 628, pp. 625–631, 2016. DOI: 10.1007/978-981-10-3433-6_75
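As a rough illustration of the pairwise task described in the abstract (predicting which of two users is the more influential), the sketch below fits a small logistic regression on the difference of log-scaled per-user counts. It is a minimal sketch under stated assumptions, not the model used in the paper: the three per-user counts (for example followers, mentions, and retweets), the synthetic labels, and all function names are hypothetical and chosen only for illustration.

```python
import math
import random


def pair_features(user_a, user_b):
    """Difference of log-scaled counts for a pair of users (hypothetical features)."""
    return [math.log1p(a) - math.log1p(b) for a, b in zip(user_a, user_b)]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights by batch gradient descent on the log-loss (no intercept term)."""
    w = [0.0] * len(X[0])
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict_a_more_influential(w, user_a, user_b):
    """Score a pair: estimated probability that user A is the more influential."""
    x = pair_features(user_a, user_b)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)))


def make_synthetic_pairs(n, seed=0):
    """Generate synthetic pairs; the labelling rule is an assumption, not real data."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        a = [rng.randint(0, 10000) for _ in range(3)]
        b = [rng.randint(0, 10000) for _ in range(3)]
        x = pair_features(a, b)
        # Hypothetical ground truth: a fixed weighted sum of the feature differences.
        y.append(1 if 0.6 * x[0] + 0.3 * x[1] + 0.1 * x[2] > 0 else 0)
        X.append(x)
    return X, y


X, y = make_synthetic_pairs(200)
w = train_logistic(X, y)

# Training accuracy on the synthetic pairs.
hits = sum(
    (sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) > 0.5) == bool(yi)
    for xi, yi in zip(X, y)
)
accuracy = hits / len(X)

p_ab = predict_a_more_influential(w, [5000, 5000, 5000], [50, 50, 50])
p_ba = predict_a_more_influential(w, [50, 50, 50], [5000, 5000, 5000])
print(f"training accuracy: {accuracy:.2f}, P(A > B): {p_ab:.3f}, P(B > A): {p_ba:.3f}")
```

Omitting the intercept and using feature differences makes the score antisymmetric, so swapping the two users in a pair yields the complementary probability. Scaling to a whole network would then amount to scoring many pairs with the same fitted weights.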


J. Sunil More and C. Lingam

In this context, social influence can be defined in terms of conformity, the act of manipulating attitudes and behaviors, or dominance over a peer group. Herbert Kelman [3] identified three broad varieties of social influence: compliance, identification, and internalization. Social media influencer marketing [4] is the strategy of identifying influential people and motivating them to build a new customer pool for the owners, irrespective of the size of the influencer's audience. The influencer can reach consumers more e