Diffusion logistic regression algorithms over multiagent networks
Control Theory and Technology http://link.springer.com/journal/11768
Yan DU 1, Lijuan JIA 1†, Shunshoku KANAE 2, Zijiang YANG 3
1. School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China;
2. Department of Medical Engineering, Faculty of Health Science, Junshin Gakune University, Fukuoka, Japan;
3. Department of Intelligent Systems Engineering, Ibaraki University, 4-12-1 Nakanarusawa, Hitachi, Ibaraki 316-8511, Japan
Received 10 January 2020; revised 28 April 2020; accepted 28 April 2020
Abstract  In this paper, a distributed scheme is proposed for the bagging ensemble learning method, which addresses classification problems on large datasets by developing a group of cooperative logistic regression learners over a connected network. Moreover, each weak learner (agent) shares its local weight vector with its immediate neighbors through a diffusion strategy in a fully distributed manner. Compared with the non-cooperative mode, the proposed diffusion logistic regression algorithms effectively avoid overfitting and achieve high classification accuracy. Furthermore, simulations on a real dataset demonstrate the effectiveness of the proposed methods in comparison with the centralized approach.
Keywords: Logistic regression, bagging, diffusion strategy, connected network
DOI: https://doi.org/10.1007/s11768-020-0009-2
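The abstract's core idea, in which each agent adapts its local logistic regression weights and then combines them with its neighbors' estimates, can be illustrated with a minimal adapt-then-combine (ATC) diffusion sketch. This is an illustrative reading of the general diffusion strategy, not the paper's exact algorithm; the function names, the uniform combination matrix, and the step-size choice below are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def diffusion_lr(X_parts, y_parts, A, mu=0.5, n_iter=300):
    """Adapt-then-combine (ATC) diffusion logistic regression sketch.

    X_parts[k], y_parts[k] : local training data held by agent k
    A : N x N combination matrix; A[l, k] is the weight agent k
        assigns to neighbor l (each column sums to 1)
    """
    N = len(X_parts)
    d = X_parts[0].shape[1]
    W = np.zeros((N, d))  # one weight vector per agent
    for _ in range(n_iter):
        # adapt: each agent takes a gradient step on its own local data
        psi = np.empty_like(W)
        for k in range(N):
            Xk, yk = X_parts[k], y_parts[k]
            grad = Xk.T @ (sigmoid(Xk @ W[k]) - yk) / len(yk)
            psi[k] = W[k] - mu * grad
        # combine: each agent mixes its neighbors' intermediate estimates
        W = A.T @ psi
    return W
```

With a doubly stochastic combination matrix the agents' estimates are driven toward a common weight vector, which is what allows the cooperative mode to outperform isolated (non-cooperative) learners.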
1 Introduction
Logistic regression (LR), a classic machine learning technique, has been extensively applied for decades to classification problems in medical diagnosis [1, 2] and disease prediction [3, 4]. In many practical applications, the large size of the dataset makes the learning method computationally expensive, resulting in low efficiency. Therefore, effort has been made to create an ensemble of multiple weak learners that together form a better global learner. Here, "weak" means that the learner need not possess high accuracy and can be trained quickly. Moreover, the overall learning can be carried out by parallel processing. Several solutions have been proposed to implement parallel processing of multiple learners on large datasets. In [5], a distribution preserving kernel support vector machine (DiP-SVM) method is proposed to solve
† Corresponding author. E-mail: [email protected]. This work was supported in part by the National Natural Science Foundation of China (No. 41927801).
© 2020 South China University of Technology, Academy of Mathematics and Systems Science, CAS and Springer-Verlag GmbH Germany, part of Springer Nature
Y. Du et al. / Control Theory Tech, Vol. 18, No. 2, pp. 160–167, May 2020
classification problems for big data, which splits the dataset into manageably sized "partitions" and trains a support vector machine on each partition separately to obtain local support vectors. The global support vector is then obtained by processing the partition support vectors in parallel. Another parallel s
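The partition-then-aggregate idea behind DiP-SVM can be sketched with logistic regression weak learners, the base model used in this paper. This is a generic bagging-style sketch under stated assumptions (gradient-descent training, majority-vote aggregation, and helper names such as `train_local` are illustrative), not the DiP-SVM algorithm itself.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_local(X, y, mu=0.5, n_iter=300):
    """Train one weak logistic regression learner by gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        w -= mu * X.T @ (sigmoid(X @ w) - y) / len(y)
    return w

def train_partitions(X_parts, y_parts):
    # each partition trains independently, so this loop is trivially parallel
    return [train_local(Xk, yk) for Xk, yk in zip(X_parts, y_parts)]

def bagged_predict(weights, X):
    """Aggregate the weak learners' hard decisions by majority vote."""
    votes = np.stack([(X @ w > 0).astype(float) for w in weights])
    return (votes.mean(axis=0) > 0.5).astype(float)
```

Because the per-partition training steps never touch each other's data, they can be dispatched to separate workers; only the final vote (or, in the diffusion setting, the neighbor-to-neighbor weight exchange) requires communication.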