Completion of multiview missing data based on multi-manifold regularised non-negative matrix factorisation



Sun Jing‑Tao1,2 · Zhang Qiu‑Yu3

© Springer Nature B.V. 2020

Abstract

In multi-source data analysis, missing data values or attributes are inevitable because of environmental and other influencing factors, and they cause a loss of the knowledge that the data are meant to convey. To solve the problem of missing data in multi-source data analysis, this paper proposes a completion method for multiview missing data based on multi-manifold regularised non-negative matrix factorisation. The method assumes that the multiview data are consistent and adopts a multi-manifold regularised non-negative matrix factorisation algorithm to obtain a homogeneous manifold and a global clustering. On this basis, a multiview synergistic discrimination model is built on the non-missing views, with reference to the Gaussian mixture model, to pre-label the cluster to which the incremental missing data belong. Using the consistency of the views in the low-dimensional space, a prediction model for the missing data in the specified view is established with multiple linear regression, which achieves accurate data completion when multiple attributes are missing. Data-filling models are built with three methods for handling missing values, namely CMMD-MNMF, FIMUS and Hot deck, and their completion, clustering and classification performance is analysed in simulation experiments on data sets including UCI, Flower17 and Flower102. The results verify that the proposed multiview missing-data completion method is effective.

Keywords  Multi-source data analysis · Multiview clustering · Missing data completion · Multi-manifold regularised · Non-negative matrix factorisation
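The factorisation step underlying the method can be illustrated with a minimal sketch of plain non-negative matrix factorisation using the standard Lee-Seung multiplicative updates. This is not the authors' algorithm: the multi-manifold regularisation term and the multiview consensus are omitted, and the function names (`nmf`, `frobenius_error`), the rank, the iteration count and the random initialisation are all assumptions made for illustration. Rows of `X` are taken to be samples, so each row of `W` is a low-dimensional representation of one sample.

```python
import random

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]

def frobenius_error(X, W, H):
    """Squared Frobenius norm of X - WH."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry))

def nmf(X, rank, iters=200, seed=0, eps=1e-9):
    """Plain NMF, X ~ WH, via multiplicative updates (no manifold term)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n, m = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, X)
        den = matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(X, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H

# Toy non-negative data: 4 samples, 3 attributes (made up for this example).
X = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 0.0, 1.0], [3.0, 2.0, 5.0]]
W, H = nmf(X, rank=2)
print(frobenius_error(X, W, H))
```

Because the updates only multiply by non-negative ratios, `W` and `H` stay non-negative throughout, and the reconstruction error does not increase from one iteration to the next. In a completion setting, the rows of `W` would serve as the shared low-dimensional representation from which a missing view is predicted.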

* Sun Jing‑Tao [email protected]

1 School of Computer Science and Technology, Xi’an University of Posts and Telecommunications, Xi’an 710121, Shaanxi, China

2 Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi’an University of Posts and Telecommunications, Xi’an 710121, Shaanxi, China

3 School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, Gansu, China




1 Introduction

With the rapid development of Internet of Things and big data technology, increasingly large and complex data are being collected by various applications. With their 5V features (namely, volume, velocity, variety, value and veracity), the data are multi-source and polymorphic, which makes it possible to reveal different attributes of things from different perspectives (Liu et al. 2017; Malo et al. 2019). Take news reports as an example. Such data can be obtained from several news websites of different styles, written in different languages in different countries, and they can also take the form of video, audio, pictures or other media. In medical diagnoses, to determine the c