Flexible data representation with graph convolution for semi-supervised learning


ORIGINAL ARTICLE

Flexible data representation with graph convolution for semi-supervised learning

Fadi Dornaika (1, 2)

Received: 24 July 2020 / Accepted: 26 October 2020
© Springer-Verlag London Ltd., part of Springer Nature 2020

Abstract

This paper introduces a scheme for semi-supervised data representation. It proposes a flexible nonlinear embedding model that imitates the principle of spectral graph convolutions. Structured data are exploited to determine both nonlinear and linear models. The scheme takes advantage of data graphs at two levels. First, it incorporates manifold regularization, which is naturally encoded by the graph itself. Second, the regression model is built on convolved data samples, obtained by the joint use of the data and their associated graph. The proposed semi-supervised embedding can tackle challenges related to over-fitting in image data spaces. It also paves the way to new theoretical and application perspectives on nonlinear embedding: building flexible models that adopt convolved data samples can enhance both the data representation and the final performance of the learning system. Experiments on six image datasets compare the introduced scheme with many state-of-the-art semi-supervised approaches and show the effectiveness of the introduced data representation scheme.

Keywords Graph-based embedding · Semi-supervised learning · Graph convolutions · Discriminant embedding · Pattern recognition
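The two uses of the graph described in the abstract (manifold regularization, and regression on convolved samples) can be illustrated with a small NumPy sketch. This is not the paper's model: it assumes a binary k-nearest-neighbour graph, a GCN-style normalized propagation step with self-loops, and a purely linear regression with a closed-form solution, whereas the paper describes a flexible nonlinear embedding. The names `knn_graph`, `convolve`, `fit_embedding`, `lam`, and `mu` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def knn_graph(X, k=3):
    """Symmetric binary k-nearest-neighbour adjacency (an assumed graph choice)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between the rows of X.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)               # a sample is not its own neighbour
    nearest = np.argsort(d2, axis=1)[:, :k]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(A, A.T)                  # symmetrize

def convolve(X, A):
    """One propagation step D^{-1/2} (A + I) D^{-1/2} X (assumed normalization)."""
    A_hat = A + np.eye(A.shape[0])             # add self-loops
    d = A_hat.sum(axis=1)
    S = A_hat / np.sqrt(np.outer(d, d))
    return S @ X

def fit_embedding(X, A, Y, labeled, lam=1.0, mu=1e-2):
    """Linear map W from convolved samples to label scores.

    Minimizes ||Xc[labeled] W - Y||^2 + lam * tr(F^T L F) + mu * ||W||^2,
    where Xc are the convolved samples, F = Xc W, and L is the
    unnormalized graph Laplacian (the manifold regularization term).
    """
    Xc = convolve(X, A)
    L = np.diag(A.sum(axis=1)) - A
    Xl = Xc[labeled]
    M = Xl.T @ Xl + lam * Xc.T @ L @ Xc + mu * np.eye(X.shape[1])
    W = np.linalg.solve(M, Xl.T @ Y)           # closed-form minimizer
    return W, Xc

# Toy usage: two Gaussian blobs with two labeled samples per class.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
truth = np.repeat([0, 1], 20)
labeled = np.array([0, 1, 20, 21])
Y = np.eye(2)[truth[labeled]]                  # one-hot targets for labeled rows
A = knn_graph(X, k=3)
W, Xc = fit_embedding(X, A, Y, labeled)
pred = (Xc @ W).argmax(axis=1)
print("accuracy on all samples:", (pred == truth).mean())
```

In this sketch the graph enters twice, mirroring the abstract: once through `convolve`, which replaces each sample by a weighted average over its neighbourhood before regression, and once through the Laplacian term weighted by `lam`, which penalizes score differences between connected samples.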

Fadi Dornaika ([email protected])
(1) University of the Basque Country (UPV/EHU), San Sebastián, Spain
(2) IKERBASQUE, Basque Foundation for Science, Bilbao, Spain

1 Introduction

Nowadays, artificial intelligence is increasingly used in applications that employ digital data. Such data can be of several types, such as images, signals, and networks. It is therefore necessary to address the related issues so that a machine can assist humans in manipulating these data. Many real-world applications can benefit from machine learning tools. For example, class prevalence estimation [14], data clustering [18, 24, 25], and recommendation systems in dynamic contexts [28] can be seen as typical machine learning tasks. When data are structured, many learning algorithms can exploit graphs that are associated with the data themselves [28, 52]. Graph-based approaches attempt to take advantage of the data structure in discovering the learning model [7, 45, 53]. In addition, these methods exploit the pairwise similarities between samples, whether labeled or unlabeled, allowing better data representation for semi-supervised learning. They also discover the manifold structure of the data in the latent subspace [48, 54] and can reduce the dimensionality of the data. When data are collected, very often the associated labels are not provided (e.g., the semantic labels of a given image are not stored)