METHODOLOGIES AND APPLICATION

Non-convex low-rank representation combined with rank-one matrix sum for subspace clustering

Xiaofang Liu² · Jun Wang¹ · Dansong Cheng¹ · Daming Shi³ · Yongqiang Zhang¹

© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract
Exploring the multiple subspace structures of data, as low-rank representation methods do, is effective for subspace clustering. Non-convex low-rank representation (NLRR) via matrix factorization is one of the state-of-the-art techniques for subspace clustering. However, NLRR cannot scale to problems with large n (the number of samples), as it requires either the inversion of an n × n matrix or the solution of an n × n linear system. To address this issue, we propose a novel approach, NLRR++, which reformulates NLRR as a sum of rank-one components and applies column-wise block coordinate descent to update each component iteratively. NLRR++ reduces the time complexity per iteration from O(n³) to O(mnd) and the memory complexity from O(n²) to O(mn), where m is the dimensionality and d is the target rank (usually d ≪ m ≪ n). Our experimental results on simulations and real datasets demonstrate the efficiency and effectiveness of NLRR++: it is not only much faster than NLRR, but also scales to large datasets such as ImageNet with 120K samples.

Keywords Subspace clustering · Non-convex low-rank representation · Block coordinate descent · Rank-one matrix
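To make the rank-one reformulation concrete, the sketch below (our illustration, not the authors' code) applies the same column-wise block coordinate descent idea to a plain factorization X ≈ UVᵀ written as a sum of d rank-one matrices; the full NLRR++ solver additionally handles the self-expressive term X = XZ + E and noise, but the per-sweep O(mnd) time and O(mn) memory pattern is the same. The function and variable names are ours and purely illustrative.

```python
import numpy as np

def rank_one_sum_bcd(X, d=10, n_sweeps=20, eps=1e-12):
    """Approximate X (m x n) by a sum of d rank-one matrices
    sum_i u_i v_i^T, cycling over the components and updating each
    (u_i, v_i) pair in closed form against the current residual.
    Each pair update costs O(mn), so one full sweep is O(mnd) time
    and O(mn) memory; no n x n matrix is ever formed or inverted.
    Simplified sketch, not the exact NLRR++ update rules."""
    m, n = X.shape
    rng = np.random.default_rng(0)
    U = rng.standard_normal((m, d)) / np.sqrt(d)
    V = rng.standard_normal((n, d)) / np.sqrt(d)
    R = X - U @ V.T  # global residual, m x n, maintained incrementally
    for _ in range(n_sweeps):
        for i in range(d):
            # put the i-th component's contribution back into the residual
            R += np.outer(U[:, i], V[:, i])
            # alternating closed-form least-squares updates for the pair
            U[:, i] = R @ V[:, i] / max(V[:, i] @ V[:, i], eps)
            V[:, i] = R.T @ U[:, i] / max(U[:, i] @ U[:, i], eps)
            # subtract the refreshed component
            R -= np.outer(U[:, i], V[:, i])
    return U, V
```

The residual-maintenance trick is the key design choice: because only one rank-one term changes at a time, the residual can be patched with two outer products instead of recomputing X − UVᵀ from scratch, which is what keeps each sweep at O(mnd).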

Communicated by V. Loia.

✉ Dansong Cheng
[email protected]

¹ School of Computer Science and Technology, Harbin Institute of Technology, 92 West Dazhi Street, Nan Gang District, Harbin, People's Republic of China

² School of Electrical Engineering and Automation, Harbin Institute of Technology, 92 West Dazhi Street, Nan Gang District, Harbin, People's Republic of China

³ College of Computer and Software, Shenzhen University, 3688 Nanhai Ave, Nanshan District, Shenzhen, People's Republic of China

1 Introduction

Many data mining and machine learning applications involve high-dimensional data, such as images, videos and documents (Wang et al. 2019; Zhang et al. 2019; Tang and Wei 2019; Ding et al. 2018; Fan et al. 2016; Du et al. 2017, 2018). In practice, such high-dimensional datasets can often be well approximated by multiple low-dimensional subspaces corresponding to multiple classes or categories (Fahmi et al. 2018; Wang et al. 2018; Amin et al. 2019; Tang et al. 2019). Over the past decades, subspace clustering, as one of the most important machine learning techniques, has been widely used in computer vision (Wang et al. 2016; Zhou et al. 2013) and machine learning (Bian et al. 2017; Du et al. 2017, 2016; Jia et al. 2017; Cheng et al. 2013). For instance, low-rank representation (LRR) (Liu et al. 2013) and sparse subspace clustering (SSC) (Elhamifar and Vidal 2013) aim to recover a low-rank structure of multiple subspaces that fits high-dimensional data. However, both LRR and SSC require O(n³) time complexity, so they cannot scale to problems with large n (the number of samples).
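For reference, LRR recovers the representation matrix by solving the convex program of Liu et al. (2013):

```latex
\min_{Z,E} \; \|Z\|_{*} + \lambda \|E\|_{2,1}
\quad \text{s.t.} \quad X = XZ + E
```

Here ‖Z‖∗ is the nuclear norm of the n × n representation matrix Z and the ℓ₂,₁ norm on E models sample-specific corruptions. Because Z is n × n, typical solvers repeatedly perform singular value thresholding on n × n matrices, which is where the O(n³) per-iteration cost arises.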