Beyond Covariance: SICE and Kernel Based Visual Feature Representation


Jianjia Zhang1,2 · Lei Wang3 · Luping Zhou4 · Wanqing Li3

Received: 9 May 2019 / Accepted: 21 August 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract The past several years have witnessed increasing research interest in covariance-based feature representation. Originally proposed as a region descriptor, the covariance matrix is now used as a general representation in various recognition tasks, where it demonstrates promising performance. However, the covariance matrix has inherent shortcomings: singularity when samples are few, limited capability to model complicated feature relationships, and a single, fixed form of representation. To achieve better recognition performance, this paper argues that more capable and flexible symmetric positive definite (SPD)-matrix-based representations should be explored, and attempts this by exploiting prior knowledge of the data and nonlinear representation. Specifically, to better handle a small number of feature vectors and high feature dimensionality, we propose to exploit the structure sparsity of visual features and exemplify the sparse inverse covariance estimate (SICE) as a new feature representation. Furthermore, to effectively model complicated feature relationships, we propose to directly compute a kernel matrix over feature dimensions, leading to a robust, flexible and open framework of SPD-matrix-based representation. Through theoretical analysis and experimental study, the two proposed representations demonstrate their advantages over the covariance counterpart in skeletal human action recognition, image set classification and object classification tasks.

Keywords Covariance matrix · Structure sparsity · Sparse inverse covariance estimate · Kernel matrix · Visual representation

Electronic supplementary material The online version of this article (https://doi.org/10.1007/s11263-020-01376-1) contains supplementary material, which is available to authorized users.


Lei Wang [email protected] https://sites.google.com/view/lei-hs-wang
Jianjia Zhang [email protected]
Luping Zhou [email protected]
Wanqing Li [email protected]

1 School of Biomedical Engineering, Sun Yat-sen University, Shenzhen 518107, Guangdong, China
2 School of Computer Science, University of Technology Sydney, Sydney, NSW 2007, Australia
3 School of Computing and Information Technology, University of Wollongong, Wollongong, NSW 2522, Australia
4 School of Electrical and Information Engineering, The University of Sydney, Sydney, NSW 2006, Australia

1 Introduction

As a fundamental mathematical concept, the covariance matrix has long been used in many areas of computer vision. Computed from a set of feature vectors, the covariance matrix characterises the variance of each feature and the statistical relationship between different features. By applying this property to visual feature representation, a seminal work (Tuzel et al. 2006) published more than a decade ago proposes