Deep Self-correlation Descriptor for Dense Cross-Modal Correspondence



1 Yonsei University, Seoul, South Korea
  {srkim89,khsohn}@yonsei.ac.kr
2 Chungnam National University, Daejeon, South Korea
  [email protected]
3 Microsoft Research, Beijing, China
  [email protected]

Abstract. We present a novel descriptor, called deep self-correlation (DSC), designed for establishing dense correspondences between images taken under different imaging modalities, such as different spectral ranges or lighting conditions. Motivated by local self-similarity (LSS), we formulate a novel descriptor by leveraging LSS in a deep architecture, leading to better discriminative power and greater robustness to non-rigid image deformations than state-of-the-art descriptors. The DSC first computes self-correlation surfaces over a local support window for randomly sampled patches, and then builds hierarchical self-correlation surfaces by performing an average pooling within a deep architecture. Finally, the feature responses on the self-correlation surfaces are encoded through a spatial pyramid pooling in a circular configuration. In contrast to convolutional neural network (CNN) based descriptors, the DSC is training-free, is robust to cross-modal imaging, and can be densely computed in an efficient manner that significantly reduces computational redundancy. The state-of-the-art performance of DSC on challenging cases of cross-modal image pairs is demonstrated through extensive experiments.

Keywords: Cross-modal correspondence · Deep architecture · Self-correlation · Local self-similarity · Non-rigid deformation
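As a rough illustration of the pipeline described in the abstract, the sketch below computes a self-correlation surface for one sampled patch, applies one level of hierarchical average pooling, and encodes the pooled surface with a circular (ring-and-sector) spatial pooling layout. All function names, window sizes, and the use of normalized cross-correlation here are illustrative assumptions; the paper's exact patch sampling, correlation measure, and pooling parameters differ.

```python
import numpy as np

def self_correlation_surface(img, center, patch=3, support=7):
    """Correlate the patch at `center` with every same-size patch inside a
    (support x support) local window, using normalized cross-correlation
    (an assumed similarity measure for this sketch)."""
    cy, cx = center
    r, s = patch // 2, support // 2
    ref = img[cy - r:cy + r + 1, cx - r:cx + r + 1].ravel()
    ref = (ref - ref.mean()) / (ref.std() + 1e-8)
    surf = np.zeros((support, support))
    for dy in range(-s, s + 1):
        for dx in range(-s, s + 1):
            y, x = cy + dy, cx + dx
            p = img[y - r:y + r + 1, x - r:x + r + 1].ravel()
            p = (p - p.mean()) / (p.std() + 1e-8)
            surf[dy + s, dx + s] = np.dot(ref, p) / ref.size  # NCC in [-1, 1]
    return surf

def average_pool(surf, k=2):
    """One level of the hierarchical average pooling over the surface."""
    h, w = (surf.shape[0] // k) * k, (surf.shape[1] // k) * k
    return surf[:h, :w].reshape(h // k, k, w // k, k).mean(axis=(1, 3))

def circular_pool(surf, n_rings=2, n_sectors=4):
    """Pool responses into ring/sector bins arranged around the surface
    center, mimicking a spatial pyramid in a circular configuration."""
    h, w = surf.shape
    cy, cx = (h - 1) / 2, (w - 1) / 2
    ys, xs = np.mgrid[0:h, 0:w]
    rad = np.hypot(ys - cy, xs - cx)
    ang = np.arctan2(ys - cy, xs - cx) % (2 * np.pi)
    r_edges = np.linspace(0, rad.max() + 1e-8, n_rings + 1)
    feats = []
    for i in range(n_rings):
        for j in range(n_sectors):
            mask = ((rad >= r_edges[i]) & (rad < r_edges[i + 1]) &
                    (ang >= 2 * np.pi * j / n_sectors) &
                    (ang < 2 * np.pi * (j + 1) / n_sectors))
            feats.append(surf[mask].mean() if mask.any() else 0.0)
    return np.array(feats)
```

A dense descriptor in this spirit would concatenate such pooled vectors over many randomly sampled patches around each pixel, which is what allows redundant computation to be shared across neighboring positions.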

1 Introduction

This work was done while Seungryong Kim was an intern at Microsoft Research.
© Springer International Publishing AG 2016
B. Leibe et al. (Eds.): ECCV 2016, Part VIII, LNCS 9912, pp. 679–695, 2016.
DOI: 10.1007/978-3-319-46484-8_41

In many computer vision and computational photography applications, images captured under different imaging modalities are used to supplement the data provided in color images. Typical examples of other imaging modalities include near-infrared [1–3] and dark flash [4] photography. More broadly, photos taken under different imaging conditions, such as different exposure settings [5], blur levels [6,7], and illumination [8], can also be considered as cross-modal [9,10]. Establishing dense correspondences between cross-modal image pairs is essential for combining their disparate information. Although powerful global optimizers may help to improve the accuracy of correspondence estimation to some

[Figure: three panels plotting matching cost against a search range of −15 to 15, with curves for SIFT, CNN, DASC, and DSC and the ground-truth position marked in each panel.]

Fig. 1. Examples of matching cost profiles, computed with different descriptors along the scan lines of A, B, and C for image pairs under severe non-rigid deformations and illumination changes. Unlike other descriptors, DSC yields reliable global minima.

extent [11,12], they face inherent limitations