Deep Matching Prior: Test-Time Optimization for Dense Correspondence
Sunghwan Hong
Seungryong Kim

Korea University


[Paper]
[Poster]


Intuition


Intuition of DMP: (a) optimization-based methods [45, 68, 37] that formulate their objective function with data and prior terms, and minimize the energy function on a single image pair, (b) learning-based methods [49, 71, 63] that learn a matching prior from a large training set of image pairs, and (c) our DMP, which takes the best of both approaches to estimate an image pair-specific matching prior.


Abstract

Conventional techniques for establishing dense correspondences across visually or semantically similar images have focused on designing a task-specific matching prior, which is difficult to model in general. To overcome this, recent learning-based methods attempt to learn a good matching prior within the model itself from large training data. The performance improvement is apparent, but the need for sufficient training data and intensive learning hinders their applicability. Moreover, using a fixed model at test time ignores the fact that each pair of images may require its own prior, thus providing limited performance and poor generalization to unseen images. In this paper, we show that an image pair-specific prior can be captured by solely optimizing untrained matching networks on an input pair of images. Tailored for such test-time optimization for dense correspondence, we present a residual matching network and a confidence-aware contrastive loss to guarantee meaningful convergence. Experiments demonstrate that our framework, dubbed Deep Matching Prior (DMP), is competitive with, or even outperforms, the latest learning-based methods on several benchmarks for geometric and semantic matching, even though it requires neither large training data nor intensive learning. With the networks pre-trained, DMP attains state-of-the-art performance on all benchmarks.
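To make the test-time optimization idea concrete, below is a minimal sketch of optimizing a matching network on a single image pair. The module names, loss function, and hyperparameters are hypothetical placeholders, not the authors' released code; the actual DMP architecture and confidence-aware loss are described in the paper.

```python
# Minimal sketch: per-pair test-time optimization of an (untrained) matching network.
# `matcher` and `loss_fn` are placeholders for the paper's residual matching network
# and confidence-aware contrastive loss, respectively.
import torch

def optimize_on_pair(source, target, matcher, loss_fn, steps=300, lr=1e-4):
    optimizer = torch.optim.Adam(matcher.parameters(), lr=lr)
    for _ in range(steps):
        flow, confidence = matcher(source, target)       # dense flow + match confidence
        loss = loss_fn(source, target, flow, confidence)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    with torch.no_grad():                                 # final prediction after optimization
        flow, _ = matcher(source, target)
    return flow
```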


Overall network architecture


Figure 1. Network configuration and loss function of DMP: Our networks consist of feature extraction and matching networks, which are formulated in a residual manner to guarantee a good initialization for the optimization. Note that a single-level version of the networks is illustrated for brevity, while the full model is formulated in a pyramidal fashion. A confidence-aware contrastive loss enables joint learning of the feature extraction and matching networks by rejecting ambiguous matches while accepting confident matches through thresholding.
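The sketch below illustrates one plausible form of such a confidence-aware, contrastive-style objective: matches whose confidence exceeds a threshold are pulled together, while the remaining (ambiguous) matches are pushed apart. The function name, tensor shapes, and threshold are assumptions for illustration; the paper's exact formulation differs.

```python
# Hedged sketch of a confidence-thresholded contrastive-style loss over warped
# source features and target features (shapes and threshold are illustrative).
import torch
import torch.nn.functional as F

def confidence_aware_contrastive_loss(feat_src_warped, feat_tgt, confidence, tau=0.5):
    # feat_*: (B, C, H, W) feature maps; confidence: (B, 1, H, W) in [0, 1]
    sim = F.cosine_similarity(feat_src_warped, feat_tgt, dim=1)  # (B, H, W)
    conf = confidence.squeeze(1)
    pos_mask = (conf > tau).float()                  # accept confident matches
    neg_mask = 1.0 - pos_mask                        # reject ambiguous matches
    pos_term = (1.0 - sim) * pos_mask                # pull confident matches together
    neg_term = torch.clamp(sim, min=0.0) * neg_mask  # push ambiguous ones apart
    return (pos_term + neg_term).mean()
```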


Visualization


Six example pairs: for each, the source image, the target image, and the source image warped to the target using DMP correspondences.


Experimental Results


Table 1. Quantitative evaluation on the HPatches [3] dataset in terms of AEE and PCK. Lower AEE and higher PCK (5-pixel, %) are better. Pre-train: pre-training, Test-opt.: test-time optimization.


Figure 2. Qualitative results on the HPatches benchmark [3]. (a) Source and (b) target images, and warped source images using the correspondences of (c) GLU-Net [71], (d) DMP, (e) DMP†, (f) RANSAC-DMP, and (g) ground truth. Here, we show only samples with extremely large geometric variations to compare the outputs produced by each variant and GLU-Net. Note that DMP, starting from an untrained network, achieves competitive results against GLU-Net trained on a large-scale dataset. Thanks to RANSAC [14], RANSAC-DMP starts the optimization from a good initialization, which results in highly accurate flow fields.
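As an illustration of how RANSAC can supply such an initialization, the sketch below estimates a coarse homography between the source and target with standard OpenCV calls; this warp could then serve as the starting point for per-pair optimization. This is an assumed pipeline for illustration, not the authors' released implementation.

```python
# Hedged sketch: estimate a coarse RANSAC homography to initialize the warp
# before test-time optimization (standard OpenCV SIFT + ratio test + RANSAC).
import cv2
import numpy as np

def ransac_homography(src_img, tgt_img, ratio=0.75):
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(src_img, None)
    kp2, des2 = sift.detectAndCompute(tgt_img, None)
    matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
    good = [m for m, n in matches if m.distance < ratio * n.distance]
    src_pts = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    tgt_pts = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src_pts, tgt_pts, cv2.RANSAC, 5.0)
    return H  # coarse warp used as the initialization for the optimization
```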


Figure 3. Qualitative results on the PF-PASCAL [19] benchmark. (a) Source image, (b) target image, (c) DMP, (d) A-DMP, (e) DMP†, and (f) DMP†-ResN.


Paper and Supplementary Material

S. Hong, S. Kim
Deep Matching Prior: Test-Time Optimization for Dense Correspondence
In ICCV, 2021.
(hosted on arXiv)

[Bibtex]


Acknowledgements

This research was supported by the MSIT, Korea, under the ICT Creative Consilience program (IITP-2021-2020-0-01819) and Regional Strategic Industry Convergence Security Core Talent Training Business (IITP-2019-0-01343) supervised by the IITP and National Research Foundation of Korea (NRF-2021R1C1C1006897).