FCSS: Fully Convolutional Self-Similarity for Dense Semantic Correspondence
- Subject (keywords) Dense semantic correspondence, convolutional neural networks, self-similarity, weakly-supervised learning
- Subject (other) Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic
- Description (general) [Kim, Seungryong; Ham, Bumsub; Sohn, Kwanghoon] Yonsei Univ, Sch Elect & Elect Engn, Seoul 120749, South Korea; [Min, Dongbo] Ewha Womans Univ, Dept Comp Sci & Engn, Seoul 03760, South Korea; [Lin, Stephen] Microsoft Res, Beijing 100080, Peoples R China
- Administrative information faculty
- Indexed in SCIE, SCOPUS
- OA type Green Submitted
- Publisher IEEE COMPUTER SOC
- Publication year 2019
- URI http://www.dcollection.net/handler/ewha/000000160524
- Language English
- Published As http://dx.doi.org/10.1109/TPAMI.2018.2803169
- PubMed https://pubmed.ncbi.nlm.nih.gov/29993476
Abstract
We present a descriptor, called fully convolutional self-similarity (FCSS), for dense semantic correspondence. Unlike traditional dense correspondence approaches for estimating depth or optical flow, semantic correspondence estimation poses additional challenges due to intra-class appearance and shape variations among different instances within the same object or scene category. To robustly match points across semantically similar images, we formulate FCSS using local self-similarity (LSS), which is inherently insensitive to intra-class appearance variations. LSS is incorporated through a proposed convolutional self-similarity (CSS) layer, where the sampling patterns and the self-similarity measure are jointly learned in an end-to-end and multi-scale manner. Furthermore, to address shape variations among different object instances, we propose a convolutional affine transformer (CAT) layer that estimates explicit affine transformation fields at each pixel to transform the sampling patterns and corresponding receptive fields. As training data for semantic correspondence is rather limited, we propose to leverage object candidate priors provided in most existing datasets and also correspondence consistency between object pairs to enable weakly-supervised learning. Experiments demonstrate that FCSS significantly outperforms conventional handcrafted descriptors and CNN-based descriptors on various benchmarks.
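The abstract's claim that local self-similarity is insensitive to appearance variations can be illustrated with a small hand-crafted sketch. This is not the FCSS network or its learned CSS layer: the function name `local_self_similarity`, the fixed sampling offsets, the patch radius, and the `sigma` bandwidth are all assumptions chosen for illustration. The sketch compares the patch around one pixel with patches at a few offsets and shows that the resulting descriptor does not change when the image intensities are inverted.

```python
import numpy as np

def local_self_similarity(image, center, offsets, patch_radius=1, sigma=1.0):
    """Hand-crafted self-similarity sketch (hypothetical helper, not FCSS).

    Returns one similarity value per offset: exp(-SSD / sigma) between the
    patch at `center` and the patch at `center + offset`.
    """
    r = patch_radius
    cy, cx = center
    ref = image[cy - r:cy + r + 1, cx - r:cx + r + 1]
    desc = []
    for dy, dx in offsets:
        y, x = cy + dy, cx + dx
        patch = image[y - r:y + r + 1, x - r:x + r + 1]
        ssd = np.sum((ref - patch) ** 2)   # sum of squared differences
        desc.append(np.exp(-ssd / sigma))  # map distance to a similarity in (0, 1]
    return np.array(desc)

# Toy image and a fixed sampling pattern (FCSS learns its pattern instead).
rng = np.random.default_rng(0)
img = rng.random((16, 16))
offsets = [(-3, 0), (3, 0), (0, -3), (0, 3), (-3, -3), (3, 3)]

d_original = local_self_similarity(img, (8, 8), offsets)
d_inverted = local_self_similarity(1.0 - img, (8, 8), offsets)  # inverted intensities
print(np.allclose(d_original, d_inverted))
```

Because the descriptor depends only on differences between patches of the same image, inverting the intensities leaves every squared difference unchanged, so the two descriptors agree up to floating-point error. According to the abstract, FCSS replaces the fixed offsets and the fixed similarity measure used here with ones learned end to end.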