RC-MVSNet

RC-MVSNet: Unsupervised Multi-View Stereo
with Neural Rendering
ECCV 2022

Sabine Süsstrunk²
Matthias Nießner¹

¹Technische Universität München
²École Polytechnique Fédérale de Lausanne
³The Hong Kong University of Science and Technology

Abstract

Finding accurate correspondences among different views is the Achilles' heel of unsupervised Multi-View Stereo (MVS). Existing methods are built upon the assumption that corresponding pixels share similar photometric features. However, multi-view images in real scenarios observe non-Lambertian surfaces and experience occlusions. In this work, we propose a novel approach with neural rendering (RC-MVSNet) to resolve such correspondence ambiguities among views. Specifically, we impose a depth rendering consistency loss to constrain the geometry features close to the object surface to alleviate occlusions. Concurrently, we introduce a reference view synthesis loss to generate consistent supervision, even for non-Lambertian surfaces. Extensive experiments on the DTU and Tanks&Temples benchmarks demonstrate that our approach achieves state-of-the-art performance among unsupervised MVS frameworks and performance competitive with many supervised methods. Our code will be released at RC-MVSNet.

Video

Proposed Method

Overview of our RC-MVSNet. a) The unsupervised backbone, CasMVSNet, predicts an initial depth map through photometric consistency and provides depth priors for the rendering consistency network. b) The rendering consistency network generates an image and a depth map by neural rendering under the guidance of the depth priors. c) The rendered image is supervised by the reference view synthesis loss. d) The rendered depth is supervised by the depth rendering consistency loss.
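The overview above says that an image and a depth map are produced by neural rendering under the guidance of depth priors. This page does not give the exact formulation, so the following is only a minimal sketch of generic NeRF-style compositing along a single ray, not the authors' implementation. The function names `render_ray` and `sample_near_prior`, the uniform band around the prior, and all numeric values are hypothetical and chosen purely for illustration.

```python
import math


def sample_near_prior(depth_prior, half_width, num_samples):
    """Place samples uniformly in a band around a depth prior.

    Hypothetical guidance scheme for illustration; requires num_samples >= 2.
    """
    step = 2.0 * half_width / (num_samples - 1)
    return [depth_prior - half_width + i * step for i in range(num_samples)]


def render_ray(sigmas, colors, depths):
    """Composite per-sample densities and colors along one ray (NeRF-style).

    sigmas: non-negative density at each sample
    colors: (r, g, b) tuple at each sample
    depths: increasing sample depths along the ray
    Returns (rendered_rgb, rendered_depth, weights).
    """
    n = len(depths)
    # Distance between adjacent samples; the last interval reuses the previous one.
    deltas = [depths[i + 1] - depths[i] for i in range(n - 1)]
    deltas.append(deltas[-1] if deltas else 1.0)

    transmittance = 1.0  # fraction of light that has not yet been absorbed
    weights = []
    for sigma, delta in zip(sigmas, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this interval
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha

    rgb = tuple(sum(w * c[k] for w, c in zip(weights, colors)) for k in range(3))
    depth = sum(w * d for w, d in zip(weights, depths))
    return rgb, depth, weights


# Toy ray: empty space in front of an opaque surface located at the prior depth.
depths = sample_near_prior(depth_prior=2.0, half_width=0.2, num_samples=5)
sigmas = [0.0, 0.0, 50.0, 50.0, 50.0]
colors = [(0.8, 0.5, 0.2)] * 5
rgb, depth, weights = render_ray(sigmas, colors, depths)
print(rgb, depth)
```

In this toy setup the weights concentrate on the first dense sample, so the rendered depth lands close to the prior and the rendered color close to the surface color. A rendered image and a rendered depth of this kind are the quantities that a view synthesis loss and a depth consistency loss could then supervise.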

Motivation

In real-world environments, occlusions, reflective and non-Lambertian surfaces, varying camera exposure, and other factors invalidate the photometric consistency assumption.
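As a rough illustration of why a view-dependent highlight breaks photometric consistency, the sketch below compares the same surface point as seen from two views. The toy shading model (`observed_color`), the L1 loss (`photometric_loss`), and all numeric values are assumptions made for this example; they are not taken from the paper.

```python
def photometric_loss(ref_pixel, src_pixel):
    """Mean absolute difference between two RGB pixels."""
    return sum(abs(a - b) for a, b in zip(ref_pixel, src_pixel)) / 3.0


def observed_color(albedo, specular, view_alignment, shininess=20):
    """Toy shading model: diffuse albedo plus a view-dependent highlight.

    view_alignment is the cosine between the viewing direction and the
    mirror-reflection direction (1.0 means looking straight into the highlight).
    """
    highlight = specular * (max(0.0, view_alignment) ** shininess)
    return tuple(min(1.0, c + highlight) for c in albedo)


albedo = (0.4, 0.3, 0.2)

# Lambertian surface: both views observe the same color at the correct match.
lambertian_loss = photometric_loss(
    observed_color(albedo, specular=0.0, view_alignment=1.0),
    observed_color(albedo, specular=0.0, view_alignment=0.7),
)

# Shiny surface: the reference view sees the highlight, the source view barely does.
shiny_loss = photometric_loss(
    observed_color(albedo, specular=0.5, view_alignment=1.0),
    observed_color(albedo, specular=0.5, view_alignment=0.7),
)

print(lambertian_loss, shiny_loss)
```

Even though both pixels correspond to the same 3D point, the shiny case yields a large photometric error, so a purely photometric objective would penalize the correct depth.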

Results on DTU

Point Cloud Reconstruction

Depth Map Inference

Results on Tanks&Temples

Citation

@inproceedings{chang2022rc,
title={RC-MVSNet: Unsupervised Multi-View Stereo with Neural Rendering},
author={Chang, Di and Bo{\v{z}}i{\v{c}}, Alja{\v{z}} and Zhang, Tong and Yan, Qingsong and Chen, Yingcong and S{\"u}sstrunk, Sabine and Nie{\ss}ner, Matthias},
booktitle={Proceedings of the European conference on computer vision (ECCV)},
year={2022}
}

Acknowledgements

This work was completed during Di's Guided Research Praktikum at the Visual Computing & Artificial Intelligence Group (TUM), directed by Prof. Matthias Nießner, and the Summer@EPFL Internship at the Image and Visual Representation Lab (EPFL), directed by Prof. Sabine Süsstrunk. The project is funded by the ERC Starting Grant Scan2CAD (804724), a TUM-IAS Rudolf Mößbauer Fellowship, and the German Research Foundation (DFG) Grant "Making Machine Learning on Static and Dynamic 3D Data Practical".
The website template was borrowed from Zhenxing Mi's GBi-Net.
