Volumetric scene reconstruction from a single image is crucial for a broad range of applications like autonomous driving and robotics. Recent volumetric reconstruction methods achieve impressive results, but generally require expensive 3D ground truth or multi-view supervision. We propose to leverage pre-trained 2D diffusion models and depth prediction models to generate synthetic scene geometry from a single image. This can then be used to distill a feed-forward scene reconstruction model. Our experiments on the challenging KITTI-360 and Waymo datasets demonstrate that our method matches or outperforms state-of-the-art baselines that use multi-view supervision, and offers unique advantages, such as handling dynamic scenes. For more details and code, please check out our project page.
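The distillation idea in the abstract can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: `teacher_pseudo_geometry` is a hypothetical stand-in for the pretrained diffusion and depth models that produce pseudo scene geometry from one image, and the "student" is a deliberately simple linear feed-forward model regressed onto the teacher's output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical teacher: stands in for the pretrained 2D diffusion +
# depth models that turn a single image into pseudo scene geometry.
# Here it is just a fixed linear map producing a small "density" vector.
W_teacher = rng.normal(size=(8, 16)) / 4.0

def teacher_pseudo_geometry(image):
    return W_teacher @ image.ravel()

# Student: a feed-forward reconstruction model to be distilled.
W_student = np.zeros((8, 16))

def student(image):
    return W_student @ image.ravel()

# Distillation loop: regress the student's prediction onto the
# teacher's pseudo-geometry for randomly drawn single images.
lr = 0.02
for step in range(1000):
    img = rng.normal(size=(4, 4))          # toy single "image"
    target = teacher_pseudo_geometry(img)  # synthetic supervision
    pred = student(img)
    # Gradient of 0.5 * ||pred - target||^2 w.r.t. W_student
    grad = np.outer(pred - target, img.ravel())
    W_student -= lr * grad

# After distillation, the student mimics the teacher on unseen images.
test_img = rng.normal(size=(4, 4))
err = np.linalg.norm(student(test_img) - teacher_pseudo_geometry(test_img))
```

At test time only the cheap feed-forward student is evaluated; the expensive teacher models are used solely to generate training targets, which mirrors the supervision scheme the abstract describes.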
inproceedings · BibTeX key: WWM+25
SE3 Labs