10.06.2025
MCML at CVPR 2025
35 Accepted Papers (29 Main and 6 Workshop)
IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, Jun 11-15, 2025
We are happy to announce that MCML researchers have contributed a total of 35 papers to CVPR 2025: 29 Main and 6 Workshop papers. Congrats to our researchers!
Main Track (29 papers)
S. A. Baumann • F. Krause • M. Neumayr • N. Stracke • M. Sevi • V. T. Hu • B. Ommer
Continuous, Subject-Specific Attribute Control in T2I Models by Identifying Semantic Directions.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI GitHub
Q. Bouniot • I. Redko • A. Mallasto • C. Laclau • O. Struckmeier • K. Arndt • M. Heinonen • V. Kyrki • S. Kaski
From Alexnet to Transformers: Measuring the Non-linearity of Deep Neural Networks with Affine Optimal Transport.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI
H. Chen • H. Li • Y. Zhang • G. Zhang • J. Bi • P. Torr • J. Gu • D. Krompass • V. Tresp
FedBiP: Heterogeneous One-Shot Federated Learning with Personalized Latent Diffusion Models.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI
C. Curreli • D. Muhle • A. Saroha • Z. Ye • R. Marin • D. Cremers
Nonisotropic Gaussian Diffusion for Realistic 3D Human Motion Prediction.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI GitHub
Z. Chen • Y. Wang • L. Nan • X. Zhu
Parametric Point Cloud Completion for Polygonal Surface Reconstruction.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI GitHub
S. Dziadzio • V. Udandarao • K. Roth • A. Prabhu • Z. Akata • S. Albanie • M. Bethge
How to Merge Your Multimodal Models Over Time?
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI
T. Dagès • S. Weber • Y.-W. E. Lin • R. Talmon • D. Cremers • M. Lindenbaum • A. M. Bruckstein • R. Kimmel
Finsler Multi-Dimensional Scaling: Manifold Learning for Asymmetric Dimensionality Reduction and Embedding.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI
T. Hannan • M. M. Islam • J. Gu • T. Seidl • G. Bertasius
ReVisionLLM: Recursive Vision-Language Model for Temporal Grounding in Hour-Long Videos.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI GitHub
O. Hahn • C. Reich • N. Araslanov • D. Cremers • C. Rupprecht • S. Roth
Scene-Centric Unsupervised Panoptic Segmentation.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI GitHub
S. Kim • R. Xiao • M.-I. Georgescu • S. Alaniz • Z. Akata
COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI GitHub
T. Liu • Z. Lai • J. Wang • G. Zhang • S. Chen • P. Torr • V. Demberg • V. Tresp • J. Gu
Multimodal Pragmatic Jailbreak on Text-to-image Models.
CVPR 2025 - 2nd Workshop on Responsible Generative AI at IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. Best Paper Award. URL GitHub
W. Li • H. Xu • J. Huang • H. Jung • P. K. Yu • N. Navab • B. Busam
GCE-Pose: Global Context Enhancement for Category-level Object Pose Estimation.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI GitHub
D. Mildenberger • P. Hager • D. Rückert • M. J. Menten
A Tale of Two Classes: Adapting Supervised Contrastive Learning to Binary Imbalanced Datasets.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI
E. Özsoy • C. Pellegrini • T. Czempiel • F. Tristram • K. Yuan • D. Bani-Harouni • U. Eck • B. Busam • M. Keicher • N. Navab
MM-OR: A Large Multimodal Operating Room Dataset for Semantic Understanding of High-Intensity Surgical Environments.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI GitHub
R. Qorbani • G. Villani • T. Panagiotakopoulos • M. B. Colomer • L. Härenstam-Nielsen • M. Segu • P. L. Dovesi • J. Karlgren • D. Cremers • F. Tombari • M. Poggi
Semantic Library Adaptation: LoRA Retrieval and Fusion for Open-Vocabulary Semantic Segmentation.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI
K. Roth • Z. Akata • D. Damen • I. Balažević • O. J. Hénaff
Context-Aware Multimodal Pretraining.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI
P. Roetzer • V. Ehm • D. Cremers • Z. Lähner • F. Bernard
Higher-Order Ratio Cycles for Fast and Globally Optimal Shape Matching.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI
D. Schnaus • N. Araslanov • D. Cremers
It's a (Blind) Match! Towards Vision-Language Correspondence without Parallel Data.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI GitHub
N. Stracke • S. A. Baumann • K. Bauer • F. Fundel • B. Ommer
CleanDIFT: Diffusion Features without Noise.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI
L. Sang • Z. Canfes • D. Cao • R. Marin • F. Bernard • D. Cremers
4Deform: Neural Surface Deformation for Robust Shape Interpolation.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI
J. Schusterbauer • M. Gui • F. Fundel • B. Ommer
Diff2Flow: Training Flow Matching Models via Diffusion Model Alignment.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI
D. Sinitsyn • L. Härenstam-Nielsen • D. Cremers
PRaDA: Projective Radial Distortion Averaging.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI
F. Wimbauer • W. Chen • D. Muhle • C. Rupprecht • D. Cremers
AnyCam: Learning to Recover Camera Poses and Intrinsics from Casual Videos.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI
Y. Xie • V. Ehm • P. Roetzer • N. Amrani • M. Gao • F. Bernard • D. Cremers
EchoMatch: Partial-to-Partial Shape Matching via Correspondence Reflection.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI
R. Xiao • S. Kim • M.-I. Georgescu • Z. Akata • S. Alaniz
FLAIR: VLM with Fine-grained Language-informed Image Representations.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI GitHub
Y. Yeganeh • A. Farshad • I. Charisiadis • M. Hasny • M. Hartenberger • B. Ommer • N. Navab • E. Adeli
Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. Highlight Paper. DOI
Y. Yuan • Y. Xia • D. Cremers • M. Sester
SparseAlign: a Fully Sparse Framework for Cooperative Object Detection.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI
D. Zhu • Y. Di • S. Gavranovic • S. Ilic
SeaLion: Semantic Part-Aware Latent Point Diffusion Models for 3D Generation.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI
G. Zhang • M. L. A. Fok • J. Ma • Y. Xia • D. Cremers • P. Torr • V. Tresp • J. Gu
Localizing Events in Videos with Multimodal Queries.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI
Workshops (6 papers)
L. Bastian • M. Rashed • N. Navab • T. Birdal
Continuous-Time SO(3) Forecasting with Savitzky-Golay Neural Controlled Differential Equations.
4DVision @CVPR 2025 - Workshop on 4D Vision: Modeling the Dynamic World at IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. arXiv
Y. Luo • R. Hoffmann • Y. Xia • O. Wysocki • B. Schwab • T. H. Kolbe • D. Cremers
RADLER: Radar Object Detection Leveraging Semantic 3D City Models and Self-Supervised Radar-Image Learning.
PBVS @CVPR 2025 - 21st IEEE Workshop on Perception Beyond the Visible Spectrum at IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI GitHub
E. Özsoy • F. Holm • C. Pellegrini • T. Czempiel • M. Saleh • N. Navab • B. Busam
Location-Free Scene Graph Generation.
MULA @CVPR 2025 - 8th Multimodal Learning and Applications Workshop at IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI
W. Tang • W. Li • X. Liang • O. Wysocki • F. Biljecki • C. Holst • B. Jutzi
Texture2LoD3: Enabling LoD3 Building Reconstruction With Panoramic Images.
USM3D @CVPR 2025 - 2nd Workshop on Urban Scene Modeling at IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI GitHub
L. Waldmann • A. Shah • Y. Wang • N. Lehmann • A. J. Stewart • Z. Xiong • X. Zhu • S. Bauer • J. Chuang
Panopticon: Advancing Any-Sensor Foundation Models for Earth Observation.
EARTHVISION @CVPR 2025 - Workshop EarthVision: Large Scale Computer Vision for Remote Sensing Imagery at IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI
D. Zverev • T. Wiedemer • A. Prabhu • M. Bethge • W. Brendel • A. S. Koepke
VGGSounder: Audio-Visual Evaluations for Foundation Models.
Sight and Sound @CVPR 2025 - Workshop Sight and Sound at IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. PDF