Research Group Björn Ommer
Björn Ommer heads the Computer Vision & Learning Group at LMU Munich.
His research interests include all aspects of semantic image and video understanding based on (deep) machine learning. His special focus is on generative approaches for visual synthesis (e.g. Stable Diffusion), invertible deep models for explainable AI, deep metric and representation learning, and self-supervised learning paradigms and their interdisciplinary applications in the digital humanities and neurosciences.
Publications @MCML
2026
[40]
Y. Qu • Q. Wang • Y. Mao • V. T. Hu • B. Ommer • X. Ji
Can Prompt Difficulty be Online Predicted for Accelerating RL Finetuning of Reasoning Models?
KDD 2026 - 32nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Jeju Island, Republic of Korea, Aug 09-13, 2026. To be published. Preprint available. URL
[39]
S. A. Baumann • J. Wiese • T. Martorella • M. M. Kalayeh • B. Ommer
Envisioning the Future, One Step at a Time.
CVPR 2026 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Denver, CO, USA, Jun 03-07, 2026. To be published. Preprint available. arXiv URL
[38]
D. Kotovenko • O. Grebenkova • B. Ommer
EDGS: Eliminating Densification for Efficient Convergence of 3DGS.
CVPR 2026 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Denver, CO, USA, Jun 03-07, 2026. To be published. Preprint available. arXiv
[37]
F. Krause • S. A. Baumann • J. Schusterbauer • O. Grebenkova • M. Gui • V. T. Hu • B. Ommer
Guiding Token-Sparse Diffusion Models.
CVPR 2026 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Denver, CO, USA, Jun 03-07, 2026. To be published. Preprint available. arXiv
[36]
J. Schusterbauer • M. Gui • Y. Li • P. Ma • F. Krause • B. Ommer
Denoising, Fast and Slow: Difficulty-Aware Adaptive Sampling for Image Generation.
CVPR 2026 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Denver, CO, USA, Jun 03-07, 2026. To be published. Preprint available. arXiv GitHub
[35]
J. Schusterbauer • J. Wiese • N. Stracke • T. Phan • B. Ommer
Probabilistic Precipitation Nowcasting with Rectified Flow Transformers.
CVPR 2026 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Denver, CO, USA, Jun 03-07, 2026. To be published. Preprint available. URL
[34]
N. Stracke • K. Bauer • S. A. Baumann • M. A. Bautista • J. Susskind • B. Ommer
Learning Long-term Motion Embeddings for Efficient Kinematics Generation.
CVPR 2026 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Denver, CO, USA, Jun 03-07, 2026. To be published. Preprint available. arXiv URL
[33]
M. Gui • J. Schusterbauer • T. Phan • F. Krause • J. Susskind • M. A. Bautista • B. Ommer
Adapting Self-Supervised Representations as a Latent Space for Efficient Generation.
ICLR 2026 - 14th International Conference on Learning Representations. Rio de Janeiro, Brazil, Apr 23-27, 2026. To be published. Preprint available. arXiv
[32]
R.-A. Matişan • V. T. Hu • G. Bartosh • B. Ommer • C. G. M. Snoek • M. Welling • J.-W. van de Meent • M. M. Derakhshani • F. Eijkelboom
Purrception: Variational Flow Matching for Vector-Quantized Image Generation.
ICLR 2026 - 14th International Conference on Learning Representations. Rio de Janeiro, Brazil, Apr 23-27, 2026. To be published. Preprint available. arXiv
[31]
M. Fuest • P. Ma • M. Gui • J. Schusterbauer • V. T. Hu • B. Ommer
Diffusion Models and Representation Learning: A Survey.
IEEE Transactions on Pattern Analysis and Machine Intelligence Early Access. Jan. 2026. DOI GitHub
2025
[30]
T. Ressler-Antal • F. Fundel • M. B. Alaya • S. A. Baumann • F. Krause • M. Gui • B. Ommer
DisMo: Disentangled Motion Representations for Open-World Motion Transfer.
NeurIPS 2025 - 39th Conference on Neural Information Processing Systems. San Diego, CA, USA, Nov 30-Dec 07, 2025. Spotlight Presentation. URL GitHub
[29]
S. A. Baumann • N. Stracke • T. Phan • B. Ommer
What If: Understanding Motion Through Sparse Interactions.
ICCV 2025 - IEEE/CVF International Conference on Computer Vision. Honolulu, Hawai’i, Oct 19-23, 2025. DOI GitHub
[28]
F. Krause • T. Phan • M. Gui • S. A. Baumann • V. T. Hu • B. Ommer
TREAD: Token Routing for Efficient Architecture-agnostic Diffusion Training.
ICCV 2025 - IEEE/CVF International Conference on Computer Vision. Honolulu, Hawai’i, Oct 19-23, 2025. DOI
[27]
P. Ma • M. Gui • J. Schusterbauer • X. Yang • O. Grebenkova • V. T. Hu • B. Ommer
Stochastic Interpolants for Revealing Stylistic Flows across the History of Art.
ICCV 2025 - IEEE/CVF International Conference on Computer Vision. Honolulu, Hawai’i, Oct 19-23, 2025. DOI GitHub
[26]
P. Ma • X. Yang • Y. Li • M. Gui • F. Krause • J. Schusterbauer • B. Ommer
SCFlow: Implicitly Learning Style and Content Disentanglement with Flow Models.
ICCV 2025 - IEEE/CVF International Conference on Computer Vision. Honolulu, Hawai’i, Oct 19-23, 2025. DOI GitHub
[25]
Y. Li • R. Buchert • B. Schmitz-Koep • T. Grimmer • B. Ommer • D. M. Hedderich • I. Yakushev • C. Wachinger
Diffusion Bridge Networks Simulate Clinical-grade PET from MRI for Dementia Diagnostics.
Preprint (Oct. 2025). arXiv GitHub
[24]
C. Brandl • A.-K. Nitschke • F. Egersdoerfer • B. Ommer • M. Weidemüller
A Personalized and Evidence-Based Clinical Decision Support System Using Ensemble Learning.
EMBC 2025 - 47th Annual International Conference of the IEEE Engineering in Medicine and Biology Society. Copenhagen, Denmark, Jul 14-18, 2025. DOI
[23]
S. A. Baumann • F. Krause • M. Neumayr • N. Stracke • M. Sevi • V. T. Hu • B. Ommer
Continuous, Subject-Specific Attribute Control in T2I Models by Identifying Semantic Directions.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI GitHub
[22]
J. Schusterbauer • M. Gui • F. Fundel • B. Ommer
Diff2Flow: Training Flow Matching Models via Diffusion Model Alignment.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI
[21]
N. Stracke • S. A. Baumann • K. Bauer • F. Fundel • B. Ommer
CleanDIFT: Diffusion Features without Noise.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI
[20]
Y. Yeganeh • A. Farshad • I. Charisiadis • M. Hasny • M. Hartenberger • B. Ommer • N. Navab • E. Adeli
Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. Highlight Paper. DOI
[19]
A. Aghdam • V. T. Hu • B. Ommer
ActAlign: Zero-Shot Fine-Grained Video Classification via Language-Guided Sequence Alignment.
Preprint (Jun. 2025). arXiv
[18]
E. Abdelrahman • L. Zhao • V. T. Hu • M. Cord • P. Perez • M. Elhoseiny
ToddlerDiffusion: Interactive Structured Image Generation with Cascaded Schrödinger Bridge.
ICLR 2025 - 13th International Conference on Learning Representations. Singapore, Apr 24-28, 2025. URL GitHub
[17]
F. Fundel • J. Schusterbauer • V. T. Hu • B. Ommer
Distillation of Diffusion Features for Semantic Correspondence.
WACV 2025 - IEEE/CVF Winter Conference on Applications of Computer Vision. Tucson, AZ, USA, Feb 28-Mar 04, 2025. DOI
[16]
A. Davtyan • S. Sameni • B. Ommer • P. Favaro
CAGE: Unsupervised Visual Composition and Animation for Controllable Video Generation.
AAAI 2025 - 39th Conference on Artificial Intelligence. Philadelphia, PA, USA, Feb 25-Mar 04, 2025. DOI GitHub
[15]
M. Gui • J. Schusterbauer • U. Prestel • P. Ma • D. Kotovenko • O. Grebenkova • S. A. Baumann • V. T. Hu • B. Ommer
DepthFM: Fast Generative Monocular Depth Estimation with Flow Matching.
AAAI 2025 - 39th Conference on Artificial Intelligence. Philadelphia, PA, USA, Feb 25-Mar 04, 2025. Oral Presentation. DOI
[14]
P. Ma • L. Rietdorf • D. Kotovenko • V. T. Hu • B. Ommer
Does VLM Classification Benefit from LLM Description Semantics?
Invited Talk @AAAI 2025 - 39th Conference on Artificial Intelligence. Philadelphia, PA, USA, Feb 25-Mar 04, 2025. Invited Talk. DOI
[13]
M. Fuest • V. T. Hu • B. Ommer
MaskFlow: Discrete Flows For Flexible and Efficient Long Video Generation.
Preprint (Feb. 2025). arXiv
[12]
E. Eulig • F. Jäger • J. Maier • B. Ommer • M. Kachelrieß
Reconstructing and analyzing the invariances of low-dose CT image denoising networks.
Medical Physics 52. Jan. 2025. DOI
2024
[11]
J. Wang • M. Ghahremani • Y. Li • B. Ommer • C. Wachinger
Stable-Pose: Leveraging Transformers for Pose-Guided Text-to-Image Generation.
NeurIPS 2024 - 38th Conference on Neural Information Processing Systems. Vancouver, Canada, Dec 10-15, 2024. DOI GitHub
[10]
[9]
J. Wang • Z. Qin • Y. Zhang • V. T. Hu • B. Ommer • R. Briq • S. Kesselheim
Scaling Image Tokenizers with Grouped Spherical Quantization.
Preprint (Dec. 2024). arXiv
[8]
V. T. Hu • S. A. Baumann • M. Gui • O. Grebenkova • P. Ma • J. Schusterbauer • B. Ommer
ZigMa: A DiT-style Zigzag Mamba Diffusion Model.
ECCV 2024 - 18th European Conference on Computer Vision. Milano, Italy, Sep 29-Oct 04, 2024. DOI GitHub
[7]
D. Kotovenko • O. Grebenkova • N. Sarafianos • A. Paliwal • P. Ma • O. Poursaeed • S. Mohan • Y. Fan • Y. Li • R. Ranjan • B. Ommer
WaSt-3D: Wasserstein-2 Distance for Scene-to-Scene Stylization on 3D Gaussians.
ECCV 2024 - 18th European Conference on Computer Vision. Milano, Italy, Sep 29-Oct 04, 2024. DOI GitHub
[6]
N. Stracke • S. A. Baumann • J. M. Susskind • M. A. Bautista • B. Ommer
CTRLorALTer: Conditional LoRAdapter for Efficient 0-Shot Control and Altering of T2I Models.
ECCV 2024 - 18th European Conference on Computer Vision. Milano, Italy, Sep 29-Oct 04, 2024. DOI GitHub
[5]
J. Schusterbauer • M. Gui • P. Ma • N. Stracke • S. A. Baumann • V. T. Hu • B. Ommer
FMBoost: Boosting Latent Diffusion with Flow Matching.
ECCV 2024 - 18th European Conference on Computer Vision. Milano, Italy, Sep 29-Oct 04, 2024. Oral Presentation. DOI GitHub
[4]
E. Eulig • B. Ommer • M. Kachelrieß
Benchmarking deep learning-based low-dose CT image denoising algorithms.
Medical Physics 51. Sep. 2024. DOI
2023
[3]
A. Farshad • Y. Yeganeh • Y. Chi • C. Shen • B. Ommer • N. Navab
Scenegenie: Scene graph guided diffusion models for image synthesis.
Workshop @ICCV 2023 - Workshop at the IEEE/CVF International Conference on Computer Vision. Paris, France, Oct 02-06, 2023. DOI
[2]
D. Kotovenko • P. Ma • T. Milbich • B. Ommer
Cross-Image-Attention for Conditional Embeddings in Deep Metric Learning.
CVPR 2023 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada, Jun 18-23, 2023. DOI
2022
[1]
A. Blattmann • R. Rombach • K. Oktay • B. Ommer
Retrieval-Augmented Diffusion Models.
NeurIPS 2022 - 36th Conference on Neural Information Processing Systems. New Orleans, LA, USA, Nov 28-Dec 09, 2022. DOI
2024-12-27 - Last modified: 2026-06-02