13.10.2024

©TUM
Success for the Group of Nassir Navab at MICCAI 2024
Multiple Awards Highlight Their Research in Medical Imaging and AI
The group of our PI Nassir Navab achieved remarkable recognition at MICCAI 2024, one of the world’s leading conferences in medical image computing and computer‑assisted interventions. Their outstanding research was honored with multiple prestigious awards across various tracks.
MICCAI Best Paper Runner‑up
ORacle: Large Vision-Language Models for Knowledge-Guided Holistic OR Domain Modeling.
MICCAI 2024 - 27th International Conference on Medical Image Computing and Computer Assisted Intervention. Marrakesh, Morocco, Oct 06-10, 2024. Main Conference Best Paper Runner-up. DOI GitHub
Abstract
Every day, countless surgeries are performed worldwide, each within the distinct settings of operating rooms (ORs) that vary not only in their setups but also in the personnel, tools, and equipment used. This inherent diversity poses a substantial challenge for achieving a holistic understanding of the OR, as it requires models to generalize beyond their initial training datasets. To reduce this gap, we introduce ORacle, an advanced vision-language model designed for holistic OR domain modeling, which incorporates multi-view and temporal capabilities and can leverage external knowledge during inference, enabling it to adapt to previously unseen surgical scenarios. This capability is further enhanced by our novel data augmentation framework, which significantly diversifies the training dataset, ensuring ORacle’s proficiency in applying the provided knowledge effectively. In rigorous testing, in scene graph generation, and downstream tasks on the 4D-OR dataset, ORacle not only demonstrates state-of-the-art performance but does so requiring less data than existing models. Furthermore, its adaptability is displayed through its ability to interpret unseen views, actions, and appearances of tools and equipment. This demonstrates ORacle’s potential to significantly enhance the scalability and affordability of OR domain modeling and opens a pathway for future advancements in surgical data science.
MCML Authors
Computer Aided Medical Procedures & Augmented Reality
Computer Aided Medical Procedures & Augmented Reality
ICCAI GRAIL Best Paper
SANGRIA: Surgical Video Scene Graph Optimization for Surgical Workflow Prediction.
GRAIL @MICCAI 2024 - 6th Workshop on GRaphs in biomedicAl Image anaLysis at the 27th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2024). Marrakesh, Morocco, Oct 06-10, 2024. GRAIL @MICCAI 2024 Best Paper. DOI
Abstract
Graph-based holistic scene representations facilitate surgical workflow understanding and have recently demonstrated significant success. However, this task is often hindered by the limited availability of densely annotated surgical scene data. In this work, we introduce an end-to-end framework for the generation and optimization of surgical scene graphs on a downstream task. Our approach leverages the flexibility of graph-based spectral clustering and the generalization capability of foundation models to generate unsupervised scene graphs with learnable properties. We reinforce the initial spatial graph with sparse temporal connections using local matches between consecutive frames to predict temporally consistent clusters across a temporal neighborhood. By jointly optimizing the spatiotemporal relations and node features of the dynamic scene graph with the downstream task of phase segmentation, we address the costly and annotation-burdensome task of semantic scene comprehension and scene graph generation in surgical videos using only weak surgical phase labels. Further, by incorporating effective intermediate scene representation disentanglement steps within the pipeline, our solution outperforms the SOTA on the CATARACTS dataset by 8% accuracy and 10% F1 score in surgical workflow recognition.
MCML Authors
Computer Aided Medical Procedures & Augmented Reality
MICCAI CLIP Best Paper
CloverNet – Leveraging Planning Annotations for Enhanced Procedural MR Segmentation: An Application to Adaptive Radiation Therapy.
CLIP @MICCAI 2024 - 13th International Workshop on Clinical Image-Based Procedures at the 27th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2024). Marrakesh, Morocco, Oct 06-10, 2024. CLIP @MICCAI 2024 Best Paper. DOI
Abstract
In radiation therapy (RT), an accurate delineation of the regions of interest (ROI) and organs at risk (OAR) allows for a more targeted irradiation with reduced side effects. The current clinical workflow for combined MR-linear accelerator devices (MR-linacs) requires the acquisition of a planning MR volume (MR-P), in which the ROI and OAR are accurately segmented by the clinical team. These segmentation maps (S-P) are transferred to the MR acquired on the day of the RT fraction (MR-Fx) using registration, followed by time-consuming manual corrections. The goal of this paper is to enable accurate automatic segmentation of MR-Fx using S-P without clinical workflow disruption. We propose a novel UNet-based architecture, CloverNet, that takes as inputs MR-Fx and S-P in two separate encoder branches, whose latent spaces are concatenated in the bottleneck to generate an improved segmentation of MP-Fx. CloverNet improves the absolute Dice Score by 3.73% (relative +4.34%, p<0.001) when compared with conventional 3D UNet. Moreover, we believe this approach is potentially applicable to other longitudinal use cases in which a prior segmentation of the ROI is available.
MCML Authors
MICCAI EARTH Best Paper
VISAGE: Video Synthesis using Action Graphs for Surgery.
EARTH @MICCAI 2024 - Workshop on Embodied AI and Robotics for HealTHcare at the 27th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2024). Marrakesh, Morocco, Oct 06-10, 2024. EARTH @MICCAI 2024 Best Paper. DOI
Abstract
Surgical data science (SDS) is a field that analyzes patient data before, during, and after surgery to improve surgical outcomes and skills. However, surgical data is scarce, heterogeneous, and complex, which limits the applicability of existing machine learning methods. In this work, we introduce the novel task of future video generation in laparoscopic surgery. This task can augment and enrich the existing surgical data and enable various applications, such as simulation, analysis, and robot-aided surgery. Ultimately, it involves not only understanding the current state of the operation but also accurately predicting the dynamic and often unpredictable nature of surgical procedures. Our proposed method, VISAGE (VIdeo Synthesis using Action Graphs for Surgery), leverages the power of action scene graphs to capture the sequential nature of laparoscopic procedures and utilizes diffusion models to synthesize temporally coherent video sequences. VISAGE predicts the future frames given only a single initial frame, and the action graph triplets. By incorporating domain-specific knowledge through the action graph, VISAGE ensures the generated videos adhere to the expected visual and motion patterns observed in real laparoscopic procedures. The results of our experiments demonstrate high-fidelity video generation for laparoscopy procedures, which enables various applications in SDS.
MCML Authors
Computer Aided Medical Procedures & Augmented Reality
Computer Aided Medical Procedures & Augmented Reality
MICCAI ASMUS Best Paper Runner‑up
PHOCUS: Physics-Based Deconvolution for Ultrasound Resolution Enhancement.
ASMUS @MICCAI 2024 - 5th International Workshop on Advances in Simplifying Medical Ultrasound at the 27th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2024). Marrakesh, Morocco, Oct 06-10, 2024. ASMUS @MICCAI 2024 Best Paper. DOI
Abstract
Ultrasound is widely used in medical diagnostics allowing for accessible and powerful imaging but suffers from resolution limitations due to diffraction and the finite aperture of the imaging system, which restricts diagnostic use. The impulse function of an ultrasound imaging system is called the point spread function (PSF), which is convolved with the spatial distribution of reflectors in the image formation process. Recovering high-resolution reflector distributions by removing image distortions induced by the convolution process improves image clarity and detail. Conventionally, deconvolution techniques attempt to rectify the imaging system’s dependent PSF, working directly on the radio-frequency (RF) data. However, RF data is often not readily accessible. Therefore, we introduce a physics-based deconvolution process using a modeled PSF, working directly on the more commonly available B-mode images. By leveraging Implicit Neural Representations (INRs), we learn a continuous mapping from spatial locations to their respective echogenicity values, effectively compensating for the discretized image space. Our contribution consists of a novel methodology for retrieving a continuous echogenicity map directly from a B-mode image through a differentiable physics-based rendering pipeline for ultrasound resolution enhancement. We qualitatively and quantitatively evaluate our approach on synthetic data, demonstrating improvements over traditional methods in metrics such as PSNR and SSIM. Furthermore, we show qualitative enhancements on an ultrasound phantom and an in-vivo acquisition of a carotid artery.
MCML Authors
Computer Aided Medical Procedures & Augmented Reality
Computer Aided Medical Procedures & Augmented Reality
Computer Aided Medical Procedures & Augmented Reality
Congratulations to Nassir Navab and his entire team on this success at MICCAI 2024!
13.10.2024
Related

18.07.2025
Outstanding Paper Award at ICML 2025 for MCML Researchers
MCML researchers win ICML 2025 Outstanding Paper Award for work on prediction and identifying the worst-off.

15.07.2025
ERC Proof of Concept Grants for Fabian Theis
Fabian Theis receives ERC Proof of Concept Grants for his project on deep learning methods for dynamic single-cell data analysis.

15.07.2025
Nassir Navab Receives 2025 IEEE EMBS Technical Achievement Award
IEEE EMBS honors Nassir Navab for groundbreaking contributions to medical AR and surgical technology at the 2025 EMBC Conference.

©LMU
07.07.2025
Anne-Laure Boulesteix Receives Reinhart Koselleck Grant
Anne-Laure Boulesteix receives DFG grant for a project on improving empirical studies of statistical methods.