
Research Group Daniel Rückert


Daniel Rückert

Daniel Rückert is Alexander von Humboldt Professor for AI in Medicine and Healthcare at TU Munich. He is also a Professor at Imperial College London.

He gained an MSc from Technical University Berlin in 1993 and a PhD from Imperial College in 1997, followed by a postdoc at King’s College London. In 1999 he joined Imperial College as a Lecturer, becoming Senior Lecturer in 2003 and full Professor in 2005. From 2016 to 2020 he served as Head of the Department of Computing at Imperial College. His research focuses on Artificial Intelligence and Machine Learning and their application to medicine and healthcare. In 2025, he received Germany’s highest research honor, the Gottfried Wilhelm Leibniz Prize, for his groundbreaking work in AI-assisted medical imaging.

Team members @MCML

PostDocs

Dr. Julian Suk

PhD Students


Varma Aswathi


Niklas Bubeck


Laurin Lux


David Mildenberger


Nil Stolt-Ansó


Reihaneh Torkzadehmahani


Clara Sophie Vetter

Recent News @MCML


01.09.2025

AI for Personalized Psychiatry - With Researcher Clara Vetter


29.07.2025

AI Research by Daniel Rückert Improves Medical Imaging and Data Privacy


10.06.2025

MCML Researchers With 34 Papers at CVPR 2025


04.06.2025

Daniel Rückert Elected Fellow of the Royal Society


23.04.2025

MCML Researchers With 52 Papers at ICLR 2025

Publications @MCML

2025


[74] A Conference
M. Dannecker and D. Rückert.
Predicting Longitudinal Brain Development via Implicit Neural Representations.
MICCAI 2025 - 28th International Conference on Medical Image Computing and Computer Assisted Intervention. Daejeon, Republic of Korea, Sep 23-27, 2025. To be published. Preprint available. PDF
Abstract

Predicting individualized perinatal brain development is crucial for understanding personalized neurodevelopmental trajectories; however, it remains challenging due to limited longitudinal data. While population-based atlases model generic trends, they fail to capture subject-specific growth patterns. In this work, we propose a novel approach leveraging Implicit Neural Representations (INRs) to predict individualized brain growth over multiple weeks. Our method learns from a limited dataset of less than 100 paired fetal and neonatal subjects, sampled from the developing Human Connectome Project. The trained model demonstrates accurate personalized future and past trajectory predictions from a single calibration scan. By incorporating conditional external factors such as birth age or birth weight, our model further allows the simulation of neurodevelopment under varying conditions. We evaluate our method against established perinatal brain atlases, demonstrating higher prediction accuracy and fidelity up to 20 weeks. Finally, we explore the method’s ability to reveal subject-specific cortical folding patterns under varying factors like birth weight, further advocating its potential for personalized neurodevelopmental analysis.

MCML Authors

[73] A Conference
D. Scholz, A. C. Erdur, V. Ehm, A. Meyer-Baese, J. C. Peeken, D. Rückert and B. Wiestler.
MM-DINOv2: Adapting Foundation Models for Multi-Modal Medical Image Analysis.
MICCAI 2025 - 28th International Conference on Medical Image Computing and Computer Assisted Intervention. Daejeon, Republic of Korea, Sep 23-27, 2025. To be published. Preprint available. arXiv
Abstract

Vision foundation models like DINOv2 demonstrate remarkable potential in medical imaging despite their origin in natural image domains. However, their design inherently works best for uni-modal image analysis, limiting their effectiveness for multi-modal imaging tasks that are common in many medical fields, such as neurology and oncology. While supervised models perform well in this setting, they fail to leverage unlabeled datasets and struggle with missing modalities, a frequent challenge in clinical settings. To bridge these gaps, we introduce MM-DINOv2, a novel and efficient framework that adapts the pre-trained vision foundation model DINOv2 for multi-modal medical imaging. Our approach incorporates multi-modal patch embeddings, enabling vision foundation models to effectively process multi-modal imaging data. To address missing modalities, we employ full-modality masking, which encourages the model to learn robust cross-modality relationships. Furthermore, we leverage semi-supervised learning to harness large unlabeled datasets, enhancing both the accuracy and reliability of medical predictions. Applied to glioma subtype classification from multi-sequence brain MRI, our method achieves a Matthews Correlation Coefficient (MCC) of 0.6 on an external test set, surpassing state-of-the-art supervised approaches by +11.1%. Our work establishes a scalable and robust solution for multi-modal medical imaging tasks, leveraging powerful vision foundation models pre-trained on natural images while addressing real-world clinical challenges such as missing data and limited annotations.
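The full-modality masking strategy can be sketched in a few lines: during training, entire modalities are dropped at random so the model must lean on cross-modality relationships to compensate. A toy sketch, assuming patch tokens are grouped per modality; the function name and drop probability are illustrative, not the authors' implementation:

```python
import numpy as np

def full_modality_mask(tokens, rng, p=0.3):
    """Randomly drop whole modalities (all of a modality's patch tokens
    at once). `tokens` maps modality name -> (n_tokens, dim) array.
    Illustrative sketch of full-modality masking, not the MM-DINOv2 code."""
    kept = {m: t for m, t in tokens.items() if rng.random() >= p}
    if not kept:  # always keep at least one modality as input
        m = rng.choice(list(tokens))
        kept = {m: tokens[m]}
    return kept
```

Dropping a modality wholesale (rather than individual patches) mirrors the clinical failure mode, where an entire MRI sequence is missing from a study.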

MCML Authors

[72] A Conference
D. Scholz, A. C. Erdur, R. Holland, V. Ehm, J. C. Peeken, B. Wiestler and D. Rückert.
Contrastive Anatomy-Contrast Disentanglement: A Domain-General MRI Harmonization Method.
MICCAI 2025 - 28th International Conference on Medical Image Computing and Computer Assisted Intervention. Daejeon, Republic of Korea, Sep 23-27, 2025. To be published. Preprint available. arXiv
Abstract

Magnetic resonance imaging (MRI) is an invaluable tool for clinical and research applications. Yet, variations in scanners and acquisition parameters cause inconsistencies in image contrast, hindering data comparability and reproducibility across datasets and clinical studies. Existing scanner harmonization methods, designed to address this challenge, face limitations, such as requiring traveling subjects or struggling to generalize to unseen domains. We propose a novel approach using a conditioned diffusion autoencoder with a contrastive loss and domain-agnostic contrast augmentation to harmonize MR images across scanners while preserving subject-specific anatomy. Our method enables brain MRI synthesis from a single reference image. It outperforms baseline techniques, achieving a +7% PSNR improvement on a traveling subjects dataset and +18% improvement on age regression in unseen. Our model provides robust, effective harmonization of brain MRIs to target scanners without requiring fine-tuning. This advancement promises to enhance comparability, reproducibility, and generalizability in multi-site and longitudinal clinical studies, ultimately contributing to improved healthcare outcomes.

MCML Authors

[71] A Conference
A. Selivanov, P. Müller, Ö. Turgut, N. Stolt-Ansó and D. Rückert.
Global and Local Contrastive Learning for Joint Representations from Cardiac MRI and ECG.
MICCAI 2025 - 28th International Conference on Medical Image Computing and Computer Assisted Intervention. Daejeon, Republic of Korea, Sep 23-27, 2025. To be published. Preprint available. arXiv GitHub
Abstract

An electrocardiogram (ECG) is a widely used, cost-effective tool for detecting electrical abnormalities in the heart. However, it cannot directly measure functional parameters, such as ventricular volumes and ejection fraction, which are crucial for assessing cardiac function. Cardiac magnetic resonance (CMR) is the gold standard for these measurements, providing detailed structural and functional insights, but is expensive and less accessible. To bridge this gap, we propose PTACL (Patient and Temporal Alignment Contrastive Learning), a multimodal contrastive learning framework that enhances ECG representations by integrating spatio-temporal information from CMR. PTACL uses global patient-level contrastive loss and local temporal-level contrastive loss. The global loss aligns patient-level representations by pulling ECG and CMR embeddings from the same patient closer together, while pushing apart embeddings from different patients. Local loss enforces fine-grained temporal alignment within each patient by contrasting encoded ECG segments with corresponding encoded CMR frames. This approach enriches ECG representations with diagnostic information beyond electrical activity and transfers more insights between modalities than global alignment alone, all without introducing new learnable weights. We evaluate PTACL on paired ECG-CMR data from 27,951 subjects in the UK Biobank. Compared to baseline approaches, PTACL achieves better performance in two clinically relevant tasks: (1) retrieving patients with similar cardiac phenotypes and (2) predicting CMR-derived cardiac function parameters, such as ventricular volumes and ejection fraction. Our results highlight the potential of PTACL to enhance non-invasive cardiac diagnostics using ECG.
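The global patient-level term is an InfoNCE-style symmetric contrastive loss: matched ECG/CMR embedding pairs are positives and all other patients in the batch serve as negatives. A minimal NumPy sketch under that assumption; the function name and temperature value are illustrative, not the released code:

```python
import numpy as np

def info_nce(ecg_emb, cmr_emb, temperature=0.1):
    """Symmetric InfoNCE: paired ECG/CMR embeddings of the same patient
    are pulled together, embeddings of different patients pushed apart.
    Generic sketch of patient-level contrastive alignment, not PTACL itself."""
    # L2-normalise so dot products are cosine similarities
    ecg = ecg_emb / np.linalg.norm(ecg_emb, axis=1, keepdims=True)
    cmr = cmr_emb / np.linalg.norm(cmr_emb, axis=1, keepdims=True)
    logits = ecg @ cmr.T / temperature      # (B, B); diagonal = matched pairs
    idx = np.arange(len(logits))

    def xent(l):
        # cross-entropy with the matched pair as the correct "class"
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[idx, idx].mean()

    # average over both retrieval directions (ECG->CMR and CMR->ECG)
    return 0.5 * (xent(logits) + xent(logits.T))
```

The local temporal term follows the same pattern, but contrasts encoded ECG segments against CMR frames within a single patient instead of across patients.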

MCML Authors

[70] A Conference
T. Susetzky, H. Qiu, R. Braren and D. Rückert.
A Holistic Time-Aware Classification Model for Multimodal Longitudinal Patient Data.
MICCAI 2025 - 28th International Conference on Medical Image Computing and Computer Assisted Intervention. Daejeon, Republic of Korea, Sep 23-27, 2025. To be published. Preprint available. PDF GitHub
Abstract

Current prognostic and diagnostic AI models for healthcare often limit informational input capacity by being time-agnostic and focusing on single modalities, therefore lacking the holistic perspective clinicians rely on. To address this, we introduce a Time-Aware Multi Modal Transformer Encoder (TAMME) for longitudinal medical data. Unlike most state-of-the-art models, TAMME integrates longitudinal imaging, textual, numerical, and categorical data together with temporal information. Each element is represented as the sum of embeddings for high-level categorical type, further specification of this type, time-related data, and value. This composition overcomes limitations of a closed input vocabulary, enabling generalization to novel data. Additionally, with temporal context including the delta to the preceding element, we eliminate the requirement for evenly sampled input sequences. For long-term EHRs, the model employs a novel summarization mechanism that processes sequences piecewise and prepends recent data with history representations in end-to-end training. This enables balancing recent information with historical signals via self-attention. We demonstrate TAMME’s capabilities using data from 431k+ hospital stays, 73k ICU stays, and 425k Emergency Department (ED) visits from the MIMIC dataset for clinical classification tasks: prediction of triage acuity, length of stay, and readmission. We show superior performance over state-of-the-art approaches, with gains especially from long-term data. Overall, our approach provides versatile processing of entire patient trajectories as a whole to enhance predictive performance on clinical tasks.

MCML Authors

[69] A Conference
J. Suk, J. J. Wentzel, P. Rygiel, J. Daemen, D. Rückert and J. M. Wolterink.
GReAT: leveraging geometric artery data to improve wall shear stress assessment.
ShapeMI @MICCAI 2025 - Workshop on Shape in Medical Imaging at the 28th International Conference on Medical Image Computing and Computer Assisted Intervention. Daejeon, Republic of Korea, Sep 23-27, 2025. To be published. Preprint available. arXiv
Abstract

Leveraging big data for patient care is promising in many medical fields such as cardiovascular health. For example, hemodynamic biomarkers like wall shear stress could be assessed from patient-specific medical images via machine learning algorithms, bypassing the need for time-intensive computational fluid simulation. However, it is extremely challenging to amass large-enough datasets to effectively train such models. We could address this data scarcity by means of self-supervised pre-training and foundation models given large datasets of geometric artery models. In the context of coronary arteries, leveraging learned representations to improve hemodynamic biomarker assessment has not yet been well studied. In this work, we address this gap by investigating whether a large dataset (8449 shapes) consisting of geometric models of 3D blood vessels can benefit wall shear stress assessment in coronary artery models from a small-scale clinical trial (49 patients). We create a self-supervised target for the 3D blood vessels by computing the heat kernel signature, a quantity obtained via Laplacian eigenvectors, which captures the very essence of the shapes. We show how geometric representations learned from this dataset can boost segmentation of coronary arteries into regions of low, mid, and high (time-averaged) wall shear stress even when trained on limited data.
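The heat kernel signature has a closed form once the Laplacian eigendecomposition is available: HKS(x, t) = Σ_i exp(−λ_i t)·φ_i(x)². A minimal sketch on a plain graph Laplacian; the paper computes it on 3D vessel shapes, so this shows only the formula, not their pipeline:

```python
import numpy as np

def heat_kernel_signature(laplacian, times):
    """HKS(x, t) = sum_i exp(-lambda_i * t) * phi_i(x)^2 for each vertex x.
    Rows of the result are per-vertex signatures, columns are diffusion
    times. Sketch for a dense symmetric Laplacian."""
    evals, evecs = np.linalg.eigh(laplacian)
    return np.stack(
        [(np.exp(-evals * t) * evecs**2).sum(axis=1) for t in times], axis=1
    )
```

Small t captures local geometry, large t captures global shape, which is what makes the signature a useful multi-scale self-supervised target.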

MCML Authors

[68]
C. Liu, Y. Chen, H. Shi, J. Lu, B. Jian, J. Pan, L. Cai, J. Wang, Y. Zhang, J. Li, C. I. Bercea, C. Ouyang, C. Chen, Z. Xiong, B. Wiestler, C. Wachinger, D. Rückert, W. Bai and R. Arcucci.
Does DINOv3 Set a New Medical Vision Standard?
Preprint (Sep. 2025). arXiv
Abstract

The advent of large-scale vision foundation models, pre-trained on diverse natural images, has marked a paradigm shift in computer vision. However, how well the efficacy of frontier vision foundation models transfers to specialized domains such as medical imaging remains an open question. This report investigates whether DINOv3, a state-of-the-art self-supervised vision transformer (ViT) that features strong capability in dense prediction tasks, can directly serve as a powerful, unified encoder for medical vision tasks without domain-specific pre-training. To answer this, we benchmark DINOv3 across common medical vision tasks, including 2D/3D classification and segmentation on a wide range of medical imaging modalities. We systematically analyze its scalability by varying model sizes and input image resolutions. Our findings reveal that DINOv3 shows impressive performance and establishes a formidable new baseline. Remarkably, it can even outperform medical-specific foundation models like BiomedCLIP and CT-Net on several tasks, despite being trained solely on natural images. However, we identify clear limitations: The model’s features degrade in scenarios requiring deep domain specialization, such as in Whole-Slide Pathological Images (WSIs), Electron Microscopy (EM), and Positron Emission Tomography (PET). Furthermore, we observe that DINOv3 does not consistently obey scaling laws in the medical domain; performance does not reliably increase with larger models or finer feature resolutions, showing diverse scaling behaviors across tasks. Ultimately, our work establishes DINOv3 as a strong baseline, whose powerful visual features can serve as a robust prior for multiple complex medical tasks. This opens promising future directions, such as leveraging its features to enforce multiview consistency in 3D reconstruction.

MCML Authors

[67] Top Journal
S. Starck, V. Sideri-Lampretsa, B. Kainz, M. Menten, T. T. Mueller and D. Rückert.
Diff-Def: Diffusion-Generated Deformation Fields for Conditional Atlases.
IEEE Transactions on Medical Imaging Early Access (Aug. 2025). DOI
Abstract

Anatomical atlases are widely used for population studies and analysis. Conditional atlases target a specific sub-population defined via certain conditions, such as demographics or pathologies, and allow for the investigation of fine-grained anatomical differences like morphological changes associated with ageing or disease. Existing approaches use either registration-based methods that are often unable to handle large anatomical variations or generative adversarial models, which are challenging to train since they can suffer from training instabilities. Instead of generating atlases directly as intensities, we propose using latent diffusion models to generate deformation fields, which transform a general population atlas into one representing a specific sub-population. Our approach ensures structural integrity, enhances interpretability and avoids hallucinations that may arise during direct image synthesis by generating this deformation field and regularising it using a neighbourhood of images. We compare our method to several state-of-the-art atlas generation methods using brain MR images from the UK Biobank. Our method generates highly realistic atlases with smooth transformations and high anatomical fidelity, outperforming existing baselines. We demonstrate the quality of these atlases through comprehensive evaluations, including quantitative metrics for anatomical accuracy, perceptual similarity, and qualitative analyses displaying the consistency and realism of the generated atlases.

MCML Authors

[66] Top Journal
F. Drexel, V. Sideri-Lampretsa, H. Bast, A. W. Marka, T. Koehler, F. T. Gassert, D. Pfeiffer, D. Rückert and F. Pfeiffer.
Deformable image registration of dark-field chest radiographs for functional lung assessment.
Medical Physics 52.8 (Aug. 2025). DOI
Abstract

Background: Dark-field radiography of the human chest has been demonstrated to have promising potential for the analysis of the lung microstructure and the diagnosis of respiratory diseases. However, most previous studies of dark-field chest radiographs evaluated the lung signal only in the inspiratory breathing state.
Purpose: Our work aims to add a new perspective to these previous assessments by locally comparing dark-field lung information between different respiratory states to explore new ways of functional lung imaging based on dark-field chest radiography.
Methods: We use suitable deformable image registration methods for dark-field chest radiographs to establish a mapping of lung areas in distinct breathing states. After registration, we utilize an inter-frame ratio approach to examine the local dark-field signal changes and evaluate the gradient of the craniocaudal axis projections and mean lung field values to draw a quantitative comparison to standard chest radiographs and assess the relationship with the respiratory capacity.
Results: Considering full inspiration and expiration scans from a clinical chronic obstructive pulmonary disease study, the registration framework allows us to establish an accurate spatial correspondence (Median Dice score 0.95/0.94, mean surface distance 3.71/3.52 mm, and target registration error 6.10 mm) between dark-field chest radiographs in different respiratory states and thus to perform a local signal change analysis. Compared to the utilization of standard chest radiographs, the presented approach benefits from the absence of bone and soft-tissue structures in the dark-field images, which move differently during respiration than the lung tissue. Our quantitative evaluation of the inter-frame ratios demonstrates evidence of craniocaudal gradient-sensitivity advantages concerning the relative vital lung capacity of the study participants in the dark-field images (Spearman correlation coefficients: $r_{s,\mathrm{right}}=0.55$, $p<0.01$ and $r_{s,\mathrm{left}}=0.48$, $p<0.01$ compared to the attenuation image-based gradient correlations $r_{s,\mathrm{right}}=0.20$, $p=0.16$ and $r_{s,\mathrm{left}}=0.40$, $p<0.01$). Moreover, our alternative lung field analysis approach provides insights into the distinct behavior of the dark-field signal changes with the breathing capacity, which are in good agreement with the expected lung volume changes in the respective lung regions. In quantitative terms, this is reflected in a weak Spearman correlation ($r_{s,\mathrm{upper}}=0.30$, $p=0.01$) of the mean dark-field signal ratio within the upper lung region, but strong correlations within the middle ($r_{s,\mathrm{middle}}=0.71$, $p<0.01$) and lower ($r_{s,\mathrm{lower}}=0.67$, $p<0.01$) lung region.
Conclusions: Our regional characterization of lung dark-field signal changes between the breathing states via deformable image registration provides a proof-of-principle that dynamic radiography-based lung function assessment approaches may benefit from considering registered dark-field images in addition to standard plain chest radiographs. This opens up new options for low-dose and rapid lung ventilation assessment via dark-field chest radiography that has the potential to improve lung diagnostics considerably.

MCML Authors

[65]
N. Bubeck, S. Shit, C. Chen, C. Zhao, P. Guo, D. Yang, G. Zitzlsberger, D. Xu, B. Kainz, D. Rückert and J. Pan.
Latent Interpolation Learning Using Diffusion Models for Cardiac Volume Reconstruction.
Preprint (Aug. 2025). arXiv
Abstract

Cardiac Magnetic Resonance (CMR) imaging is a critical tool for diagnosing and managing cardiovascular disease, yet its utility is often limited by the sparse acquisition of 2D short-axis slices, resulting in incomplete volumetric information. Accurate 3D reconstruction from these sparse slices is essential for comprehensive cardiac assessment, but existing methods face challenges, including reliance on predefined interpolation schemes (e.g., linear or spherical), computational inefficiency, and dependence on additional semantic inputs such as segmentation labels or motion data. To address these limitations, we propose a novel Cardiac Latent Interpolation Diffusion (CaLID) framework that introduces three key innovations. First, we present a data-driven interpolation scheme based on diffusion models, which can capture complex, non-linear relationships between sparse slices and improves reconstruction accuracy. Second, we design a computationally efficient method that operates in the latent space and speeds up 3D whole-heart upsampling time by a factor of 24, reducing computational overhead compared to previous methods. Third, with only sparse 2D CMR images as input, our method achieves SOTA performance against baseline methods, eliminating the need for auxiliary input such as morphological guidance, thus simplifying workflows. We further extend our method to 2D+T data, enabling the effective modeling of spatiotemporal dynamics and ensuring temporal coherence. Extensive volumetric evaluations and downstream segmentation tasks demonstrate that CaLID achieves superior reconstruction quality and efficiency. By addressing the fundamental limitations of existing approaches, our framework advances the state of the art for spatial and spatiotemporal whole-heart reconstruction, offering a robust and clinically practical solution for cardiovascular imaging.

MCML Authors

[64]
B. Bulut, M. Dannecker, T. Sanchez, S. N. Silva, V. Zalevskyi, S. Jia, J.-B. Ledoux, G. Auzias, F. Rousseau, J. Hutter, D. Rückert and M. Bach Cuadra.
Physics-Informed Joint Multi-TE Super-Resolution with Implicit Neural Representation for Robust Fetal T2 Mapping.
Preprint (Aug. 2025). arXiv
Abstract

T2 mapping in fetal brain MRI has the potential to improve characterization of the developing brain, especially at mid-field (0.55T), where T2 decay is slower. However, this is challenging as fetal MRI acquisition relies on multiple motion-corrupted stacks of thick slices, requiring slice-to-volume reconstruction (SVR) to estimate a high-resolution (HR) 3D volume. Currently, T2 mapping involves repeated acquisitions of these stacks at each echo time (TE), leading to long scan times and high sensitivity to motion. We tackle this challenge with a method that jointly reconstructs data across TEs, addressing severe motion. Our approach combines implicit neural representations with a physics-informed regularization that models T2 decay, enabling information sharing across TEs while preserving anatomical and quantitative T2 fidelity. We demonstrate state-of-the-art performance on simulated fetal brain and in vivo adult datasets with fetal-like motion. We also present the first in vivo fetal T2 mapping results at 0.55T. Our study shows potential for reducing the number of stacks per TE in T2 mapping by leveraging anatomical redundancy.
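The physics the regularization encodes is the mono-exponential decay S(TE) = S0·exp(−TE/T2). For intuition, a per-voxel log-linear least-squares fit of that signal model is sketched below; the paper instead builds the decay model into the implicit neural representation as a regularizer rather than fitting voxels independently like this:

```python
import numpy as np

def fit_t2(te, signal):
    """Fit S(TE) = S0 * exp(-TE / T2) by log-linear least squares:
    log S = log S0 - TE / T2 is linear in TE. Returns (S0, T2).
    Toy voxel-wise fit for illustration only."""
    slope, intercept = np.polyfit(te, np.log(signal), 1)
    return np.exp(intercept), -1.0 / slope
```

Because each echo time contributes one point on this curve, sharing anatomical information across TEs (as the joint reconstruction does) reduces how many stacks per TE are needed for a stable fit.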

MCML Authors

[63]
T. Mach, D. Rückert, A. Berger, L. Lux and I. Ezhov.
Addressing Annotation Scarcity in Hyperspectral Brain Image Segmentation with Unsupervised Domain Adaptation.
Preprint (Aug. 2025). arXiv
Abstract

This work presents a novel deep learning framework for segmenting cerebral vasculature in hyperspectral brain images. We address the critical challenge of severe label scarcity, which impedes conventional supervised training. Our approach utilizes a novel unsupervised domain adaptation methodology, using a small, expert-annotated ground truth alongside unlabeled data. Quantitative and qualitative evaluations confirm that our method significantly outperforms existing state-of-the-art approaches, demonstrating the efficacy of domain adaptation for label-scarce biomedical imaging tasks.

MCML Authors

[62]
J. Pan, B. Jian, P. Hager, Y. Zhang, C. Liu, F. Jungmann, H. B. Li, C. You, J. Wu, J. Zhu, F. Liu, Y. Liu, N. Bubeck, C. Wachinger, C. Chen, Z. Gong, C. Ouyang, G. Kaissis, B. Wiestler and D. Rückert.
Beyond Benchmarks: Dynamic, Automatic And Systematic Red-Teaming Agents For Trustworthy Medical Language Models.
Preprint (Aug. 2025). arXiv
Abstract

Ensuring the safety and reliability of large language models (LLMs) in clinical practice is critical to prevent patient harm and promote trustworthy healthcare applications of AI. However, LLMs are advancing so rapidly that static safety benchmarks often become obsolete upon publication, yielding only an incomplete and sometimes misleading picture of model trustworthiness. We demonstrate that a Dynamic, Automatic, and Systematic (DAS) red-teaming framework that continuously stress-tests LLMs can reveal significant weaknesses of current LLMs across four safety-critical domains: robustness, privacy, bias/fairness, and hallucination. A suite of adversarial agents is applied to autonomously mutate test cases, identify and evolve unsafe-triggering strategies, and evaluate responses, uncovering vulnerabilities in real time without human intervention. Applying DAS to 15 proprietary and open-source LLMs revealed a stark contrast between static benchmark performance and vulnerability under adversarial pressure. Despite a median MedQA accuracy exceeding 80%, 94% of previously correct answers failed our dynamic robustness tests. We observed similarly high failure rates across other domains: privacy leaks were elicited in 86% of scenarios, cognitive-bias priming altered clinical recommendations in 81% of fairness tests, and we identified hallucination rates exceeding 66% in widely used models. Such profound residual risks are incompatible with routine clinical practice. By converting red-teaming from a static checklist into a dynamic stress-test audit, DAS red-teaming offers the surveillance that hospitals, regulators, and technology vendors require as LLMs become embedded in patient chatbots, decision-support dashboards, and broader healthcare workflows. Our framework delivers an evolvable, scalable, and reliable safeguard for the next generation of medical AI.

MCML Authors

[61]
N. Bubeck, Y. Zhang, S. Shit, D. Rückert and J. Pan.
Reconstruct or Generate: Exploring the Spectrum of Generative Modeling for Cardiac MRI.
Preprint (Jul. 2025). arXiv
Abstract

In medical imaging, generative models are increasingly relied upon for two distinct but equally critical tasks: reconstruction, where the goal is to restore medical images (usually inverse problems like inpainting or super-resolution), and generation, where synthetic data is created to augment datasets or carry out counterfactual analysis. Despite shared architecture and learning frameworks, they prioritize different goals: generation seeks high perceptual quality and diversity, while reconstruction focuses on data fidelity and faithfulness. In this work, we introduce a ‘generative model zoo’ and systematically analyze how modern latent diffusion models and autoregressive models navigate the reconstruction-generation spectrum. We benchmark a suite of generative models across representative cardiac medical imaging tasks, focusing on image inpainting with varying masking ratios and sampling strategies, as well as unconditional image generation. Our findings show that diffusion models offer superior perceptual quality for unconditional generation but tend to hallucinate as masking ratios increase, whereas autoregressive models maintain stable perceptual performance across masking levels, albeit with generally lower fidelity.

MCML Authors

[60]
A. F. Dima, S. Shit, H. Qiu, R. Holland, T. T. Mueller, F. A. Musio, K. Yang, B. Menze, R. Braren, M. Makowski and D. Rückert.
Parametric shape models for vessels learned from segmentations via differentiable voxelization.
Preprint (Jul. 2025). arXiv
Abstract

Vessels are complex structures in the body that have been studied extensively in multiple representations. While voxelization is the most common of them, meshes and parametric models are critical in various applications due to their desirable properties. However, these representations are typically extracted through segmentations and used disjointly from each other. We propose a framework that joins the three representations under differentiable transformations. By leveraging differentiable voxelization, we automatically extract a parametric shape model of the vessels through shape-to-segmentation fitting, where we learn shape parameters from segmentations without the explicit need for ground-truth shape parameters. The vessel is parametrized as centerlines and radii using cubic B-splines, ensuring smoothness and continuity by construction. Meshes are differentiably extracted from the learned shape parameters, resulting in high-fidelity meshes that can be manipulated post-fit. Our method can accurately capture the geometry of complex vessels, as demonstrated by the volumetric fits in experiments on aortas, aneurysms, and brain vessels.
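The smoothness claim follows from the representation itself: a uniform cubic B-spline is C²-continuous by construction. A minimal sketch of evaluating one spline segment from four control points; this is illustrative only, since the paper learns the control points for centerlines and radii via differentiable voxelization:

```python
import numpy as np

def cubic_bspline_segment(ctrl, u):
    """Evaluate one uniform cubic B-spline segment at u in [0, 1] from
    four control points (rows of `ctrl`). The basis functions form a
    partition of unity, giving smooth, continuous curves by construction."""
    basis = np.array([
        (1 - u) ** 3,
        3 * u**3 - 6 * u**2 + 4,
        -3 * u**3 + 3 * u**2 + 3 * u + 1,
        u**3,
    ]) / 6.0
    return basis @ np.asarray(ctrl, dtype=float)
```

Evaluating the same basis for 3D centerline points and for scalar radii yields a tube model whose surface mesh can then be extracted differentiably.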

MCML Authors

[59]
J. Weidner, I. Ezhov, M. Balcerak, A. Datchev, L. Zimmer, D. Rückert, B. Menze and B. Wiestler.
From Fiber Tracts to Tumor Spread: Biophysical Modeling of Butterfly Glioma Growth Using Diffusion Tensor Imaging.
Preprint (Jul. 2025). arXiv
Abstract

Butterfly tumors are a distinct class of gliomas that span the corpus callosum, producing a characteristic butterfly-shaped appearance on MRI. The distinctive growth pattern of these tumors highlights how white matter fibers and structural connectivity influence brain tumor cell migration. To investigate this relation, we applied biophysical tumor growth models to a large patient cohort, systematically comparing models that incorporate fiber tract information with those that do not. Our results demonstrate that including fiber orientation data significantly improves model accuracy, particularly for a subset of butterfly tumors. These findings highlight the critical role of white matter architecture in tumor spread and suggest that integrating fiber tract information can enhance the precision of radiotherapy target volume delineation.

MCML Authors

[58] A* Conference
D. Mildenberger, P. Hager, D. Rückert and M. Menten.
A Tale of Two Classes: Adapting Supervised Contrastive Learning to Binary Imbalanced Datasets.
CVPR 2025 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA, Jun 11-15, 2025. DOI
Abstract

Supervised contrastive learning (SupCon) has proven to be a powerful alternative to the standard cross-entropy loss for classification of multi-class balanced datasets. However, it struggles to learn well-conditioned representations of datasets with long-tailed class distributions. This problem is potentially exacerbated for binary imbalanced distributions, which are commonly encountered during many real-world problems such as medical diagnosis. In experiments on seven binary datasets of natural and medical images, we show that the performance of SupCon decreases with increasing class imbalance. To substantiate these findings, we introduce two novel metrics that evaluate the quality of the learned representation space. By measuring the class distribution in local neighborhoods, we are able to uncover structural deficiencies of the representation space that classical metrics cannot detect. Informed by these insights, we propose two new supervised contrastive learning strategies tailored to binary imbalanced datasets that improve the structure of the representation space and increase downstream classification accuracy over standard SupCon by up to 35%. We make our code available.
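The kind of neighborhood-based diagnostic the paper describes can be sketched directly: for each sample, measure the fraction of its k nearest neighbours in the representation space that share its label, then average per class. This is an illustrative stand-in assuming Euclidean distances, not the paper's exact metric definitions:

```python
import numpy as np

def local_class_balance(embeddings, labels, k=5):
    """Per-class average fraction of each sample's k nearest neighbours
    (in representation space) sharing the sample's label. Values near 1
    indicate well-separated class neighbourhoods; low minority-class
    values reveal structural deficiencies that global metrics can miss."""
    d = np.linalg.norm(embeddings[:, None] - embeddings[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)              # exclude self-matches
    nn = np.argsort(d, axis=1)[:, :k]        # indices of k nearest neighbours
    same = (labels[nn] == labels[:, None]).mean(axis=1)
    return {c: float(same[labels == c].mean()) for c in np.unique(labels)}
```

A global accuracy score can look healthy while the minority class's local neighbourhoods are dominated by the majority class, which is exactly the failure mode such local metrics expose.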

MCML Authors

[57]
J. Kaiser, J. Eigenmann, D. Rückert and G. Kaissis.
User-Level Differential Privacy in Medical Machine Learning.
TPDP 2025 - Workshop on Theory and Practice of Differential Privacy. Google, Mountain View, CA, USA, Jun 02-03, 2025. PDF
Abstract

We address the challenge of ensuring user-level differential privacy (DP) when individuals contribute varying numbers of data records to a dataset. While group privacy can be used to aggregate record-level budgets, it can be overly pessimistic and lacks flexibility when users contribute varying numbers of data points. We propose a method for accounting for arbitrary numbers of records per user while maintaining a fixed per-user privacy guarantee by leveraging individual privacy assignment. Experimentally, our method yields excellent utility comparable to record-level DP while providing a more meaningful and interpretable protection.
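The core idea — bounding each user's total influence regardless of how many records they contribute — can be sketched with a minimal user-level aggregation example. This is an illustration only, not the paper's individual privacy assignment; the function name and parameters are hypothetical:

```python
import numpy as np

def user_level_dp_mean(user_records, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Differentially private mean with a fixed per-user sensitivity,
    independent of how many records each user contributes (sketch)."""
    if rng is None:
        rng = np.random.default_rng(0)
    # Aggregate each user's records first, then clip the per-user summary:
    # this bounds any single user's influence by `clip_norm`.
    clipped = []
    for records in user_records:
        m = np.mean(records, axis=0)
        norm = np.linalg.norm(m)
        if norm > clip_norm:
            m = m * (clip_norm / norm)
        clipped.append(m)
    total = np.sum(clipped, axis=0)
    # Gaussian noise calibrated to the per-user sensitivity `clip_norm`.
    noisy = total + rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return noisy / len(user_records)
```

Because clipping is applied to the per-user aggregate rather than per record, a user with fifty records has no more influence on the output than a user with one.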

MCML Authors

[56] Top Journal
C. S. Vetter, A. Bender, D. B. Dwyer, M. Montembeault, A. Ruef, K. Chrisholm, L. Kambeitz-Ilankovic, L. A. Antonucci, S. Ruhrmann, J. Kambeitz, M. Lichtenstein, A. Riecher, R. Upthegrove, R. K. R. Salokangas, J. Hietala, C. Pantelis, R. Lencer, E. Meisenzahl, S. Wood, P. Brambilla, S. Borgwardt, P. Falkai, A. Bertolino, N. Koutsouleris and PRONIA Consortium.
Exploring the Predictive Value of Structural Covariance Networks for the Diagnosis of Schizophrenia.
Frontiers in Psychiatry 16 (Jun. 2025). DOI
Abstract

Schizophrenia is a psychiatric disorder hypothesized to result from disturbed brain connectivity. Structural covariance networks (SCN) describe the shared variation in morphological properties emerging from coordinated neurodevelopmental processes and may, thus, be a promising diagnostic biomarker for schizophrenia. We compared the diagnostic value of two SCN computation methods derived from regional gray matter volume (GMV) in 154 patients with a diagnosis of first-episode psychosis or recurrent schizophrenia (PAT) and 366 healthy control individuals (HC). The first method (REF-SCN) quantifies the contribution of an individual to a normative reference group’s SCN, and the second approach (KLS-SCN) uses a symmetric version of the Kullback-Leibler divergence. Their diagnostic value compared to regional GMV was assessed in a stepwise analysis using a series of linear support vector machines within a nested cross-validation framework and stacked generalization. All models were externally validated in an independent sample (N_PAT=71, N_HC=74), SCN feature importance was assessed, and the derived risk scores were analyzed for differential relationships with clinical variables. We found that models trained on SCNs were able to classify patients with schizophrenia, and that combining SCNs and regional GMV in a stacked model improved training (balanced accuracy (BAC)=69.96%) and external validation performance (BAC=67.10%). Among all unimodal models, the highest discovery sample performance was achieved by a model trained on REF-SCN (BAC=67.03%). All model decisions were driven by widespread structural covariance alterations involving the somato-motor, default mode, control, visual, and ventral attention networks. Risk estimates derived from KLS-SCNs and regional GMV, but not REF-SCNs, could be predicted from clinical variables, driven especially by body mass index (BMI) and affect-related negative symptoms.
These patterns of results show that different SCN computation approaches capture different aspects of the disease. While REF-SCNs contain valuable information for discriminating schizophrenia from healthy control individuals, KLS-SCNs may capture more nuanced symptom-level characteristics similar to those captured by PCA of regional GMV.
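The symmetrised Kullback-Leibler divergence underlying the KLS-SCN approach can be illustrated for discrete distributions. This is a generic sketch of the divergence itself, not the paper's exact SCN computation; the function name is hypothetical:

```python
import numpy as np

def symmetric_kl(p, q, eps=1e-12):
    """Symmetrised KL divergence KL(p||q) + KL(q||p) for discrete
    distributions; `eps` guards against zero probabilities."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))
```

Unlike the plain KL divergence, the symmetrised form treats both distributions equally, which makes it a proper dissimilarity measure between two subjects' regional profiles.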

MCML Authors

[55]
C. J. Mertens, H. Häntze, S. Ziegelmayer, J. N. Kather, D. Truhn, S. H. Kim, F. Busch, D. Weller, B. Wiestler, M. Graf, F. Bamberg, C. L. Schlett, J. B. Weiss, S. Ringhof, E. Can, J. Schulz-Menger, T. Niendorf, J. Lammert, I. Molwitz, A. Kader, A. Hering, A. Meddeb, J. Nawabi, M. B. Schulze, T. Keil, S. N. Willich, L. Krist, M. Hadamitzky, A. Hannemann, F. Bassermann, D. Rückert, T. Pischon, A. Hapfelmeier, M. R. Makowski, K. K. Bressem and L. C. Adams.
Deep learning-enabled MRI phenotyping uncovers regional body composition heterogeneity and disease associations in two European population cohorts.
Preprint (Jun. 2025). DOI
Abstract

Body mass index (BMI) does not account for substantial inter-individual differences in regional fat and muscle compartments, which are relevant for the prevalence of cardiometabolic and cancer conditions. We applied a validated deep learning pipeline for automated segmentation of whole-body MRI scans in 45,851 adults from the UK Biobank and German National Cohort, enabling harmonized quantification of visceral (VAT), gluteofemoral (GFAT), and abdominal subcutaneous adipose tissue (ASAT), liver fat fraction (LFF), and trunk muscle volume. Associations with clinical conditions were evaluated using compartment measures adjusted for age, sex, height, and BMI. Our analysis demonstrates that regional adiposity and muscle volume show distinct associations with cardiometabolic and cancer prevalence, and that substantial disease heterogeneity exists within BMI strata. The analytic framework and reference data presented here will support future risk stratification efforts and facilitate the integration of automated MRI phenotyping into large-scale population and clinical research.

MCML Authors
Link to Profile Benedikt Wiestler

Benedikt Wiestler

Prof. Dr.

Principal Investigator


[54]
A. H. Berger, L. Lux, A. Weers, M. Menten, D. Rückert and J. C. Paetzold.
Pitfalls of topology-aware image segmentation.
IPMI 2025 - Information Processing in Medical Imaging. Kos Island, Greece, May 25-30, 2025. DOI
Abstract

Topological correctness, i.e., the preservation of structural integrity and specific characteristics of shape, is a fundamental requirement for medical imaging tasks, such as neuron or vessel segmentation. Despite the recent surge in topology-aware methods addressing this challenge, their real-world applicability is hindered by flawed benchmarking practices. In this paper, we identify critical pitfalls in model evaluation that include inadequate connectivity choices, overlooked topological artifacts in ground truth annotations, and inappropriate use of evaluation metrics. Through detailed empirical analysis, we uncover these issues’ profound impact on the evaluation and ranking of segmentation methods. Drawing from our findings, we propose a set of actionable recommendations to establish fair and robust evaluation standards for topology-aware medical image segmentation methods.

MCML Authors

[53]
S. Lockfisch, K. Schwethelm, M. Menten, R. Braren, D. Rückert, A. Ziller and G. Kaissis.
On Arbitrary Predictions from Equally Valid Models.
Preprint (May. 2025). arXiv
Abstract

Model multiplicity refers to the existence of multiple machine learning models that describe the data equally well but may produce different predictions on individual samples. In medicine, these models can admit conflicting predictions for the same patient – a risk that is poorly understood and insufficiently addressed.
In this study, we empirically analyze the extent, drivers, and ramifications of predictive multiplicity across diverse medical tasks and model architectures, and show that even small ensembles can mitigate or even eliminate predictive multiplicity in practice. Our analysis reveals that (1) standard validation metrics fail to identify a uniquely optimal model and (2) a substantial share of predictions hinges on arbitrary choices made during model development. Using multiple models instead of a single model reveals instances where predictions differ across equally plausible models – highlighting patients that would receive arbitrary diagnoses if any single model were used. In contrast, (3) a small ensemble paired with an abstention strategy can effectively mitigate measurable predictive multiplicity in practice; predictions with high inter-model consensus may thus be amenable to automated classification. While accuracy is not a principled antidote to predictive multiplicity, we find that (4) higher accuracy achieved through increased model capacity reduces predictive multiplicity. Our findings underscore the clinical importance of accounting for model multiplicity and advocate for ensemble-based strategies to improve diagnostic reliability. In cases where models fail to reach sufficient consensus, we recommend deferring decisions to expert review.
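The ensemble-with-abstention strategy in point (3) can be sketched as simple consensus voting. This is a minimal illustration assuming binary labels; the function name and threshold are hypothetical, not the paper's exact procedure:

```python
import numpy as np

def predict_with_abstention(model_preds, consensus=0.8):
    """model_preds: (n_models, n_samples) array of binary predictions.
    Returns the majority vote where inter-model consensus reaches the
    threshold, and -1 (defer to expert review) otherwise."""
    preds = np.asarray(model_preds)
    frac_pos = preds.mean(axis=0)             # fraction of models voting 1
    majority = (frac_pos >= 0.5).astype(int)
    agreement = np.maximum(frac_pos, 1.0 - frac_pos)
    return np.where(agreement >= consensus, majority, -1)
```

For three models predicting [[1,1,0],[1,0,0],[1,1,0]], only the middle sample falls below 80% agreement and is deferred, yielding [1, -1, 0].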

MCML Authors

[52] A* Conference
S. Dahan, G. Bénédict, L. Z. J. Williams, Y. Guo, D. Rückert, R. Leech and E. C. Robinson.
SIM: Surface-based fMRI Analysis for Inter-Subject Multimodal Decoding from Movie-Watching Experiments.
ICLR 2025 - 13th International Conference on Learning Representations. Singapore, Apr 24-28, 2025. URL GitHub
Abstract

Current AI frameworks for brain decoding and encoding typically train and test models within the same datasets. This limits their utility for brain computer interfaces (BCI) or neurofeedback, for which it would be useful to pool experiences across individuals to better simulate stimuli not sampled during training. A key obstacle to model generalisation is the degree of variability of inter-subject cortical organisation, which makes it difficult to align or compare cortical signals across participants. In this paper we address this through the use of surface vision transformers, which build a generalisable model of cortical functional dynamics, through encoding the topography of cortical networks and their interactions as a moving image across a surface. This is then combined with tri-modal self-supervised contrastive (CLIP) alignment of audio, video, and fMRI modalities to enable the retrieval of visual and auditory stimuli from patterns of cortical activity (and vice-versa). We validate our approach on 7T task-fMRI data from 174 healthy participants engaged in the movie-watching experiment from the Human Connectome Project (HCP). Results show that it is possible to detect which movie clips an individual is watching purely from their brain activity, even for individuals and movies not seen during training. Further analysis of attention maps reveals that our model captures individual patterns of brain activity that reflect semantic and visual systems. This opens the door to future personalised simulations of brain function.

MCML Authors

[51] A* Conference
J. Kaiser, K. Schwethelm, D. Rückert and G. Kaissis.
Laplace Sample Information: Data Informativeness Through a Bayesian Lens.
ICLR 2025 - 13th International Conference on Learning Representations. Singapore, Apr 24-28, 2025. URL
Abstract

Accurately estimating the informativeness of individual samples in a dataset is an important objective in deep learning, as it can guide sample selection, which can improve model efficiency and accuracy by removing redundant or potentially harmful samples. We propose Laplace Sample Information (LSI), a measure of sample informativeness grounded in information theory that is widely applicable across model architectures and learning settings. LSI leverages a Bayesian approximation to the weight posterior and the KL divergence to measure the change in the parameter distribution induced by a sample of interest from the dataset. We experimentally show that LSI is effective in ordering the data with respect to typicality, detecting mislabeled samples, measuring class-wise informativeness, and assessing dataset difficulty. We demonstrate these capabilities of LSI on image and text data in supervised and unsupervised settings. Moreover, we show that LSI can be computed efficiently through probes and transfers well to the training of large models.
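At its core, LSI compares two Laplace-approximated weight posteriors via a KL divergence. For diagonal Gaussian approximations that divergence has a closed form, sketched below. This illustrates only the building block, not the paper's full pipeline; the function name is hypothetical:

```python
import numpy as np

def kl_diag_gaussians(mu1, var1, mu2, var2):
    """KL( N(mu1, diag(var1)) || N(mu2, diag(var2)) ) in closed form."""
    mu1, var1 = np.asarray(mu1, float), np.asarray(var1, float)
    mu2, var2 = np.asarray(mu2, float), np.asarray(var2, float)
    return 0.5 * float(np.sum(
        np.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0
    ))
```

With equal unit variances the expression reduces to half the squared distance between the means, which matches the intuition that a sample is informative when its inclusion shifts the posterior substantially.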

MCML Authors

Georgios Kaissis

Dr.

Associate

* Former Associate


[50] A* Conference
L. Lux, A. H. Berger, A. Weers, N. Stucki, D. Rückert, U. Bauer and J. C. Paetzold.
Topograph: An efficient Graph-Based Framework for Strictly Topology Preserving Image Segmentation.
ICLR 2025 - 13th International Conference on Learning Representations. Singapore, Apr 24-28, 2025. URL
Abstract

Topological correctness plays a critical role in many image segmentation tasks, yet most networks are trained using pixel-wise loss functions, such as Dice, neglecting topological accuracy. Existing topology-aware methods often lack robust topological guarantees, are limited to specific use cases, or impose high computational costs. In this work, we propose a novel, graph-based framework for topologically accurate image segmentation that is both computationally efficient and generally applicable. Our method constructs a component graph that fully encodes the topological information of both the prediction and ground truth, allowing us to efficiently identify topologically critical regions and aggregate a loss based on local neighborhood information. Furthermore, we introduce a strict topological metric capturing the homotopy equivalence between the union and intersection of prediction-label pairs. We formally prove the topological guarantees of our approach and empirically validate its effectiveness on binary and multi-class datasets. Our loss demonstrates state-of-the-art performance with up to fivefold faster loss computation compared to persistent homology methods.

MCML Authors

[49] Top Journal
Ö. Turgut, P. Müller, P. Hager, S. Shit, S. Starck, M. Menten, E. Martens and D. Rückert.
Unlocking the diagnostic potential of electrocardiograms through information transfer from cardiac magnetic resonance imaging.
Medical Image Analysis 101.103451 (Apr. 2025). DOI GitHub
Abstract

Cardiovascular diseases (CVD) can be diagnosed using various diagnostic modalities. The electrocardiogram (ECG) is a cost-effective and widely available diagnostic aid that provides functional information of the heart. However, its ability to classify and spatially localise CVD is limited. In contrast, cardiac magnetic resonance (CMR) imaging provides detailed structural information of the heart and thus enables evidence-based diagnosis of CVD, but long scan times and high costs limit its use in clinical routine. In this work, we present a deep learning strategy for cost-effective and comprehensive cardiac screening solely from ECG. Our approach combines multimodal contrastive learning with masked data modelling to transfer domain-specific information from CMR imaging to ECG representations. In extensive experiments using data from 40,044 UK Biobank subjects, we demonstrate the utility and generalisability of our method for subject-specific risk prediction of CVD and the prediction of cardiac phenotypes using only ECG data. Specifically, our novel multimodal pre-training paradigm improves performance by up to 12.19% for risk prediction and 27.59% for phenotype prediction. In a qualitative analysis, we demonstrate that our learned ECG representations incorporate information from CMR image regions of interest.

MCML Authors

[48]
K. Schwethelm, J. Kaiser, M. Knolle, S. Lockfisch, D. Rückert and A. Ziller.
Visual Privacy Auditing with Diffusion Models.
Transactions on Machine Learning Research (Mar. 2025). URL
Abstract

Data reconstruction attacks on machine learning models pose a substantial threat to privacy, potentially leaking sensitive information. Although defending against such attacks using differential privacy (DP) provides theoretical guarantees, determining appropriate DP parameters remains challenging. Current formal guarantees on the success of data reconstruction suffer from overly stringent assumptions regarding adversary knowledge about the target data, particularly in the image domain, raising questions about their real-world applicability. In this work, we empirically investigate this discrepancy by introducing a reconstruction attack based on diffusion models (DMs) that only assumes adversary access to real-world image priors and specifically targets the DP defense. We find that (1) real-world data priors significantly influence reconstruction success, (2) current reconstruction bounds do not model the risk posed by data priors well, and (3) DMs can serve as heuristic auditing tools for visualizing privacy leakage.

MCML Authors

[47]
M. Hartenberger, H. Ayaz, F. Ozlugedik, C. Caredda, L. Giannoni, F. Lange, L. Lux, J. Weidner, A. Berger, F. Kofler, M. Menten, B. Montcel, I. Tachtsidis, D. Rückert and I. Ezhov.
Redefining spectral unmixing for in-vivo brain tissue analysis from hyperspectral imaging.
Preprint (Mar. 2025). arXiv
Abstract

In this paper, we propose a methodology for extracting molecular tumor biomarkers from hyperspectral imaging (HSI), an emerging technology for intraoperative tissue assessment. To achieve this, we employ spectral unmixing, which decomposes the spectral signals recorded by the HSI camera into their constituent molecular components. Traditional unmixing approaches are based on physical models that establish a relationship between tissue molecules and the recorded spectra. However, these methods commonly assume a linear relationship between the spectra and molecular content, which does not capture the whole complexity of light-matter interaction. To address this limitation, we introduce a novel unmixing procedure that accounts for non-linear optical effects while preserving the computational benefits of linear spectral unmixing. We validate our methodology on an in-vivo brain tissue HSI dataset and demonstrate that the extracted molecular information leads to superior classification performance.

MCML Authors

[46]
V. Sideri-Lampretsa, D. Rückert and H. Qiu.
Evaluation of Alignment-Regularity Characteristics in Deformable Image Registration.
Preprint (Mar. 2025). arXiv
Abstract

Evaluating deformable image registration (DIR) is challenging due to the inherent trade-off between achieving high alignment accuracy and maintaining deformation regularity. In this work, we introduce a novel evaluation scheme based on the alignment-regularity characteristic (ARC) to systematically capture and analyze this trade-off. We first introduce the ARC curves, which describe the performance of a given registration algorithm as a spectrum measured by alignment and regularity metrics. We further adopt a HyperNetwork-based approach that learns to continuously interpolate across the full regularization range, accelerating the construction and improving the sample density of ARC curves. We empirically demonstrate our evaluation scheme using representative learning-based deformable image registration methods with various network architectures and transformation models on two public datasets. We present a range of findings not evident from existing evaluation practices and provide general recommendations for model evaluation and selection using our evaluation scheme. All relevant code is made publicly available.

MCML Authors

[45]
A. Weers, A. H. Berger, L. Lux, P. Schüffler, D. Rückert and J. C. Paetzold.
From Pixels to Histopathology: A Graph-Based Framework for Interpretable Whole Slide Image Analysis.
Preprint (Mar. 2025). arXiv GitHub
Abstract

The histopathological classification of whole-slide images (WSIs) is a fundamental task in digital pathology, yet it requires extensive time and expertise from specialists. While deep learning methods show promising results, they typically process WSIs by dividing them into artificial patches, which inherently prevents a network from learning from the entire image context, disregards natural tissue structures and compromises interpretability. Our method overcomes this limitation through a novel graph-based framework that constructs WSI graph representations. The WSI-graph efficiently captures essential histopathological information in a compact form. We build tissue representations (nodes) that follow biological boundaries rather than arbitrary patches, all while providing interpretable features for explainability. Through adaptive graph coarsening guided by learned embeddings, we progressively merge regions while maintaining discriminative local features and enabling efficient global information exchange. In our method’s final step, we solve the diagnostic task through a graph attention network. We empirically demonstrate strong performance on multiple challenging tasks such as cancer stage classification and survival prediction, while also identifying predictive factors using Integrated Gradients.

MCML Authors

[44] A Conference
A. H. Berger, L. Lux, S. Shit, I. Ezhov, G. Kaissis, M. Menten, D. Rückert and J. C. Paetzold.
Cross-Domain and Cross-Dimension Learning for Image-to-Graph Transformers.
WACV 2025 - IEEE/CVF Winter Conference on Applications of Computer Vision. Tucson, AZ, USA, Feb 28-Mar 04, 2025. DOI
Abstract

Direct image-to-graph transformation is a challenging task that involves solving object detection and relationship prediction in a single model. Due to this task’s complexity, large training datasets are rare in many domains, making the training of deep-learning methods challenging. This data sparsity necessitates transfer learning strategies akin to the state-of-the-art in general computer vision. In this work, we introduce a set of methods enabling cross-domain and cross-dimension learning for image-to-graph transformers. We propose (1) a regularized edge sampling loss to effectively learn object relations in multiple domains with different numbers of edges, (2) a domain adaptation framework for image-to-graph transformers aligning image- and graph-level features from different domains, and (3) a projection function that allows using 2D data for training 3D transformers. We demonstrate our method’s utility in cross-domain and cross-dimension experiments, where we utilize labeled data from 2D road networks for simultaneous learning in vastly different target domains. Our method consistently outperforms standard transfer learning and self-supervised pretraining on challenging benchmarks, such as retinal or whole-brain vessel graph extraction.

MCML Authors

Georgios Kaissis

Dr.

Associate

* Former Associate

Link to Profile Martin Menten

Martin Menten

Dr.

JRG Leader AI for Vision


[43] Top Journal
C. I. Bercea, B. Wiestler, D. Rückert and J. A. Schnabel.
Evaluating normative representation learning in generative AI for robust anomaly detection in brain imaging.
Nature Communications 16.1624 (Feb. 2025). DOI GitHub
Abstract

Normative representation learning focuses on understanding the typical anatomical distributions from large datasets of medical scans from healthy individuals. Generative Artificial Intelligence (AI) leverages this attribute to synthesize images that accurately reflect these normative patterns. This capability enables AI models to effectively detect and correct anomalies in new, unseen pathological data without the need for expert labeling. Traditional evaluations focus on anomaly detection performance alone, overlooking the crucial role of normative learning. In our analysis, we introduce novel metrics specifically designed to evaluate this facet in AI models. We apply these metrics across various generative AI frameworks, including advanced diffusion models, and rigorously test them against complex and diverse brain pathologies. In addition, we conduct a large multi-reader study to compare these metrics to experts’ evaluations. Our analysis demonstrates that models proficient in normative learning exhibit exceptional versatility, adeptly detecting a wide range of unseen medical conditions.

MCML Authors
Link to Profile Benedikt Wiestler

Benedikt Wiestler

Prof. Dr.

Principal Investigator

Link to Profile Julia Schnabel

Julia Schnabel

Prof. Dr.

Principal Investigator


[42]
Ö. Turgut, F. S. Bott, M. Ploner and D. Rückert.
Are foundation models useful feature extractors for electroencephalography analysis?
Preprint (Feb. 2025). arXiv
Abstract

The success of foundation models in natural language processing and computer vision has motivated similar approaches for general time series analysis. While these models are effective for a variety of tasks, their applicability in medical domains with limited data remains largely unexplored. To address this, we investigate the effectiveness of foundation models in medical time series analysis involving electroencephalography (EEG). Through extensive experiments on tasks such as age prediction, seizure detection, and the classification of clinically relevant EEG events, we compare their diagnostic accuracy with that of specialised EEG models. Our analysis shows that foundation models extract meaningful EEG features, outperform specialised models even without domain adaptation, and localise task-specific biomarkers. Moreover, we demonstrate that diagnostic accuracy is substantially influenced by architectural choices such as context length. Overall, our study reveals that foundation models with general time series understanding eliminate the dependency on large domain-specific datasets, making them valuable tools for clinical practice.

MCML Authors

[41]
F. Drexel, V. Sideri-Lampretsa, H. Bast, A. W. Marka, T. Koehler, F. T. Gassert, D. Pfeiffer, D. Rückert and F. Pfeiffer.
Deformable Image Registration of Dark-Field Chest Radiographs for Local Lung Signal Change Assessment.
Preprint (Jan. 2025). arXiv
Abstract

Dark-field radiography of the human chest has been demonstrated to have promising potential for the analysis of the lung microstructure and the diagnosis of respiratory diseases. However, previous studies of dark-field chest radiographs evaluated the lung signal only in the inspiratory breathing state. Our work aims to add a new perspective to these previous assessments by locally comparing dark-field lung information between different respiratory states. To this end, we discuss suitable image registration methods for dark-field chest radiographs to enable consistent spatial alignment of the lung in distinct breathing states. Utilizing full inspiration and expiration scans from a clinical chronic obstructive pulmonary disease study, we assess the performance of the proposed registration framework and outline applicable evaluation approaches. Our regional characterization of lung dark-field signal changes between the breathing states provides a proof-of-principle that dynamic radiography-based lung function assessment approaches may benefit from considering registered dark-field images in addition to standard plain chest radiographs.

MCML Authors

[40]
Z. Haouari, J. Weidner, I. Ezhov, A. Varma, D. Rückert, B. Menze and B. Wiestler.
Efficient Deep Learning-based Forward Solvers for Brain Tumor Growth Models.
Preprint (Jan. 2025). arXiv
Abstract

Glioblastoma, a highly aggressive brain tumor, poses major challenges due to its poor prognosis and high morbidity rates. Partial differential equation-based models offer promising potential to enhance therapeutic outcomes by simulating patient-specific tumor behavior for improved radiotherapy planning. However, model calibration remains a bottleneck due to the high computational demands of optimization methods like Monte Carlo sampling and evolutionary algorithms. To address this, we recently introduced an approach leveraging a neural forward solver with gradient-based optimization to significantly reduce calibration time. This approach requires a highly accurate and fully differentiable forward model. We investigate multiple architectures, including (i) an enhanced TumorSurrogate, (ii) a modified nnU-Net, and (iii) a 3D Vision Transformer (ViT). The optimized TumorSurrogate achieved the best overall results, excelling in both tumor outline matching and voxel-level prediction of tumor cell concentration. It halved the MSE relative to the baseline model and achieved the highest Dice score across all tumor cell concentration thresholds. Our study demonstrates significant enhancement in forward solver performance and outlines important future research directions.

MCML Authors

[39]
B. Jian, J. Pan, Y. Li, F. Bongratz, R. Li, D. Rückert, B. Wiestler and C. Wachinger.
TimeFlow: Longitudinal Brain Image Registration and Aging Progression Analysis.
Preprint (Jan. 2025). arXiv
Abstract

Predicting future brain states is crucial for understanding healthy aging and neurodegenerative diseases. Longitudinal brain MRI registration, a cornerstone for such analyses, has long been limited by its inability to forecast future developments, reliance on extensive, dense longitudinal data, and the need to balance registration accuracy with temporal smoothness. In this work, we present TimeFlow, a novel framework for longitudinal brain MRI registration that overcomes all these challenges. Leveraging a U-Net architecture with temporal conditioning inspired by diffusion models, TimeFlow enables accurate longitudinal registration and facilitates prospective analyses through future image prediction. Unlike traditional methods that depend on explicit smoothness regularizers and dense sequential data, TimeFlow achieves temporal consistency and continuity without these constraints. Experimental results highlight its superior performance in both future timepoint prediction and registration accuracy compared to state-of-the-art methods. Additionally, TimeFlow supports novel biological brain aging analyses, effectively differentiating neurodegenerative conditions from healthy aging. It eliminates the need for segmentation, thereby avoiding the challenges of non-trivial annotation and inconsistent segmentation errors. TimeFlow paves the way for accurate, data-efficient, and annotation-free prospective analyses of brain aging and chronic diseases.

MCML Authors

2024


[38]
A. Reithmeir, V. Spieker, V. Sideri-Lampretsa, D. Rückert, J. A. Schnabel and V. A. Zimmer.
From Model Based to Learned Regularization in Medical Image Registration: A Comprehensive Review.
Preprint (Dec. 2024). arXiv
Abstract

Image registration is fundamental in medical imaging applications, such as disease progression analysis or radiation therapy planning. The primary objective of image registration is to precisely capture the deformation between two or more images, typically achieved by minimizing an optimization problem. Due to its inherent ill-posedness, regularization is a key component in driving the solution toward anatomically meaningful deformations. A wide range of regularization methods has been proposed for both conventional and deep learning-based registration. However, the appropriate application of regularization techniques often depends on the specific registration problem, and no one-size-fits-all method exists. Despite its importance, regularization is often overlooked or addressed with default approaches, assuming existing methods are sufficient. A comprehensive and structured review remains missing. This review addresses this gap by introducing a novel taxonomy that systematically categorizes the diverse range of proposed regularization methods. It highlights the emerging field of learned regularization, which leverages data-driven techniques to automatically derive deformation properties from the data. Moreover, this review examines the transfer of regularization methods from conventional to learning-based registration, identifies open challenges, and outlines future research directions. By emphasizing the critical role of regularization in image registration, we hope to inspire the research community to reconsider regularization strategies in modern registration algorithms and to explore this rapidly evolving field further.

MCML Authors

[37]
J. Weidner, M. Balcerak, I. Ezhov, A. Datchev, L. Lux, L. Zimmer, D. Rückert, B. Menze and B. Wiestler.
Spatial Brain Tumor Concentration Estimation for Individualized Radiotherapy Planning.
Preprint (Dec. 2024). arXiv
Abstract

Biophysical modeling of brain tumors has emerged as a promising strategy for personalizing radiotherapy planning by estimating the otherwise hidden distribution of tumor cells within the brain. However, many existing state-of-the-art methods are computationally intensive, limiting their widespread translation into clinical practice. In this work, we propose an efficient and direct method that utilizes soft physical constraints to estimate the tumor cell concentration from preoperative MRI of brain tumor patients. Our approach optimizes a 3D tumor concentration field by simultaneously minimizing the difference between the observed MRI and a physically informed loss function. Compared to existing state-of-the-art techniques, our method significantly improves predicting tumor recurrence on two public datasets with a total of 192 patients while maintaining a clinically viable runtime of under one minute - a substantial reduction from the 30 minutes required by the current best approach. Furthermore, we showcase the generalizability of our framework by incorporating additional imaging information and physical constraints, highlighting its potential to translate to various medical diffusion phenomena with imperfect data.

MCML Authors

[36]
M. Szép, D. Rückert, R. Eisenhart-Rothe and F. Hinterwimmer.
A Practical Guide to Fine-tuning Language Models with Limited Data.
Preprint (Nov. 2024). arXiv
Abstract

Employing pre-trained Large Language Models (LLMs) has become the de facto standard in Natural Language Processing (NLP) despite their extensive data requirements. Motivated by the recent surge in research focused on training LLMs with limited data, particularly in low-resource domains and languages, this paper surveys recent transfer learning approaches to optimize model performance in downstream tasks where data is scarce. We first address initial and continued pre-training strategies to better leverage prior knowledge in unseen domains and languages. We then examine how to maximize the utility of limited data during fine-tuning and few-shot learning. The final section takes a task-specific perspective, reviewing models and methods suited for different levels of data scarcity. Our goal is to provide practitioners with practical guidelines for overcoming the challenges posed by constrained data while also highlighting promising directions for future research.

MCML Authors

[35] A Conference
A. Riess, A. Ziller, S. Kolek, D. Rückert, J. A. Schnabel and G. Kaissis.
Complex-Valued Federated Learning with Differential Privacy and MRI Applications.
DeCaF @MICCAI 2024 - 5th Workshop on Distributed, Collaborative and Federated Learning at the 27th International Conference on Medical Image Computing and Computer Assisted Intervention. Marrakesh, Morocco, Oct 06-10, 2024. DOI
Abstract

Federated learning enhanced with Differential Privacy (DP) is a powerful privacy-preserving strategy to protect individuals sharing their sensitive data for processing in fields such as medicine and healthcare. Many medical applications, for example magnetic resonance imaging (MRI), rely on complex-valued signal processing techniques for data acquisition and analysis. However, the appropriate application of DP to complex-valued data is still underexplored. To address this issue, from the theoretical side, we introduce the complex-valued Gaussian mechanism, whose behaviour we characterise in terms of f-DP, (ε, δ)-DP and Rényi-DP. Moreover, we generalise the fundamental DP stochastic gradient descent (DP-SGD) algorithm to complex-valued neural networks and present novel complex-valued neural network primitives compatible with DP. Experimentally, we showcase a proof-of-concept by training federated complex-valued neural networks with DP on a real-world task (MRI pulse sequence classification in k-space), yielding excellent utility and privacy. Our results highlight the relevance of combining federated learning with robust privacy-preserving techniques in the MRI context.
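The core noising step might be sketched as follows, assuming a circularly symmetric complex Gaussian (independent real and imaginary components); the clipping-plus-noise structure mirrors standard DP-SGD, while the exact calibration and accounting are in the paper:

```python
import numpy as np

def clip_complex(grad, clip_norm):
    # Per-sample L2 clipping of a complex gradient vector; np.linalg.norm
    # uses the complex modulus, so real and imaginary parts are bounded
    # jointly rather than separately.
    norm = np.linalg.norm(grad)
    return grad * min(1.0, clip_norm / max(norm, 1e-12))

def complex_gaussian_mechanism(grad, clip_norm, noise_multiplier, rng):
    # Add circularly symmetric complex Gaussian noise: independent
    # N(0, sigma^2 / 2) on the real and imaginary parts, so the total
    # noise variance per entry is sigma^2.
    g = clip_complex(grad, clip_norm)
    sigma = noise_multiplier * clip_norm
    noise = (rng.normal(0.0, sigma / np.sqrt(2), g.shape)
             + 1j * rng.normal(0.0, sigma / np.sqrt(2), g.shape))
    return g + noise
```

In a DP-SGD-style loop, this function would replace the real-valued clip-and-noise step, with the privacy budget tracked by the accountant of choice.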

MCML Authors

Julia Schnabel

Prof. Dr.

Principal Investigator

Georgios Kaissis

Dr.

Associate

* Former Associate


[34] A Conference
A. Banaszak, A. H. Berger, L. Lux, S. Shit, D. Rückert and J. C. Paetzold.
Supervised Contrastive Learning for Image-to-Graph Transformers.
GRAIL @MICCAI 2024 - 6th Workshop on GRaphs in biomedicAl Image anaLysis at the 27th International Conference on Medical Image Computing and Computer Assisted Intervention. Marrakesh, Morocco, Oct 06-10, 2024. DOI
Abstract

Image-to-graph transformers can effectively encode image information in graphs but are typically difficult to train and require large annotated datasets. Contrastive learning can increase data efficiency by enhancing feature representations, but existing methods are not applicable to graph labels because they operate on categorical label spaces. In this work, we propose a method enabling supervised contrastive learning for image-to-graph transformers. We introduce two supervised contrastive loss formulations based on graph similarity between label pairs that we approximate using a graph neural network. Our approach avoids tailored data augmentation techniques and can be easily integrated into existing training pipelines. We perform multiple empirical studies showcasing performance improvements across various metrics.
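A toy version of such a loss, where the GNN-approximated graph similarity between label pairs is replaced by a precomputed similarity matrix in [0, 1], might look like this (the names and the InfoNCE-style form are illustrative, not the paper's exact formulation):

```python
import numpy as np

def similarity_weighted_contrastive_loss(embeddings, sim, temperature=0.1):
    # Supervised contrastive loss where each pair's positive weight is its
    # label-graph similarity (here given; in the paper approximated by a GNN).
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    n = len(z)
    off_diag = ~np.eye(n, dtype=bool)            # exclude self-pairs
    logits = z @ z.T / temperature
    denom = (np.exp(logits) * off_diag).sum(axis=1, keepdims=True)
    log_prob = logits - np.log(denom)            # log-softmax over pairs
    w = sim * off_diag                           # similarity-weighted positives
    per_anchor = (w * log_prob).sum(axis=1) / np.maximum(w.sum(axis=1), 1e-12)
    return -per_anchor.mean()
```

Embeddings of samples with similar label graphs are pulled together in proportion to that similarity, with no tailored data augmentation required.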

MCML Authors

[33] A Conference
L. Lux, A. H. Berger, M. Romeo-Tricas, M. Menten, D. Rückert and J. C. Paetzold.
Exploring Graphs as Data Representation for Disease Classification in Ophthalmology.
GRAIL @MICCAI 2024 - 6th Workshop on GRaphs in biomedicAl Image anaLysis at the 27th International Conference on Medical Image Computing and Computer Assisted Intervention. Marrakesh, Morocco, Oct 06-10, 2024. DOI URL
Abstract

Interpretability, particularly in terms of human understandable concepts, is essential for building trust in machine learning models for disease classification. However, state-of-the-art image classifiers exhibit limited interpretability, posing a significant barrier to their acceptance in clinical practice. To address this, our work introduces two graph representations of the retinal vasculature, aiming to bridge the gap between high-performance classifiers and human-understandable interpretability concepts in ophthalmology. We use these graphs with the aim of training graph neural networks (GNNs) for disease staging. First, we formally and experimentally show that GNNs can learn known clinical biomarkers. In that, we show that GNNs can learn human interpretable concepts. Next, we train GNNs for disease staging and study how different aggregation strategies lead the GNN to learn more and less human interpretable features. Finally, we propose a visualization for integrated gradients on graphs, which allows us to identify if GNN models have learned human-understandable representations of the data.

MCML Authors

[32] A Conference
A. H. Berger, L. Lux, N. Stucki, V. Bürgin, S. Shit, A. Banaszak, D. Rückert, U. Bauer and J. C. Paetzold.
Topologically faithful multi-class segmentation in medical images.
MICCAI 2024 - 27th International Conference on Medical Image Computing and Computer Assisted Intervention. Marrakesh, Morocco, Oct 06-10, 2024. DOI
Abstract

Topological accuracy in medical image segmentation is a highly important property for downstream applications such as network analysis and flow modeling in vessels or cell counting. Recently, significant methodological advancements have brought well-founded concepts from algebraic topology to binary segmentation. However, these approaches have been underexplored in multi-class segmentation scenarios, where topological errors are common. We propose a general loss function for topologically faithful multi-class segmentation extending the recent Betti matching concept, which is based on induced matchings of persistence barcodes. We project the N-class segmentation problem to N single-class segmentation tasks, which allows us to use 1-parameter persistent homology, making training of neural networks computationally feasible. We validate our method on a comprehensive set of four medical datasets with highly variant topological characteristics. Our loss formulation significantly enhances topological correctness in cardiac, cell, artery-vein, and Circle of Willis segmentation.
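The projection step described above can be sketched directly; the Betti-matching loss itself is non-trivial, so it appears below only as a placeholder callable:

```python
import numpy as np

def one_vs_rest_projection(label_map, num_classes):
    # Project an N-class segmentation into N binary masks, reducing the
    # multi-class problem to N single-class tasks so that 1-parameter
    # persistent homology can be applied per class.
    return np.stack([(label_map == c).astype(float) for c in range(num_classes)])

def multiclass_topo_loss(pred_probs, target, single_class_loss):
    # pred_probs: (N, H, W) per-class scores; target: (H, W) integer labels.
    # `single_class_loss` stands in for a Betti-matching-style binary
    # topological loss (not implemented here).
    n = pred_probs.shape[0]
    masks = one_vs_rest_projection(target, n)
    return sum(single_class_loss(pred_probs[c], masks[c]) for c in range(n)) / n
```

Any differentiable binary topological loss can be dropped in for `single_class_loss`, which is what makes the reduction computationally feasible for training.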

MCML Authors

[31] A Conference
J. Li, S. H. Kim, P. Müller, L. Felsner, D. Rückert, B. Wiestler, J. A. Schnabel and C. I. Bercea.
Language Models Meet Anomaly Detection for Better Interpretability and Generalizability.
MMMI @MICCAI 2024 - 5th International Workshop on Multiscale Multimodal Medical Imaging at the 27th International Conference on Medical Image Computing and Computer Assisted Intervention. Marrakesh, Morocco, Oct 06-10, 2024. DOI GitHub
Abstract

This research explores the integration of language models and unsupervised anomaly detection in medical imaging, addressing two key questions: (1) Can language models enhance the interpretability of anomaly detection maps? and (2) Can anomaly maps improve the generalizability of language models in open-set anomaly detection tasks? To investigate these questions, we introduce a new dataset for multi-image visual question-answering on brain magnetic resonance images encompassing multiple conditions. We propose KQ-Former (Knowledge Querying Transformer), which is designed to optimally align visual and textual information in limited-sample contexts. Our model achieves a 60.81% accuracy on closed questions, covering disease classification and severity across 15 different classes. For open questions, KQ-Former demonstrates a 70% improvement over the baseline with a BLEU-4 score of 0.41, and achieves the highest entailment ratios (up to 71.9%) and lowest contradiction ratios (down to 10.0%) among various natural language inference models. Furthermore, integrating anomaly maps results in an 18% accuracy increase in detecting open-set anomalies, thereby enhancing the language model’s generalizability to previously unseen medical conditions.

MCML Authors

[30] A Conference
B. Jian, J. Pan, M. Ghahremani, D. Rückert, C. Wachinger and B. Wiestler.
Mamba? Catch The Hype Or Rethink What Really Helps for Image Registration.
WBIR @MICCAI 2024 - 11th International Workshop on Biomedical Image Registration at the 27th International Conference on Medical Image Computing and Computer Assisted Intervention. Marrakesh, Morocco, Oct 06-10, 2024. DOI
Abstract

VoxelMorph, proposed in 2018, utilizes Convolutional Neural Networks (CNNs) to address medical image registration problems. In 2021 TransMorph advanced this approach by replacing CNNs with Attention mechanisms, claiming enhanced performance. More recently, the rise of Mamba with selective state space models has led to MambaMorph, which substituted Attention with Mamba blocks, asserting superior registration. These developments prompt a critical question: does chasing the latest computational trends with “more advanced” computational blocks genuinely enhance registration accuracy, or is it merely hype? Furthermore, the role of classic high-level registration-specific designs, such as coarse-to-fine pyramid mechanism, correlation calculation, and iterative optimization, warrants scrutiny, particularly in differentiating their influence from the aforementioned low-level computational blocks. In this study, we critically examine these questions through a rigorous evaluation in brain MRI registration. We employed modularized components for each block and ensured unbiased comparisons across all methods and designs to disentangle their effects on performance. Our findings indicate that adopting “advanced” computational elements fails to significantly improve registration accuracy. Instead, well-established registration-specific designs offer fair improvements, enhancing results by a marginal 1.5% over the baseline. Our findings emphasize the importance of rigorous, unbiased evaluation and contribution disentanglement of all low- and high-level registration components, rather than simply following the computer vision trends with “more advanced” computational blocks. We advocate for simpler yet effective solutions and novel evaluation metrics that go beyond conventional registration accuracy, warranting further research across various organs and modalities.

MCML Authors

[29] A Conference
F. Kögl, A. Reithmeir, V. Sideri-Lampretsa, I. Machado, R. Braren, D. Rückert, J. A. Schnabel and V. A. Zimmer.
General Vision Encoder Features as Guidance in Medical Image Registration.
WBIR @MICCAI 2024 - 11th International Workshop on Biomedical Image Registration at the 27th International Conference on Medical Image Computing and Computer Assisted Intervention. Marrakesh, Morocco, Oct 06-10, 2024. DOI URL
Abstract

General vision encoders like DINOv2 and SAM have recently transformed computer vision. Even though they are trained on natural images, such encoder models have excelled in medical imaging, e.g., in classification, segmentation, and registration. However, no in-depth comparison of different state-of-the-art general vision encoders for medical registration is available. In this work, we investigate how well general vision encoder features can be used in the dissimilarity metrics for medical image registration. We explore two encoders that were trained on natural images as well as one that was fine-tuned on medical data. We apply the features within the well-established B-spline FFD registration framework. In extensive experiments on cardiac cine MRI data, we find that using features as additional guidance for conventional metrics improves the registration quality.
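The guidance idea can be sketched as a combined dissimilarity metric; `encode` below is a stand-in for frozen features from a general vision encoder such as DINOv2 or SAM, and the blending weight is illustrative:

```python
import numpy as np

def feature_guided_dissimilarity(fixed, warped_moving, encode, alpha=0.5):
    # Conventional intensity SSD plus an SSD on encoder features, used as
    # additional guidance inside a registration framework such as
    # B-spline FFD. `encode` may be any function mapping an image to a
    # feature array (here a placeholder for a pretrained vision encoder).
    intensity = np.mean((fixed - warped_moving) ** 2)
    feat = np.mean((encode(fixed) - encode(warped_moving)) ** 2)
    return (1 - alpha) * intensity + alpha * feat
```

The optimizer of the registration framework then minimizes this blended metric instead of the intensity term alone; α = 0 recovers the conventional metric.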

MCML Authors

[28] A* Conference
P. Müller, G. Kaissis and D. Rückert.
ChEX: Interactive Localization and Region Description in Chest X-rays.
ECCV 2024 - 18th European Conference on Computer Vision. Milano, Italy, Sep 29-Oct 04, 2024. DOI GitHub
Abstract

Report generation models offer fine-grained textual interpretations of medical images like chest X-rays, yet they often lack interactivity (i.e. the ability to steer the generation process through user queries) and localized interpretability (i.e. visually grounding their predictions), which we deem essential for future adoption in clinical practice. While there have been efforts to tackle these issues, they are either limited in their interactivity by not supporting textual queries or fail to also offer localized interpretability. Therefore, we propose a novel multitask architecture and training paradigm integrating textual prompts and bounding boxes for diverse aspects like anatomical regions and pathologies. We call this approach the Chest X-Ray Explainer (ChEX). Evaluations across a heterogeneous set of 9 chest X-ray tasks, including localized image interpretation and report generation, showcase its competitiveness with SOTA models while additional analysis demonstrates ChEX’s interactive capabilities.

MCML Authors



[27] Top Journal
A. C. Erdur, D. Rusche, D. Scholz, J. Kiechle, S. Fischer, Ó. Llorián-Salvador, J. A. Buchner, M. Q. Nguyen, L. Etzel, J. Weidner, M.-C. Metz, B. Wiestler, J. A. Schnabel, D. Rückert, S. E. Combs and J. C. Peeken.
Deep learning for autosegmentation for radiotherapy treatment planning: State-of-the-art and novel perspectives.
Strahlentherapie und Onkologie 201 (Aug. 2024). DOI GitHub
Abstract

The rapid development of artificial intelligence (AI) has gained importance, with many tools already entering our daily lives. The medical field of radiation oncology is also subject to this development, with AI entering all steps of the patient journey. In this review article, we summarize contemporary AI techniques and explore the clinical applications of AI-based automated segmentation models in radiotherapy planning, focusing on delineation of organs at risk (OARs), the gross tumor volume (GTV), and the clinical target volume (CTV). Emphasizing the need for precise and individualized plans, we review various commercial and freeware segmentation tools and also state-of-the-art approaches. Through our own findings and based on the literature, we demonstrate improved efficiency and consistency as well as time savings in different clinical scenarios. Despite challenges in clinical implementation such as domain shifts, the potential benefits for personalized treatment planning are substantial. The integration of mathematical tumor growth models and AI-based tumor detection further enhances the possibilities for refining target volumes. As advancements continue, the prospect of one-stop-shop segmentation and radiotherapy planning represents an exciting frontier in radiotherapy, potentially enabling fast treatment with enhanced precision and individualization.

MCML Authors

[26] A* Conference
G. Kaissis, S. Kolek, B. Balle, J. Hayes and D. Rückert.
Beyond the Calibration Point: Mechanism Comparison in Differential Privacy.
ICML 2024 - 41st International Conference on Machine Learning. Vienna, Austria, Jul 21-27, 2024. URL
Abstract

In differentially private (DP) machine learning, the privacy guarantees of DP mechanisms are often reported and compared on the basis of a single (ε, δ)-pair. This practice overlooks that DP guarantees can vary substantially even between mechanisms sharing a given (ε, δ), and potentially introduces privacy vulnerabilities which can remain undetected. This motivates the need for robust, rigorous methods for comparing DP guarantees in such cases. Here, we introduce the ∆-divergence between mechanisms, which quantifies the worst-case excess privacy vulnerability of choosing one mechanism over another in terms of (ε, δ), f-DP and in terms of a newly presented Bayesian interpretation. Moreover, as a generalisation of the Blackwell theorem, it is endowed with strong decision-theoretic foundations. Through application examples, we show that our techniques can facilitate informed decision-making and reveal gaps in the current understanding of privacy risks, as current practices in DP-SGD often result in choosing mechanisms with high excess privacy vulnerabilities.
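As generic background for why a single (ε, δ)-pair underdetermines a guarantee: a Gaussian mechanism with sensitivity 1 and noise standard deviation σ = 1/μ satisfies a whole curve of (ε, δ(ε)) pairs, which can be evaluated with the standard conversion (Φ is the standard normal CDF). This sketch is textbook DP material, not the paper's ∆-divergence:

```python
from math import erf, exp, sqrt

def Phi(x):
    # Standard normal cumulative distribution function.
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def gaussian_delta(eps, mu):
    # delta(eps) for a mu-GDP Gaussian mechanism (sensitivity 1,
    # sigma = 1/mu):
    #   delta = Phi(-eps/mu + mu/2) - e^eps * Phi(-eps/mu - mu/2)
    return Phi(-eps / mu + mu / 2) - exp(eps) * Phi(-eps / mu - mu / 2)
```

Reporting the mechanism only at, say, ε = 1 discards the rest of the δ(ε) curve; two mechanisms from different families can coincide at one point on these curves yet differ substantially elsewhere, which is the gap the ∆-divergence is designed to quantify.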

MCML Authors



[25]
M. Keicher, K. Zaripova, T. Czempiel, K. Mach, A. Khakzar and N. Navab.
FlexR: Few-shot Classification with Language Embeddings for Structured Reporting of Chest X-rays.
MIDL 2024 - Medical Imaging with Deep Learning. Paris, France, Jul 03-05, 2024. URL
Abstract

The automation of chest X-ray reporting has garnered significant interest due to the time-consuming nature of the task. However, the clinical accuracy of free-text reports has proven challenging to quantify using natural language processing metrics, given the complexity of medical information, the variety of writing styles, and the potential for typos and inconsistencies. Structured reporting and standardized reports, on the other hand, can provide consistency and formalize the evaluation of clinical correctness. However, high-quality annotations for structured reporting are scarce. Therefore, we propose a method to predict clinical findings defined by sentences in structured reporting templates, which can be used to fill such templates. The approach involves training a contrastive language-image model using chest X-rays and related free-text radiological reports, then creating textual prompts for each structured finding and optimizing a classifier to predict clinical findings in the medical image. Results show that even with limited image-level annotations for training, the method can accomplish the structured reporting tasks of severity assessment of cardiomegaly and localizing pathologies in chest X-rays.

MCML Authors

[24] Top Journal
R. Wicklein, L. Kreitner, A. Wild, L. Aly, D. Rückert, B. Hemmer, T. Korn, M. Menten and B. Knier.
Retinal small vessel pathology is associated with disease burden in multiple sclerosis.
Multiple Sclerosis Journal 30.7 (Jun. 2024). DOI
Abstract

Background: Alterations of the superficial retinal vasculature are commonly observed in multiple sclerosis (MS) and can be visualized through optical coherence tomography angiography (OCTA).
Objectives: This study aimed to examine changes in the retinal vasculature during MS and to integrate findings into current concepts of the underlying pathology.
Methods: In this cross-sectional study, including 259 relapsing–remitting MS patients and 78 healthy controls, we analyzed OCTA images using deep-learning-based segmentation algorithms.
Results: We identified a loss of small-sized vessels (diameter < 10 µm) in the superficial vascular complex in all MS eyes, irrespective of their optic neuritis (ON) history. This alteration was associated with MS disease burden and appears independent of retinal ganglion cell loss. In contrast, an observed reduction of medium-sized vessels (diameter 10–20 µm) was specific to eyes with a history of ON and was closely linked to ganglion cell atrophy.
Conclusion: These findings suggest distinct atrophy patterns in retinal vessels in patients with MS. Further studies are necessary to investigate retinal vessel alterations and their underlying pathology in MS.

MCML Authors

[23]
A. Ziller, T. T. Mueller, S. Stieger, L. F. Feiner, J. Brandt, R. Braren, D. Rückert and G. Kaissis.
Reconciling privacy and accuracy in AI for medical imaging.
Nature Machine Intelligence 6 (Jun. 2024). DOI
Abstract

Artificial intelligence (AI) models are vulnerable to information leakage of their training data, which can be highly sensitive, for example, in medical imaging. Privacy-enhancing technologies, such as differential privacy (DP), aim to circumvent these susceptibilities. DP is the strongest possible protection for training models while bounding the risks of inferring the inclusion of training samples or reconstructing the original data. DP achieves this by setting a quantifiable privacy budget. Although a lower budget decreases the risk of information leakage, it typically also reduces the performance of such models. This imposes a trade-off between robust performance and stringent privacy. Additionally, the interpretation of a privacy budget remains abstract and challenging to contextualize. Here we contrast the performance of artificial intelligence models at various privacy budgets against both theoretical risk bounds and empirical success of reconstruction attacks. We show that using very large privacy budgets can render reconstruction attacks impossible, while drops in performance are negligible. We thus conclude that not using DP at all is negligent when applying artificial intelligence models to sensitive data. We deem our results to lay a foundation for further debates on striking a balance between privacy risks and model performance.

MCML Authors



[22]
N. Stolt-Ansó, V. Sideri-Lampretsa, M. Dannecker and D. Rückert.
Intensity-based 3D motion correction for cardiac MR images.
ISBI 2024 - IEEE 21st International Symposium on Biomedical Imaging. Athens, Greece, May 27-30, 2024. DOI
Abstract

Cardiac magnetic resonance (CMR) image acquisition requires subjects to hold their breath while 2D cine images are acquired. This process assumes that the heart remains in the same position across all slices. However, differences in breathhold positions or patient motion introduce 3D slice misalignments. In this work, we propose an algorithm that simultaneously aligns all SA and LA slices by maximizing the pair-wise intensity agreement between their intersections. Unlike previous works, our approach is formulated as a subject-specific optimization problem and requires no prior knowledge of the underlying anatomy. We quantitatively demonstrate that the proposed method is robust against a large range of rotations and translations by synthetically misaligning 10 motion-free datasets and aligning them back using the proposed method.
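A deliberately tiny 1D caricature of the intersection-agreement idea (the actual method jointly optimizes rigid transforms of all SA and LA slices in 3D): choose the shift of one intensity profile that best matches the other slice's intensity at their intersection point. All names and the grid search are illustrative:

```python
def best_shift(sa_profile, la_value, index, max_shift=3):
    # Try integer in-plane shifts of a short-axis (SA) intensity profile
    # and keep the one whose value at the fixed intersection index best
    # agrees with the long-axis (LA) intensity there.
    best, best_err = 0, float("inf")
    for s in range(-max_shift, max_shift + 1):
        j = index + s
        if 0 <= j < len(sa_profile):
            err = (sa_profile[j] - la_value) ** 2
            if err < best_err:
                best, best_err = s, err
    return best
```

The real algorithm sums such agreement terms over all SA/LA slice intersections and optimizes continuous 3D rigid parameters, but the objective has this same "intensity match at intersections" shape and needs no anatomical prior.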

MCML Authors

[21]
Y. Zhang, N. Stolt-Ansó, J. Pan, W. Huang, K. Hammernik and D. Rückert.
Direct Cardiac Segmentation from Undersampled K-Space using Transformers.
ISBI 2024 - IEEE 21st International Symposium on Biomedical Imaging. Athens, Greece, May 27-30, 2024. DOI
Abstract

The prevailing deep learning-based methods of predicting cardiac segmentation involve reconstructed magnetic resonance (MR) images. The heavy dependency of segmentation approaches on image quality significantly limits the acceleration rate in fast MR reconstruction. Moreover, the practice of treating reconstruction and segmentation as separate sequential processes leads to artifact generation and information loss in the intermediate stage. These issues pose a great risk to achieving high-quality outcomes. To leverage the redundant k-space information overlooked in this dual-step pipeline, we introduce a novel approach to directly deriving segmentations from sparse k-space samples using a transformer (DiSK). DiSK operates by globally extracting latent features from 2D+time k-space data with attention blocks and subsequently predicting the segmentation label of query points. We evaluate our model under various acceleration factors (ranging from 4 to 64) and compare against two image-based segmentation baselines. Our model consistently outperforms the baselines in Dice and Hausdorff distances across foreground classes for all presented sampling rates.

MCML Authors

[20]
Y. Zhang, N. Stolt-Ansó, J. Pan, W. Huang, K. Hammernik and D. Rückert.
Reconstruction-free segmentation from undersampled k-space using transformers.
ISMRM 2024 - International Society for Magnetic Resonance in Medicine Annual Meeting. Singapore, May 04-09, 2024. URL
Abstract

Motivation: High acceleration factors place a limit on MRI image reconstruction. This limit is extended to segmentation models when treating these as subsequent independent processes.
Goal(s): Our goal is to produce segmentations directly from sparse k-space measurements without the need for intermediate image reconstruction.
Approach: We employ a transformer architecture to encode global k-space information into latent features. The produced latent vectors condition queried coordinates during decoding to generate segmentation class probabilities.
Results: The model is able to produce better segmentations across high acceleration factors than image-based segmentation baselines.
Impact: Cardiac segmentation directly from undersampled k-space samples circumvents the need for an intermediate image reconstruction step. This allows the potential to assess myocardial structure and function on higher acceleration factors than methods that rely on images as input.

MCML Authors

[19]
S. T. Arasteh, A. Ziller, C. Kuhl, M. Makowski, S. Nebelung, R. Braren, D. Rückert, D. Truhn and G. Kaissis.
Preserving fairness and diagnostic accuracy in private large-scale AI models for medical imaging.
Communications Medicine 4.46 (Mar. 2024). DOI
Abstract

Background: Artificial intelligence (AI) models are increasingly used in the medical domain. However, as medical data is highly sensitive, special precautions to ensure its protection are required. The gold standard for privacy preservation is the introduction of differential privacy (DP) to model training. Prior work indicates that DP has negative implications on model accuracy and fairness, which are unacceptable in medicine and represent a main barrier to the widespread use of privacy-preserving techniques. In this work, we evaluated the effect of privacy-preserving training of AI models regarding accuracy and fairness compared to non-private training.
Methods: We used two datasets: (1) A large dataset (N=193,311) of high quality clinical chest radiographs, and (2) a dataset (N=1625) of 3D abdominal computed tomography (CT) images, with the task of classifying the presence of pancreatic ductal adenocarcinoma (PDAC). Both were retrospectively collected and manually labeled by experienced radiologists. We then compared non-private deep convolutional neural networks (CNNs) and privacy-preserving (DP) models with respect to privacy-utility trade-offs measured as area under the receiver operating characteristic curve (AUROC), and privacy-fairness trade-offs, measured as Pearson’s r or Statistical Parity Difference.
Results: We find that, while the privacy-preserving training yields lower accuracy, it largely does not amplify discrimination against age, sex or co-morbidity. However, we find an indication that difficult diagnoses and subgroups suffer stronger performance hits in private training.
Conclusions: Our study shows that – under the challenging realistic circumstances of a real-life clinical dataset – the privacy-preserving training of diagnostic deep learning models is possible with excellent diagnostic accuracy and fairness.

MCML Authors



[18]
L. Kreitner, J. C. Paetzold, N. Rauch, C. Chen, A. M. Hagag, A. E. Fayed, S. Sivaprasad, S. Rausch, J. Weichsel, B. H. Menze, M. Harders, B. Knier, D. Rückert and M. Menten.
Synthetic Optical Coherence Tomography Angiographs for Detailed Retinal Vessel Segmentation Without Human Annotations.
IEEE Transactions on Medical Imaging 43.6 (Jan. 2024). DOI
Abstract

Optical coherence tomography angiography (OCTA) is a non-invasive imaging modality that can acquire high-resolution volumes of the retinal vasculature and aid the diagnosis of ocular, neurological and cardiac diseases. Segmenting the visible blood vessels is a common first step when extracting quantitative biomarkers from these images. Classical segmentation algorithms based on thresholding are strongly affected by image artifacts and limited signal-to-noise ratio. The use of modern, deep learning-based segmentation methods has been inhibited by a lack of large datasets with detailed annotations of the blood vessels. To address this issue, recent work has employed transfer learning, where a segmentation network is trained on synthetic OCTA images and is then applied to real data. However, the previously proposed simulations fail to faithfully model the retinal vasculature and do not provide effective domain adaptation. Because of this, current methods are unable to fully segment the retinal vasculature, in particular the smallest capillaries. In this work, we present a lightweight simulation of the retinal vascular network based on space colonization for faster and more realistic OCTA synthesis. We then introduce three contrast adaptation pipelines to decrease the domain gap between real and artificial images. We demonstrate the superior segmentation performance of our approach in extensive quantitative and qualitative experiments on three public datasets that compare our method to traditional computer vision algorithms and supervised training using human annotations. Finally, we make our entire pipeline publicly available, including the source code, pretrained models, and a large dataset of synthetic OCTA images.

MCML Authors

2023


[17] A* Conference
G. Kaissis, A. Ziller, S. Kolek, A. Riess and D. Rückert.
Optimal privacy guarantees for a relaxed threat model: Addressing sub-optimal adversaries in differentially private machine learning.
NeurIPS 2023 - 37th Conference on Neural Information Processing Systems. New Orleans, LA, USA, Dec 10-16, 2023. URL
Abstract

Differentially private mechanisms restrict the membership inference capabilities of powerful (optimal) adversaries against machine learning models. Such adversaries are rarely encountered in practice. In this work, we examine a more realistic threat model relaxation, where (sub-optimal) adversaries lack access to the exact model training database, but may possess related or partial data. We then formally characterise and experimentally validate adversarial membership inference capabilities in this setting in terms of hypothesis testing errors. Our work helps users to interpret the privacy properties of sensitive data processing systems under realistic threat model relaxations and choose appropriate noise levels for their use-case.

MCML Authors



[16] Top Journal
R. Raab, A. Küderle, A. Zakreuskaya, A. D. Stern, J. Klucken, G. Kaissis, D. Rückert, S. Boll, R. Eils, H. Wagener and B. M. Eskofier.
Federated electronic health records for the European Health Data Space.
The Lancet Digital Health 5.11 (Nov. 2023). DOI
Abstract

The European Commission’s draft for the European Health Data Space (EHDS) aims to empower citizens to access their personal health data and share it with physicians and other health-care providers. It further defines procedures for the secondary use of electronic health data for research and development. Although this planned legislation is undoubtedly a step in the right direction, implementation approaches could potentially result in centralised data silos that pose data privacy and security risks for individuals. To address this concern, we propose federated personal health data spaces, a novel architecture for storing, managing, and sharing personal electronic health records that puts citizens at the centre—both conceptually and technologically. The proposed architecture puts citizens in control by storing personal health data on a combination of personal devices rather than in centralised data silos. We describe how this federated architecture fits within the EHDS and can enable the same features as centralised systems while protecting the privacy of citizens. We further argue that increased privacy and control do not contradict the use of electronic health data for research and development. Instead, data sovereignty and transparency encourage active participation in studies and data sharing. This combination of privacy-by-design and transparent, privacy-preserving data sharing can enable health-care leaders to break the privacy-exploitation barrier, which currently limits the secondary use of health data in many cases.

MCML Authors

Georgios Kaissis

Dr.

Associate

* Former Associate


[15] A Conference
D. Scholz, B. Wiestler, D. Rückert and M. Menten.
Metrics to Quantify Global Consistency in Synthetic Medical Images.
DGM4 @MICCAI 2023 - 3rd International Workshop on Deep Generative Models at the 26th International Conference on Medical Image Computing and Computer Assisted Intervention. Vancouver, Canada, Oct 08-12, 2023. DOI
Abstract

Image synthesis is increasingly being adopted in medical image processing, for example for data augmentation or inter-modality image translation. In these critical applications, the generated images must fulfill a high standard of biological correctness. A particular requirement for these images is global consistency, i.e. an image being overall coherent and structured so that all parts of the image fit together in a realistic and meaningful way. Yet, established image quality metrics do not explicitly quantify this property of synthetic images. In this work, we introduce two metrics that can measure the global consistency of synthetic images on a per-image basis. To measure the global consistency, we presume that a realistic image exhibits consistent properties, e.g., a person’s body fat in a whole-body MRI, throughout the depicted object or scene. Hence, we quantify global consistency by predicting and comparing explicit attributes of images on patches using neural networks trained with supervision. Next, we adapt this strategy to an unlabeled setting by measuring the similarity of implicit image features predicted by a network trained with self-supervision. Our results demonstrate that predicting explicit attributes of synthetic images on patches can distinguish globally consistent from inconsistent images. Implicit representations of images are less sensitive for assessing global consistency but are still serviceable when labeled data is unavailable. Compared to established metrics, such as the FID, our method can explicitly measure global consistency on a per-image basis, enabling a dedicated analysis of the biological plausibility of single synthetic images.
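The patch-attribute idea can be sketched in a few lines: predict a scalar attribute on non-overlapping patches and use its dispersion as an inconsistency score. Here `predict_attr` is a hypothetical stand-in for the trained patch-level regressor described in the abstract:

```python
import numpy as np

def global_inconsistency(image: np.ndarray, predict_attr, patch: int = 8) -> float:
    """Illustrative per-image score: predict a scalar attribute (e.g. apparent
    body fat) on non-overlapping patches and return the standard deviation of
    the predictions. Consistent images score near 0; globally inconsistent
    images score high. `predict_attr` stands in for a supervised regressor."""
    h, w = image.shape
    preds = [
        predict_attr(image[i:i + patch, j:j + patch])
        for i in range(0, h - patch + 1, patch)
        for j in range(0, w - patch + 1, patch)
    ]
    return float(np.std(preds))

# Toy check: a constant image is perfectly consistent under a
# mean-intensity "attribute".
print(global_inconsistency(np.ones((32, 32)), np.mean))  # 0.0
```

Unlike FID, which compares feature distributions across a whole dataset, a score of this shape is computable for a single synthetic image.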

MCML Authors

[14] A Conference
V. A. Zimmer, K. Hammernik, V. Sideri-Lampretsa, W. Huang, A. Reithmeir, D. Rückert and J. A. Schnabel.
Towards Generalised Neural Implicit Representations for Image Registration.
DGM4 @MICCAI 2023 - 3rd International Workshop on Deep Generative Models at the 26th International Conference on Medical Image Computing and Computer Assisted Intervention. Vancouver, Canada, Oct 08-12, 2023. DOI
Abstract

Neural implicit representations (NIRs) make it possible to generate and parametrize the transformation for image registration in a continuous way. By design, these representations are image-pair-specific, meaning that for each signal a new multi-layer perceptron has to be trained. In this work, we investigate for the first time the potential of existing NIR generalisation methods for image registration and propose novel methods for the registration of a group of image pairs using NIRs. To exploit the generalisation potential of NIRs, we encode the fixed and moving image volumes to latent representations, which are then used to condition or modulate the NIR. Using ablation studies on a 3D benchmark dataset, we show that our methods are able to generalise to a set of image pairs with a performance comparable to pairwise registration using NIRs when trained on and datasets. Our results demonstrate the potential of generalised NIRs for 3D deformable image registration.
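The conditioning mechanism can be illustrated with a toy forward pass: a latent code encoded from the fixed/moving pair is concatenated with the query coordinate, so one MLP can serve many image pairs. Sizes and weights below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes: 3-D coordinates, a 16-D latent code per image pair.
D_COORD, D_LATENT, D_HIDDEN = 3, 16, 64
W1 = rng.normal(0.0, 0.1, (D_COORD + D_LATENT, D_HIDDEN))
W2 = rng.normal(0.0, 0.1, (D_HIDDEN, D_COORD))

def displacement(coord: np.ndarray, pair_latent: np.ndarray) -> np.ndarray:
    """One forward pass of a conditioned neural implicit representation:
    concatenating the pair's latent code with the query coordinate lets a
    single MLP produce pair-specific deformation fields."""
    x = np.concatenate([coord, pair_latent])
    h = np.maximum(x @ W1, 0.0)   # ReLU hidden layer
    return h @ W2                 # displacement vector at `coord`

print(displacement(np.zeros(3), rng.normal(size=16)).shape)  # (3,)
```

The per-pair training of a vanilla NIR is replaced by training the encoder and this shared MLP jointly over a group of pairs.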

MCML Authors

[13] A Conference
R. Holland, O. Leingang, C. Holmes, P. Anders, R. Kaye, S. Riedl, J. C. Paetzold, I. Ezhov, H. Bogunović, U. Schmidt-Erfurth, H. P. N. Scholl, S. Sivaprasad, A. J. Lotery, D. Rückert and M. Menten.
Clustering Disease Trajectories in Contrastive Feature Space for Biomarker Proposal in Age-Related Macular Degeneration.
MICCAI 2023 - 26th International Conference on Medical Image Computing and Computer Assisted Intervention. Vancouver, Canada, Oct 08-12, 2023. DOI
Abstract

Age-related macular degeneration (AMD) is the leading cause of blindness in the elderly. Current grading systems based on imaging biomarkers only coarsely group disease stages into broad categories that lack prognostic value for future disease progression. It is widely believed that this is due to their focus on a single point in time, disregarding the dynamic nature of the disease. In this work, we present the first method to automatically propose biomarkers that capture temporal dynamics of disease progression. Our method represents patient time series as trajectories in a latent feature space built with contrastive learning. Then, individual trajectories are partitioned into atomic sub-sequences that encode transitions between disease states. These are clustered using a newly introduced distance metric. In quantitative experiments we found our method yields temporal biomarkers that are predictive of conversion to late AMD. Furthermore, these clusters were highly interpretable to ophthalmologists who confirmed that many of the clusters represent dynamics that have previously been linked to the progression of AMD, even though they are currently not included in any clinical grading system.
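The trajectory-partitioning step can be sketched as follows. Each patient's latent trajectory is split into atomic transitions, and a distance compares both the direction and the magnitude of two transitions; the distance below is a simplified stand-in for the metric introduced in the paper:

```python
import numpy as np

def atomic_transitions(trajectory: np.ndarray) -> np.ndarray:
    """Split a patient's latent trajectory (T visits x D features) into
    atomic sub-sequences, here simply the displacement between
    consecutive visits."""
    return np.diff(trajectory, axis=0)

def transition_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Toy stand-in for the paper's distance metric: penalise differences
    in both the direction (cosine) and the magnitude of two transitions."""
    cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return float((1.0 - cos) + abs(np.linalg.norm(a) - np.linalg.norm(b)))

traj = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])  # steady progression
steps = atomic_transitions(traj)
print(transition_distance(steps[0], steps[1]))  # ~0 for identical steps
```

Clustering transitions under such a metric groups patients by *how* their disease state changes, rather than by a single snapshot, which is what gives the resulting clusters prognostic value.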

MCML Authors

[12] A Conference
N. Stolt-Ansó, J. McGinnis, J. Pan, K. Hammernik and D. Rückert.
NISF: Neural implicit segmentation functions.
MICCAI 2023 - 26th International Conference on Medical Image Computing and Computer Assisted Intervention. Vancouver, Canada, Oct 08-12, 2023. DOI
Abstract

Segmentation of anatomical shapes from medical images has taken an important role in the automation of clinical measurements. While typical deep-learning segmentation approaches are performed on discrete voxels, the underlying objects being analysed exist in a real-valued continuous space. Approaches that rely on convolutional neural networks (CNNs) are limited to grid-like inputs and not easily applicable to sparse or partial measurements. We propose a novel family of image segmentation models that tackle many of CNNs’ shortcomings: Neural Implicit Segmentation Functions (NISF). Our framework takes inspiration from the field of neural implicit functions where a network learns a mapping from a real-valued coordinate-space to a shape representation. NISFs have the ability to segment anatomical shapes in high-dimensional continuous spaces. Training is not limited to voxelized grids, and covers applications with sparse and partial data. Interpolation between observations is learnt naturally in the training procedure and requires no post-processing. Furthermore, NISFs allow the leveraging of learnt shape priors to make predictions for regions outside of the original image plane. We go on to show the framework achieves Dice scores of on a (3D+t) short-axis cardiac segmentation task using the UK Biobank dataset. We also provide a qualitative analysis on our framework’s ability to perform segmentation and image interpolation on unseen regions of an image volume at arbitrary resolutions.
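The continuous-query property can be sketched with a toy forward pass: a coordinate in (3D+t) space plus a per-subject latent code map to a foreground probability, so the same network can be sampled off-grid without any post-hoc interpolation. Weights and sizes here are hypothetical:

```python
import numpy as np

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + np.exp(-z))

def segmentation_prob(coord, subject_latent, W1, W2) -> float:
    """NISF-style query: map a continuous (x, y, z, t) coordinate and a
    per-subject latent code to a foreground probability. Because the input
    is a coordinate rather than a voxel grid, the function can be sampled
    at arbitrary resolution, including between slices."""
    x = np.concatenate([coord, subject_latent])
    h = np.tanh(x @ W1)
    return float(sigmoid(h @ W2))

rng = np.random.default_rng(1)
W1 = rng.normal(0.0, 0.1, (4 + 8, 32))   # (x, y, z, t) + 8-D subject latent
W2 = rng.normal(0.0, 0.1, 32)

# Query between voxel centres -- no resampling or post-processing needed.
p = segmentation_prob(np.array([0.5, 0.25, 0.75, 0.1]), rng.normal(size=8), W1, W2)
print(0.0 < p < 1.0)  # True
```

Sparse or partial acquisitions simply contribute fewer coordinate/label pairs to training, rather than requiring a dense grid.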

MCML Authors

[11] A* Conference
M. Menten, J. C. Paetzold, V. A. Zimmer, S. Shit, I. Ezhov, R. Holland, M. Probst, J. A. Schnabel and D. Rückert.
A Skeletonization Algorithm for Gradient-Based Optimization.
ICCV 2023 - IEEE/CVF International Conference on Computer Vision. Paris, France, Oct 02-06, 2023. DOI
Abstract

The skeleton of a digital image is a compact representation of its topology, geometry, and scale. It has utility in many computer vision applications, such as image description, segmentation, and registration. However, skeletonization has only seen limited use in contemporary deep learning solutions. Most existing skeletonization algorithms are not differentiable, making it impossible to integrate them with gradient-based optimization. Compatible algorithms based on morphological operations and neural networks have been proposed, but their results often deviate from the geometry and topology of the true medial axis. This work introduces the first three-dimensional skeletonization algorithm that is both compatible with gradient-based optimization and preserves an object’s topology. Our method is exclusively based on matrix additions and multiplications, convolutional operations, basic non-linear functions, and sampling from a uniform probability distribution, allowing it to be easily implemented in any major deep learning library. In benchmarking experiments, we prove the advantages of our skeletonization algorithm compared to non-differentiable, morphological, and neural-network-based baselines. Finally, we demonstrate the utility of our algorithm by integrating it with two medical image processing applications that use gradient-based optimization: deep-learning-based blood vessel segmentation, and multimodal registration of the mandible in computed tomography and magnetic resonance images.
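For intuition, the classical morphological (Lantuéjoul) skeleton can be written with only min/max filters and subtractions. The sketch below uses hard min/max for clarity; a gradient-friendly variant in the spirit of the paper would swap in smooth surrogates built from the matrix, convolution, and sampling operations the abstract lists:

```python
import numpy as np

def _filter3x3(x: np.ndarray, reduce) -> np.ndarray:
    """Apply a 3x3 min- or max-filter with zero padding."""
    p = np.pad(x, 1, constant_values=0.0)
    h, w = x.shape
    stacks = [p[i:i + h, j:j + w] for i in range(3) for j in range(3)]
    return reduce(stacks, axis=0)

def erode(x):  return _filter3x3(x, np.min)
def dilate(x): return _filter3x3(x, np.max)

def skeletonize(x: np.ndarray, iters: int = 10) -> np.ndarray:
    """Lantuejoul skeleton: at each scale, keep the part of the eroded
    object that an opening removes. Hard min/max make this only
    sub-differentiable; a differentiable method replaces them with
    smooth counterparts."""
    skel = np.zeros_like(x)
    for _ in range(iters):
        er = erode(x)
        opened = dilate(er)
        skel = np.maximum(skel, x - opened)
        x = er
        if x.max() == 0:
            break
    return skel
```

On a 3-pixel-thick bar this recovers the expected one-pixel centre line, illustrating the topology-and-scale summary that makes skeletons useful as a loss term for vessel segmentation and registration.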

MCML Authors
Link to Profile Martin Menten

Martin Menten

Dr.

JRG Leader AI for Vision

Link to Profile Julia Schnabel

Julia Schnabel

Prof. Dr.

Principal Investigator


[10]
A. Khakzar.
Rethinking Feature Attribution for Neural Network Explanation.
Dissertation Aug. 2023. URL
Abstract

Feature attribution is arguably the predominant approach for illuminating black-box neural networks. This dissertation rethinks feature attribution by leveraging critical neural pathways, identifying input features with predictive information, and evaluating feature attribution using the neural network model. The dissertation also rethinks feature attribution for the explanation of medical imaging models.

MCML Authors

2022


[9] A Conference
P. Engstler, M. Keicher, D. Schinz, K. Mach, A. S. Gersing, S. C. Foreman, S. S. Goller, J. Weissinger, J. Rischewski, A.-S. Dietrich, B. Wiestler, J. S. Kirschke, A. Khakzar and N. Navab.
Interpretable Vertebral Fracture Diagnosis.
iMIMIC @MICCAI 2022 - Workshop on Interpretability of Machine Intelligence in Medical Image Computing at the 25th International Conference on Medical Image Computing and Computer Assisted Intervention. Singapore, Sep 18-22, 2022. DOI GitHub
Abstract

Do black-box neural network models learn clinically relevant features for fracture diagnosis? The answer not only establishes reliability and quenches scientific curiosity, but also leads to explainable and verbose findings that can assist the radiologists in the final diagnosis and increase trust. This work identifies the concepts networks use for vertebral fracture diagnosis in CT images. This is achieved by associating concepts to neurons highly correlated with a specific diagnosis in the dataset. The concepts are either associated with neurons by radiologists pre-hoc or are visualized during a specific prediction and left for the user’s interpretation. We evaluate which concepts lead to correct diagnosis and which concepts lead to false positives. The proposed frameworks and analysis pave the way for reliable and explainable vertebral fracture diagnosis.

MCML Authors

[8] A* Conference
A. Khakzar, Y. Li, Y. Zhang, M. Sanisoglu, S. T. Kim, M. Rezaei, B. Bischl and N. Navab.
Analyzing the Effects of Handling Data Imbalance on Learned Features from Medical Images by Looking Into the Models.
IMLH @ICML 2022 - 2nd Workshop on Interpretable Machine Learning in Healthcare at the 39th International Conference on Machine Learning. Baltimore, MD, USA, Jul 17-23, 2022. arXiv
Abstract

One challenging property lurking in medical datasets is the imbalanced data distribution, where the frequency of the samples between the different classes is not balanced. Training a model on an imbalanced dataset can introduce unique challenges to the learning problem where a model is biased towards the highly frequent class. Many methods are proposed to tackle the distributional differences and the imbalance problem. However, the impact of these approaches on the learned features is not well studied. In this paper, we look deeper into the internal units of neural networks to observe how handling data imbalance affects the learned features. We study several popular cost-sensitive approaches for handling data imbalance and analyze the feature maps of the convolutional neural networks from multiple perspectives: analyzing the alignment of salient features with pathologies and analyzing the pathology-related concepts encoded by the networks. Our study reveals differences and insights regarding the trained models that are not reflected by quantitative metrics such as AUROC and AP, and that show up only by looking at the models through this lens.
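The simplest of the cost-sensitive strategies whose effect on learned features such a study inspects is inverse-frequency re-weighting of the cross-entropy loss. A minimal numpy sketch (toy batch, not the paper's training code):

```python
import numpy as np

def weighted_ce(probs: np.ndarray, labels: np.ndarray,
                class_counts: np.ndarray) -> float:
    """Cost-sensitive cross-entropy: each class is weighted inversely to
    its frequency, so mistakes on the rare class cost more and the model
    is pushed away from always predicting the majority class."""
    weights = class_counts.sum() / (len(class_counts) * class_counts)
    w = weights[labels]
    picked = probs[np.arange(len(labels)), labels]  # prob of the true class
    return float(-(w * np.log(picked)).mean())

# Toy imbalanced batch: class 1 is rare (100 vs 900 samples), so the
# second example's moderate confidence is penalised much harder.
probs = np.array([[0.9, 0.1], [0.2, 0.8]])
labels = np.array([0, 1])
print(weighted_ce(probs, labels, np.array([900.0, 100.0])))
```

The point of the paper is precisely that two models with similar AUROC under such losses can still encode pathology concepts very differently inside their feature maps.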

MCML Authors

[7] A* Conference
A. Khakzar, P. Khorsandi, R. Nobahari and N. Navab.
Do Explanations Explain? Model Knows Best.
CVPR 2022 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, LA, USA, Jun 19-24, 2022. DOI GitHub
Abstract

It is a mystery which input features contribute to a neural network’s output. Various explanation (feature attribution) methods are proposed in the literature to shed light on the problem. One peculiar observation is that these explanations (attributions) point to different features as being important. The phenomenon raises the question, which explanation to trust? We propose a framework for evaluating the explanations using the neural network model itself. The framework leverages the network to generate input features that impose a particular behavior on the output. Using the generated features, we devise controlled experimental setups to evaluate whether an explanation method conforms to an axiom. Thus we propose an empirical framework for axiomatic evaluation of explanation methods. We evaluate well-known and promising explanation solutions using the proposed framework. The framework provides a toolset to reveal properties and drawbacks within existing and future explanation solutions.

MCML Authors

[6]
M. Keicher, K. Zaripova, T. Czempiel, K. Mach, A. Khakzar and N. Navab.
Few-shot Structured Radiology Report Generation Using Natural Language Prompts.
Preprint (Mar. 2022). arXiv
Abstract

The automation of chest X-ray reporting has garnered significant interest due to the time-consuming nature of the task. However, the clinical accuracy of free-text reports has proven challenging to quantify using natural language processing metrics, given the complexity of medical information, the variety of writing styles, and the potential for typos and inconsistencies. Structured reporting and standardized reports, on the other hand, can provide consistency and formalize the evaluation of clinical correctness. However, high-quality annotations for structured reporting are scarce. Therefore, we propose a method to predict clinical findings defined by sentences in structured reporting templates, which can be used to fill such templates. The approach involves training a contrastive language-image model using chest X-rays and related free-text radiological reports, then creating textual prompts for each structured finding and optimizing a classifier to predict clinical findings in the medical image. Results show that even with limited image-level annotations for training, the method can accomplish the structured reporting tasks of severity assessment of cardiomegaly and localizing pathologies in chest X-rays.
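The prompt-based prediction step can be sketched in the CLIP style the abstract describes: embed the X-ray once, embed one textual prompt per structured finding, and score findings by cosine similarity. Embeddings below are toy vectors, not real model outputs:

```python
import numpy as np

def predict_finding(image_emb: np.ndarray, prompt_embs: np.ndarray) -> int:
    """Zero/few-shot prediction with a contrastive language-image model:
    return the index of the structured-report prompt whose embedding is
    most cosine-similar to the image embedding."""
    def norm(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)
    sims = norm(prompt_embs) @ norm(image_emb)
    return int(np.argmax(sims))

# Toy embeddings: the image aligns with the second prompt (e.g. a
# hypothetical "severe cardiomegaly" template sentence).
image = np.array([0.1, 0.9])
prompts = np.array([[1.0, 0.0], [0.0, 1.0]])
print(predict_finding(image, prompts))  # 1
```

Because the prompts are the template sentences of the structured report itself, the top-scoring prompt can be written directly into the corresponding report field.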

MCML Authors

2021


[5] A* Conference
Y. Zhang, A. Khakzar, Y. Li, A. Farshad, S. T. Kim and N. Navab.
Fine-Grained Neural Network Explanation by Identifying Input Features with Predictive Information.
NeurIPS 2021 - Track on Datasets and Benchmarks at the 35th Conference on Neural Information Processing Systems. Virtual, Dec 06-14, 2021. URL
Abstract

One principal approach for illuminating a black-box neural network is feature attribution, i.e. identifying the importance of input features for the network’s prediction. The predictive information of features is recently proposed as a proxy for the measure of their importance. So far, the predictive information is only identified for latent features by placing an information bottleneck within the network. We propose a method to identify features with predictive information in the input domain. The method results in fine-grained identification of input features’ information and is agnostic to network architecture. The core idea of our method is leveraging a bottleneck on the input that only lets input features associated with predictive latent features pass through. We compare our method with several feature attribution methods using mainstream feature attribution evaluation experiments. The code is publicly available.

MCML Authors

[4] A Conference
A. Khakzar, S. Musatian, J. Buchberger, I. V. Quiroz, N. Pinger, S. Baselizadeh, S. T. Kim and N. Navab.
Towards Semantic Interpretation of Thoracic Disease and COVID-19 Diagnosis Models.
MICCAI 2021 - 24th International Conference on Medical Image Computing and Computer Assisted Intervention. Strasbourg, France, Sep 27-Oct 01, 2021. DOI GitHub
Abstract

Convolutional neural networks are showing promise in the automatic diagnosis of thoracic pathologies on chest x-rays. Their black-box nature has sparked many recent works to explain the prediction via input feature attribution methods (aka saliency methods). However, input feature attribution methods merely identify the importance of input regions for the prediction and lack semantic interpretation of model behavior. In this work, we first identify the semantics associated with internal units (feature maps) of the network. We proceed to investigate the following questions: Does a regression model that is only trained with COVID-19 severity scores implicitly learn visual patterns associated with thoracic pathologies? Does a network that is trained on weakly labeled data (e.g. healthy, unhealthy) implicitly learn pathologies? Moreover, we investigate the effect of pretraining and data imbalance on the interpretability of learned features. In addition to the analysis, we propose semantic attribution to semantically explain each prediction. We present our findings using publicly available chest pathologies (CheXpert [5], NIH ChestX-ray8 [25]) and COVID-19 datasets (BrixIA [20], and COVID-19 chest X-ray segmentation dataset [4]).

MCML Authors

[3] A Conference
A. Khakzar, Y. Zhang, W. Mansour, Y. Cai, Y. Li, Y. Zhang, S. T. Kim and N. Navab.
Explaining COVID-19 and Thoracic Pathology Model Predictions by Identifying Informative Input Features.
MICCAI 2021 - 24th International Conference on Medical Image Computing and Computer Assisted Intervention. Strasbourg, France, Sep 27-Oct 01, 2021. DOI GitHub
Abstract

Neural networks have demonstrated remarkable performance in classification and regression tasks on chest X-rays. In order to establish trust in the clinical routine, the networks’ prediction mechanism needs to be interpretable. One principal approach to interpretation is feature attribution. Feature attribution methods identify the importance of input features for the output prediction. Building on the Information Bottleneck Attribution (IBA) method, for each prediction we identify the chest X-ray regions that have high mutual information with the network’s output. Original IBA identifies input regions that have sufficient predictive information. We propose Inverse IBA to identify all informative regions. Thus all predictive cues for pathologies are highlighted on the X-rays, a desirable property for chest X-ray diagnosis. Moreover, we propose Regression IBA for explaining regression models. Using Regression IBA we observe that a model trained on cumulative severity score labels implicitly learns the severity of different X-ray regions. Finally, we propose Multi-layer IBA to generate higher resolution and more detailed attribution/saliency maps. We evaluate our methods using both human-centric (ground-truth-based) interpretability metrics, and human-agnostic feature importance metrics on NIH Chest X-ray8 and BrixIA datasets.

MCML Authors

[2] A* Conference
A. Khakzar, S. Baselizadeh, S. Khanduja, C. Rupprecht, S. T. Kim and N. Navab.
Neural Response Interpretation through the Lens of Critical Pathways.
CVPR 2021 - IEEE/CVF Conference on Computer Vision and Pattern Recognition. Virtual, Jun 19-25, 2021. DOI
Abstract

Is critical input information encoded in specific sparse pathways within the neural network? In this work, we discuss the problem of identifying these critical pathways and subsequently leverage them for interpreting the network’s response to an input. The pruning objective — selecting the smallest group of neurons for which the response remains equivalent to the original network — has been previously proposed for identifying critical pathways. We demonstrate that sparse pathways derived from pruning do not necessarily encode critical input information. To ensure sparse pathways include critical fragments of the encoded input information, we propose pathway selection via neurons’ contribution to the response. We proceed to explain how critical pathways can reveal critical input features. We prove that pathways selected via neuron contribution are locally linear (in an ℓ2-ball), a property that we use for proposing a feature attribution method: ‘pathway gradient’. We validate our interpretation method using mainstream evaluation experiments. The validation of pathway gradient interpretation method further confirms that selected pathways using neuron contributions correspond to critical input features. The code is publicly available.
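The contribution-based selection can be sketched as follows: score each neuron by activation times gradient, then keep only the top contributors as the pathway. This is an illustrative reduction of the idea, not the paper's full procedure:

```python
import numpy as np

def select_pathway(activations: np.ndarray, grads: np.ndarray,
                   keep_frac: float = 0.1) -> np.ndarray:
    """Select a critical pathway by neurons' contribution to the response
    (activation x gradient of the response w.r.t. the activation), rather
    than by pruning alone. Returns a binary mask over neurons: 1 = kept."""
    contrib = activations * grads
    k = max(1, int(keep_frac * contrib.size))
    thresh = np.sort(contrib.ravel())[-k]   # k-th largest contribution
    return (contrib >= thresh).astype(float)

acts = np.array([0.2, 1.5, 0.0, 3.0])
grads = np.array([1.0, 0.1, 5.0, 0.5])
# Contributions: [0.2, 0.15, 0.0, 1.5] -> only the last neuron is kept.
print(select_pathway(acts, grads, keep_frac=0.25))  # [0. 0. 0. 1.]
```

Note how a large activation (neuron 1) or a large gradient alone (neuron 2) does not suffice; it is the product that decides membership, which is the key difference from pruning-based selection.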

MCML Authors

2020


[1] A Conference
S. Denner, A. Khakzar, M. Sajid, M. Saleh, Z. Spiclin, S. T. Kim and N. Navab.
Spatio-temporal learning from longitudinal data for multiple sclerosis lesion segmentation.
BrainLes @MICCAI 2020 - Workshop on Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries at the 23rd International Conference on Medical Image Computing and Computer Assisted Intervention. Virtual, Oct 04-08, 2020. DOI GitHub
Abstract

Segmentation of Multiple Sclerosis (MS) lesions in longitudinal brain MR scans is performed for monitoring the progression of MS lesions. We hypothesize that the spatio-temporal cues in longitudinal data can aid the segmentation algorithm. Therefore, we propose a multi-task learning approach by defining an auxiliary self-supervised task of deformable registration between two time-points to guide the neural network toward learning from spatio-temporal changes. We show the efficacy of our method on a clinical dataset comprising 70 patients with one follow-up study for each patient. Our results show that spatio-temporal information in longitudinal data is a beneficial cue for improving segmentation. We improve on the current state of the art by 2.6% in terms of overall score (p < 0.05).
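The multi-task objective can be sketched as a supervised segmentation loss plus a self-supervised registration term that needs no labels. The warping below is a toy nearest-neighbour version, and `alpha` is a hypothetical weighting, not a value from the paper:

```python
import numpy as np

def registration_loss(moving: np.ndarray, fixed: np.ndarray,
                      flow: np.ndarray) -> float:
    """Auxiliary self-supervised term: warp the first time-point with a
    (toy, nearest-neighbour) deformation field and compare it to the
    follow-up scan. No manual labels are required."""
    h, w = moving.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    yy = np.clip(np.rint(ys + flow[0]).astype(int), 0, h - 1)
    xx = np.clip(np.rint(xs + flow[1]).astype(int), 0, w - 1)
    warped = moving[yy, xx]
    return float(((warped - fixed) ** 2).mean())

def multitask_loss(seg_loss: float, moving, fixed, flow,
                   alpha: float = 0.1) -> float:
    """Total objective: supervised lesion-segmentation loss plus the
    weighted auxiliary registration term sharing the same encoder."""
    return seg_loss + alpha * registration_loss(moving, fixed, flow)

# Identity flow on identical scans: the auxiliary term vanishes.
img = np.arange(16.0).reshape(4, 4)
zero_flow = np.zeros((2, 4, 4))
print(multitask_loss(0.5, img, img, zero_flow))  # 0.5
```

Because the registration task is defined directly on the two time-points, it extracts the spatio-temporal signal the segmentation branch would otherwise ignore.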

MCML Authors