
Cross-Domain and Cross-Dimension Learning for Image-to-Graph Transformers

MCML Authors

Georgios Kaissis
Dr., Former Associate

Martin Menten
Dr., JRG Leader AI for Vision

Abstract

Direct image-to-graph transformation is a challenging task that involves solving object detection and relationship prediction in a single model. Due to this task's complexity, large training datasets are rare in many domains, making the training of deep-learning methods challenging. This data sparsity necessitates transfer learning strategies akin to the state-of-the-art in general computer vision. In this work, we introduce a set of methods enabling cross-domain and cross-dimension learning for image-to-graph transformers. We propose (1) a regularized edge sampling loss to effectively learn object relations in multiple domains with different numbers of edges, (2) a domain adaptation framework for image-to-graph transformers aligning image- and graph-level features from different domains, and (3) a projection function that allows using 2D data for training 3D transformers. We demonstrate our method's utility in cross-domain and cross-dimension experiments, where we utilize labeled data from 2D road networks for simultaneous learning in vastly different target domains. Our method consistently outperforms standard transfer learning and self-supervised pretraining on challenging benchmarks, such as retinal or whole-brain vessel graph extraction.
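The third contribution, reusing 2D graph annotations (e.g. road networks) to train 3D transformers, can be illustrated with a minimal sketch: embed the 2D node coordinates on a plane inside the 3D volume and apply a random orthogonal transform. The function name and the specific lifting scheme below are hypothetical illustrations, not the paper's actual projection function.

```python
import numpy as np

def lift_2d_graph_to_3d(nodes_2d, edges, rng=None):
    """Embed a 2D spatial graph into 3D.

    Hypothetical sketch: place the 2D nodes on a plane of a unit
    cube, then apply a random orthogonal transform so the plane's
    orientation varies across training samples. The paper's actual
    projection function may differ.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Place the 2D nodes on the z = 0.5 plane.
    z = np.full(len(nodes_2d), 0.5)
    nodes_3d = np.column_stack([np.asarray(nodes_2d, dtype=float), z])
    # Random orthogonal transform (rotation or reflection) via QR.
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    center = nodes_3d.mean(axis=0)
    nodes_3d = (nodes_3d - center) @ q.T + center
    # Edge connectivity is unchanged by the geometric lift.
    return nodes_3d, edges
```

Because the transform is orthogonal, pairwise distances between nodes are preserved, so the lifted sample keeps the geometry of the original 2D graph while exposing the 3D model to arbitrary plane orientations.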

inproceedings


WACV 2025

IEEE/CVF Winter Conference on Applications of Computer Vision. Tucson, AZ, USA, Feb 28-Mar 04, 2025.
A Conference

Authors

A. H. Berger • L. Lux • S. Shit • I. Ezhov • G. Kaissis • M. J. Menten • D. Rückert • J. C. Paetzold

Links

DOI

Research Area

 C1 | Medicine

BibTeX Key: BLS+25
