
Research Group Xi Wang



Dr. Xi Wang
JRG Leader Egocentric Vision
Computer Vision & Artificial Intelligence

Xi Wang leads the MCML Junior Research Group ‘Egocentric Vision’ at TU Munich.

Xi Wang and her team conduct cutting-edge research in egocentric vision, focusing on learning from first-person human videos to understand behavior patterns and extract valuable information for potential applications in robotics. Their ongoing projects include 3D reconstruction using Gaussian splatting and multimodal learning with vision-language models. Funded as a BMBF project, the group maintains close ties with MCML and actively seeks collaborations that bridge egocentric vision with other research domains beyond its own focus.
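
As a rough illustration of the rendering model behind Gaussian splatting (not the group's code; all names below are hypothetical), a minimal NumPy sketch: each depth-sorted Gaussian contributes an opacity weighted by a 2D Gaussian falloff at the pixel, and contributions are alpha-composited front to back.

```python
import numpy as np

def splat_alpha(pixel, mean, cov, opacity):
    """Opacity contribution of one projected 2D Gaussian at a pixel:
    alpha = opacity * exp(-0.5 * d^T Sigma^{-1} d), with d = pixel - mean."""
    d = pixel - mean
    return opacity * np.exp(-0.5 * d @ np.linalg.inv(cov) @ d)

def composite(colors, alphas):
    """Front-to-back alpha compositing over depth-sorted Gaussians."""
    out, transmittance = np.zeros(3), 1.0
    for c, a in zip(colors, alphas):
        out += transmittance * a * c
        transmittance *= 1.0 - a
    return out

# Toy example: two Gaussians covering the same pixel.
pixel = np.array([16.0, 16.0])
alphas = [
    splat_alpha(pixel, np.array([15.0, 16.0]), np.eye(2) * 4.0, 0.8),
    splat_alpha(pixel, np.array([18.0, 17.0]), np.eye(2) * 9.0, 0.5),
]
colors = [np.array([1.0, 0.2, 0.2]), np.array([0.2, 0.2, 1.0])]
print(composite(colors, alphas))
```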

Team members @MCML

PostDocs

Dr. Riccardo Marin · Computer Vision & Artificial Intelligence

PhD Students

Abhishek Saroha · Computer Vision & Artificial Intelligence
Dominik Schnaus · Computer Vision & Artificial Intelligence

Publications @MCML

2025


[4]
C. Koke, D. Schnaus, Y. Shen, A. Saroha, M. Eisenberger, B. Rieck, M. M. Bronstein and D. Cremers.
On multi-scale Graph Representation Learning.
LMRL @ICLR 2025 - Workshop on Learning Meaningful Representations of Life at the 13th International Conference on Learning Representations (ICLR 2025). Singapore, Apr 24-28, 2025. To be published. Preprint available.

MCML Authors
Christian Koke · Computer Vision & Artificial Intelligence
Dominik Schnaus · Computer Vision & Artificial Intelligence
Yuesong Shen · Computer Vision & Artificial Intelligence
Abhishek Saroha · Computer Vision & Artificial Intelligence
Prof. Dr. Daniel Cremers · Computer Vision & Artificial Intelligence


[3]
L. Sang, Z. Canfes, D. Cao, R. Marin, F. Bernard and D. Cremers.
TwoSquared: 4D Generation from 2D Image Pairs.
Preprint (Apr. 2025). arXiv
Abstract

Despite the astonishing progress in generative AI, 4D dynamic object generation remains an open challenge. With limited high-quality training data and heavy computing requirements, the combination of hallucinating unseen geometry together with unseen movement poses great challenges to generative models. In this work, we propose TwoSquared as a method to obtain a physically plausible 4D sequence starting from only two 2D RGB images corresponding to the beginning and end of the action. Instead of directly solving the 4D generation problem, TwoSquared decomposes the problem into two steps: 1) an image-to-3D generation module based on an existing generative model trained on high-quality 3D assets, and 2) a physically inspired deformation module to predict intermediate movements. To this end, our method does not require templates or object-class-specific prior knowledge and can take in-the-wild images as input. In our experiments, we demonstrate that TwoSquared is capable of producing texture-consistent and geometry-consistent 4D sequences given only 2D images.
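
A minimal sketch of the two-step decomposition the abstract describes, not the authors' implementation: stage 1 (image-to-3D) is stubbed with a hypothetical image_to_3d placeholder, and stage 2's learned, physically inspired deformation module is replaced here by naive linear blending of corresponding points.

```python
import numpy as np

def image_to_3d(image: np.ndarray) -> np.ndarray:
    """Stage 1 stand-in for a pretrained image-to-3D generative model.
    Returns a toy point cloud so the sketch runs end to end."""
    rng = np.random.default_rng(int(image.sum()) % 2**32)
    return rng.standard_normal((256, 3))

def interpolate(start: np.ndarray, end: np.ndarray, steps: int) -> list:
    """Stage 2 stand-in: the paper predicts physically plausible intermediate
    deformations; here we simply blend corresponding points linearly."""
    return [(1 - t) * start + t * end for t in np.linspace(0.0, 1.0, steps)]

# Two RGB frames (start and end of the action), encoded here as dummy arrays.
img_a, img_b = np.zeros((64, 64, 3)), np.ones((64, 64, 3))
shape_a, shape_b = image_to_3d(img_a), image_to_3d(img_b)
sequence = interpolate(shape_a, shape_b, steps=10)  # the "4D" output: 3D over time
print(len(sequence), sequence[0].shape)
```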

MCML Authors
Lu Sang · Computer Vision & Artificial Intelligence
Dr. Riccardo Marin · Computer Vision & Artificial Intelligence
Prof. Dr. Daniel Cremers · Computer Vision & Artificial Intelligence


[2]
N. P. A. Vu, A. Saroha, O. Litany and D. Cremers.
GAS-NeRF: Geometry-Aware Stylization of Dynamic Radiance Fields.
Preprint (Mar. 2025). arXiv
Abstract

Current 3D stylization techniques primarily focus on static scenes, while our world is inherently dynamic, filled with moving objects and changing environments. Existing style transfer methods target appearance, such as color and texture transformation, but often neglect the geometric characteristics of the style image, which are crucial for achieving a complete and coherent stylization effect. To overcome these shortcomings, we propose GAS-NeRF, a novel approach for joint appearance and geometry stylization in dynamic radiance fields. Our method leverages depth maps to extract and transfer geometric details into the radiance field, followed by appearance transfer. Experimental results on synthetic and real-world datasets demonstrate that our approach significantly enhances stylization quality while maintaining temporal coherence in dynamic scenes.
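
A hedged sketch of the kind of joint objective such a method might use, assuming a Gram-matrix appearance loss plus a depth-gradient geometry term; the paper's actual losses may differ, and every function name here is illustrative.

```python
import torch
import torch.nn.functional as F

def gram(feat: torch.Tensor) -> torch.Tensor:
    """Gram matrix of a (C, H, W) feature map, the usual style statistic."""
    c, h, w = feat.shape
    f = feat.reshape(c, h * w)
    return (f @ f.T) / (c * h * w)

def style_losses(rendered_feat, style_feat, rendered_depth, style_depth):
    """Appearance loss on feature Gram matrices plus a geometry loss that
    matches depth-gradient statistics, a simplified stand-in for
    depth-based geometry transfer."""
    appearance = F.mse_loss(gram(rendered_feat), gram(style_feat))

    def grads(d):  # finite-difference depth gradients, vertical and horizontal
        return d[:, 1:, :] - d[:, :-1, :], d[:, :, 1:] - d[:, :, :-1]

    gy_r, gx_r = grads(rendered_depth)
    gy_s, gx_s = grads(style_depth)
    geometry = F.mse_loss(gy_r.abs().mean(), gy_s.abs().mean()) + \
               F.mse_loss(gx_r.abs().mean(), gx_s.abs().mean())
    return appearance, geometry

# Toy tensors: VGG-like (C, H, W) features and (1, H, W) depth maps.
a, b = torch.rand(64, 32, 32), torch.rand(64, 32, 32)
d_r, d_s = torch.rand(1, 32, 32), torch.rand(1, 32, 32)
print(style_losses(a, b, d_r, d_s))
```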

MCML Authors
Abhishek Saroha · Computer Vision & Artificial Intelligence
Prof. Dr. Daniel Cremers · Computer Vision & Artificial Intelligence


2024


[1]
L. Sang, M. Gao, A. Saroha and D. Cremers.
Enhancing Surface Neural Implicits with Curvature-Guided Sampling and Uncertainty-Augmented Representations.
Wild3D @ECCV 2024 - Workshop on 3D Modeling, Reconstruction, and Generation in the Wild at the 18th European Conference on Computer Vision (ECCV 2024). Milano, Italy, Sep 29-Oct 04, 2024.
Abstract

Neural implicits are a widely used surface representation because they offer adaptive resolution and support arbitrary topology changes. While previous works rely on ground-truth point clouds or meshes, they often do not discuss the data acquisition and ignore the effect of input quality and sampling methods on reconstruction. In this paper, we introduce an uncertainty-augmented implicit surface representation together with a sampling technique that accounts for the geometric characteristics of the inputs. To this end, we introduce a strategy that efficiently computes differentiable geometric features, namely mean curvatures, to guide the sampling phase during training. The uncertainty augmentation offers insights into the occupancy and reliability of the output signed distance value, thereby extending the representation to open surfaces. Finally, we demonstrate that our method improves the reconstruction of both synthetic and real-world data.
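
A minimal NumPy sketch of curvature-guided importance sampling under strong simplifications: the learned neural implicit is replaced by an analytic sphere SDF, and the differentiable mean curvature is approximated by finite differences via H = Δf / 2, which holds for a unit-gradient SDF. The paper's actual pipeline differs; this only illustrates the idea of concentrating samples in high-curvature regions.

```python
import numpy as np

def sdf_sphere(p: np.ndarray, r: float = 1.0) -> np.ndarray:
    """Analytic signed distance to a sphere; stands in for a trained implicit."""
    return np.linalg.norm(p, axis=-1) - r

def mean_curvature(sdf, p: np.ndarray, h: float = 1e-3) -> np.ndarray:
    """For a unit-gradient SDF, mean curvature H = (Laplacian f) / 2,
    estimated here with central finite differences instead of autodiff."""
    lap = np.zeros(len(p))
    for axis in range(3):
        e = np.zeros(3)
        e[axis] = h
        lap += (sdf(p + e) - 2.0 * sdf(p) + sdf(p - e)) / h**2
    return lap / 2.0

# Curvature-guided sampling: draw candidates, then resample with probability
# proportional to |H| so high-curvature regions receive more samples.
rng = np.random.default_rng(0)
candidates = rng.uniform(-1.5, 1.5, size=(4096, 3))
weights = np.abs(mean_curvature(sdf_sphere, candidates))
weights /= weights.sum()
samples = candidates[rng.choice(len(candidates), size=512, p=weights)]
print(samples.shape)
```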

MCML Authors
Lu Sang · Computer Vision & Artificial Intelligence
Maolin Gao · Computer Vision & Artificial Intelligence
Abhishek Saroha · Computer Vision & Artificial Intelligence
Prof. Dr. Daniel Cremers · Computer Vision & Artificial Intelligence