leads the MCML Junior Research Group 'Multi-Modal Learning' at TU Munich.
She and her team conduct research into multi-modal learning from vision, sound, and text. They focus on advancing video understanding, with an emphasis on capturing temporal dynamics and cross-modal relationships. To this end, they work on improving how information from different modalities is combined within learning frameworks. They also explore how to adapt large pre-trained models for audio-visual understanding tasks.