13.06.2025


MCML PI Volker Tresp and His Team Win ReGenAI @CVPR 2025 Best Paper Award

Honored for Work on New Jailbreak Vulnerability in T2I Diffusion Models

MCML Junior Members Tong Liu, Gengyuan Zhang, and Shuo Chen, MCML PI Volker Tresp, and their co-authors have received the Best Paper Award at the Second Workshop on Responsible Generative AI (ReGenAI) at CVPR 2025 for their paper “Multimodal Pragmatic Jailbreak on Text-to-image Models”.

The authors show that text-to-image models can be easily exploited to produce unsafe content through cross‑modal interactions between safe text and images, a vulnerability that current safety filters fail to address effectively.

Congratulations from us!

Check out the full paper:

T. Liu, Z. Lai, J. Wang, G. Zhang, S. Chen, P. Torr, V. Demberg, V. Tresp and J. Gu.
Multimodal Pragmatic Jailbreak on Text-to-image Models.
ReGenAI @CVPR 2025 - 2nd Workshop on Responsible Generative AI at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2025). Nashville, TN, USA, Jun 11-15, 2025. Best Paper Award.
Abstract

Diffusion models have recently achieved remarkable advances in image quality and fidelity to textual prompts. Concurrently, the safety of such generative models has become an area of growing concern. This work introduces a novel type of jailbreak that triggers T2I models to generate images containing visual text, where the image and the text, although safe in isolation, combine to form unsafe content. To systematically explore this phenomenon, we propose a dataset to evaluate current diffusion-based text-to-image (T2I) models under such jailbreaks. We benchmark nine representative T2I models, including two closed-source commercial models. Experimental results reveal a concerning tendency to produce unsafe content: all tested models are vulnerable to this type of jailbreak, with unsafe generation rates ranging from around 10% to 70%, with DALLE 3 exhibiting nearly the highest rate. In real-world scenarios, various filters, such as keyword blocklists, customized prompt filters, and NSFW image filters, are commonly employed to mitigate these risks. We evaluate the effectiveness of such filters against our jailbreak and find that, while they may be effective for single-modality detection, they fail against our jailbreak. We also investigate the underlying causes of such jailbreaks from the perspective of text rendering capability and training data. Our work provides a foundation for further development of more secure and reliable T2I models.

MCML Authors

Tong Liu

Database Systems and Data Mining AI Lab


Gengyuan Zhang

Database Systems and Data Mining AI Lab


Shuo Chen

Database Systems and Data Mining AI Lab


Volker Tresp

Prof. Dr.

Database Systems and Data Mining AI Lab



Related


01.08.2025

Fabian Theis Receives 2025 ISCB Innovator Award

Fabian Theis receives 2025 ISCB Innovator Award for advancing AI in biology and mentoring the next generation of scientists.


29.07.2025

Yusuf Sale Receives IJAR Young Researcher Award

MCML Junior Member Yusuf Sale received an IJAR Young Researcher Award at ISIPTA 2025 for his work.


29.07.2025

Barbara Plank Awarded 2025 Imminent Research Grant for Work on Language Data

Barbara Plank’s MaiNLP lab wins 2025 Imminent Research Grant for a project on language data with Peng and de Marneffe.


22.07.2025

Eyke Hüllermeier to Lead New DFG-Funded Research Training Group METEOR

MCML PI Eyke Hüllermeier to lead new DFG-funded RTG METEOR, uniting ML and control theory to build robust, explainable AI systems.


18.07.2025

Outstanding Paper Award at ICML 2025 for MCML Researchers

MCML researchers win ICML 2025 Outstanding Paper Award for work on prediction and identifying the worst-off.