13.03.2025

ReNO: A Smarter Way to Enhance AI-Generated Images

MCML Research Insight - With Luca Eyring, Shyamgopal Karthik, Karsten Roth and Zeynep Akata

Despite their impressive capabilities, Text-to-Image (T2I) models frequently misinterpret detailed prompts, leading to errors in object positioning, attribute accuracy, and color fidelity. Traditional remedies rely on fine-tuning with large datasets, which is not only computationally expensive but may also fail to generalize to unseen prompts. To address this, our Junior Members – Luca Eyring, Shyamgopal Karthik, and Karsten Roth – together with PI Zeynep Akata and collaborator Alexey Dosovitskiy from Inceptive, proposed a new approach: Reward-based Noise Optimization (ReNO).


The Challenge: When AI Gets It Wrong

«T2I models often struggle with complex prompts, leading to issues like incorrect text rendering, attribute mismatches, unrealistic object combinations, and color leakage.»


Luca Eyring et al.

MCML Junior Members

Recent advancements in T2I models have significantly improved AI-generated visuals, yet challenges remain when it comes to complex compositions, fine details, and spatial accuracy. For example, when asked to generate “a blue scooter parked near a curb in front of a green vintage car”, many models might struggle with correct object placement, resulting in overlapping or misplaced elements. Fixing these issues traditionally requires costly retraining for incremental improvements.

Another key issue is reward hacking—a phenomenon where AI models optimize for high reward scores without truly improving image quality. This happens when a reward model favors certain characteristics, leading the AI to exploit shortcuts instead of genuinely following the prompt. This can result in images that score well on automated evaluations but fail to meet human expectations.

To address these challenges, the authors propose ReNO (Reward-based Noise Optimization), a solution that enhances image generation without costly model retraining. ReNO refines the initial noise using signals from one or multiple human preference reward models, improving spatial accuracy and object placement to produce more coherent and visually accurate outputs.

Example Prompt: “A blue scooter parked near a curb in front of a green vintage car”

Generated image with an object positioning error: the green vintage car is much too far away.

ReNO refinement with correct positioning of the green vintage car.


«Within a computational budget of 20-50 s, ReNO-enhanced one-step models consistently surpass the performance of all current open-source T2I models.»


Luca Eyring et al.

MCML Junior Members

The ReNO Approach: Smarter Noise, Better Results

ReNO enhances image generation at inference time by optimizing the starting noise. Instead of altering the model itself, ReNO fine-tunes the initial noise using reward models—AI systems trained to evaluate image quality and prompt adherence. This iterative optimization process refines compositions, improves object rendering, and enhances overall image coherence while maintaining computational efficiency.
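
To make this concrete, below is a minimal sketch of inference-time noise optimization in PyTorch. The callables `generator` and `reward_model` are placeholders for a frozen one-step T2I model and a differentiable reward model; the optimizer choice, learning rate, and latent shape are illustrative assumptions rather than the exact settings of the official implementation.

```python
import torch

def reno_optimize_noise(generator, reward_model, prompt,
                        latent_shape=(1, 4, 64, 64), steps=50, lr=5.0):
    """Sketch of reward-based noise optimization at inference time.

    The model weights stay frozen; only the initial latent noise is
    updated by gradient ascent on the reward score.
    """
    noise = torch.randn(latent_shape, requires_grad=True)
    optimizer = torch.optim.SGD([noise], lr=lr)

    for _ in range(steps):
        optimizer.zero_grad()
        image = generator(noise, prompt)       # differentiable one-step generation
        reward = reward_model(image, prompt)   # scalar score: higher = better
        (-reward).backward()                   # minimize negative reward = gradient ascent
        optimizer.step()

    return generator(noise.detach(), prompt)
```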

To mitigate reward hacking, ReNO combines multiple reward objectives, preventing the system from over-optimizing for any single metric. This ensures a balanced approach where images are both aesthetically pleasing and faithful to their prompts.
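
In such a sketch, mitigating reward hacking amounts to optimizing a weighted combination of several reward scores instead of a single one; the reward models and weights here are purely illustrative, not the exact configuration used in the paper.

```python
def combined_reward(image, prompt, reward_models, weights):
    # Weighted sum over several reward models; balancing multiple
    # objectives makes it harder to exploit any single metric.
    return sum(w * rm(image, prompt) for rm, w in zip(reward_models, weights))
```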

Qualitative results of two different one-step T2I models with and without ReNO for the prompt: “A curious, orange fox and a fluffy, white rabbit, playing together in a lush, green meadow filled with yellow dandelions”

PixArt-α DMD

PixArt-α DMD + ReNO

SD-Turbo

SD-Turbo + ReNO


Key Benefits of ReNO

  • Improved Prompt Adherence – Ensures better alignment between text descriptions and generated images.
  • Efficient Performance – Works within a 20-50 second processing window, making it practical for real-world use.
  • Competitive Results – ReNO-enhanced one-step models consistently surpass current open-source T2I models on benchmarks such as T2I-CompBench and GenEval.

«Our results highlight the importance of the noise distribution in T2I models and encourage further research into understanding and adapting it.»


Luca Eyring et al.

MCML Junior Members

The Future of AI-Driven Creativity

ReNO demonstrates that enhancing AI-generated images doesn’t require extensive model retraining. By focusing on noise optimization, it provides a practical, efficient, and scalable solution for improving T2I performance. This innovation could lead to more accessible, high-quality AI art tools.


Read More

Curious to explore ReNO? The code is open-source—check out the GitHub repository with over 100 ⭐ and experience the future of AI-driven image generation. You can also try it out directly with the Hugging Face Demo.

ReNO at GitHub
Hugging Face Demo

Here’s ReNO in action

We have tried ReNO ourselves and were impressed. The first image was generated with the SDXL-Turbo model without ReNO, while the second was generated with ReNO via the Hugging Face demo. Both used the prompt: “An orange chair to the right of a black airplane”. For the other two pictures we used the same model with the prompt: “A BMW GS 1250 with a driver drinking coffee” - and not with a coffee as co-rider 🙃.

SDXL Turbo: An orange chair to the right of a black airplane

SDXL Turbo + ReNO: An orange chair to the right of a black airplane

SDXL Turbo: A BMW GS 1250 with a driver drinking coffee

SDXL Turbo + ReNO: A BMW GS 1250 with a driver drinking coffee


Explore the full paper published at NeurIPS 2024, one of the most prestigious conferences in the field of Machine Learning and Artificial Intelligence.

L. Eyring, S. Karthik, K. Roth, A. Dosovitskiy and Z. Akata.
ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization.
NeurIPS 2024 - 38th Conference on Neural Information Processing Systems. Vancouver, Canada, Dec 10-15, 2024.
Abstract

Text-to-Image (T2I) models have made significant advancements in recent years, but they still struggle to accurately capture intricate details specified in complex compositional prompts. While fine-tuning T2I models with reward objectives has shown promise, it suffers from ‘reward hacking’ and may not generalize well to unseen prompt distributions. In this work, we propose Reward-based Noise Optimization (ReNO), a novel approach that enhances T2I models at inference by optimizing the initial noise based on the signal from one or multiple human preference reward models. Remarkably, solving this optimization problem with gradient ascent for 50 iterations yields impressive results on four different one-step models across two competitive benchmarks, T2I-CompBench and GenEval. Within a computational budget of 20-50 seconds, ReNO-enhanced one-step models consistently surpass the performance of all current open-source Text-to-Image models. Extensive user studies demonstrate that our model is preferred nearly twice as often compared to the popular SDXL model and is on par with the proprietary Stable Diffusion 3 with 8B parameters. Moreover, given the same computational resources, a ReNO-optimized one-step model outperforms widely-used open-source models such as SDXL and PixArt-α, highlighting the efficiency and effectiveness of ReNO in enhancing T2I model performance at inference time.

MCML Authors

Luca Eyring

Interpretable and Reliable Machine Learning


Shyamgopal Karthik

Interpretable and Reliable Machine Learning


Karsten Roth

Interpretable and Reliable Machine Learning


Zeynep Akata

Prof. Dr.

Interpretable and Reliable Machine Learning

Poster Session at NeurIPS 2024

If you are interested in learning more about the research of Zeynep Akata’s group, please visit EML Munich.

EML Munich

Share Your Research!


Get in touch with us!

Are you an MCML Junior Member and interested in showcasing your research on our blog?

We’re happy to feature your work—get in touch with us to present your paper.
