13.03.2025

ReNO: A Smarter Way to Enhance AI-Generated Images

MCML Research Insight - With Luca Eyring, Shyamgopal Karthik, Karsten Roth and Zeynep Akata

Despite their impressive capabilities, Text-to-Image (T2I) models frequently misinterpret detailed prompts, leading to errors in object positioning, attribute accuracy, and color fidelity. Traditional remedies rely on fine-tuning with large datasets, which is not only computationally expensive but may also fail to generalize to unseen prompts. To address this, our Junior Members – Luca Eyring, Shyamgopal Karthik, and Karsten Roth – together with PI Zeynep Akata and collaborator Alexey Dosovitskiy from Inceptive, proposed a new approach: Reward-based Noise Optimization (ReNO).


The Challenge: When AI Gets It Wrong

«T2I models often struggle with complex prompts, leading to issues like incorrect text rendering, attribute mismatches, unrealistic object combinations, and color leakage.»


Luca Eyring et al.

MCML Junior Members

Recent advancements in T2I models have significantly improved AI-generated visuals, yet challenges remain when it comes to complex compositions, fine details, and spatial accuracy. For example, when asked to generate “a blue scooter parked near a curb in front of a green vintage car”, many models might struggle with correct object placement, resulting in overlapping or misplaced elements. Fixing these issues traditionally requires costly retraining for incremental improvements.

Another key issue is reward hacking—a phenomenon where AI models optimize for high reward scores without truly improving image quality. This happens when a reward model favors certain characteristics, leading the AI to exploit shortcuts instead of genuinely following the prompt. This can result in images that score well on automated evaluations but fail to meet human expectations.

To address these challenges, the authors propose ReNO (Reward-based Noise Optimization), a solution that enhances image generation without costly model retraining. ReNO refines the initial noise using signals from one or multiple human preference reward models, improving spatial accuracy and object placement to produce more coherent and visually accurate outputs.

Example Prompt: “A blue scooter parked near a curb in front of a green vintage car”

Generated image with an object positioning error: the green vintage car is much too far away.

ReNO refinement with correct positioning of the green vintage car.


«Within a computational budget of 20-50 s, ReNO-enhanced one-step models consistently surpass the performance of all current open-source T2I models.»


Luca Eyring et al.

MCML Junior Members

The ReNO Approach: Smarter Noise, Better Results

ReNO enhances image generation at inference time by optimizing the starting noise. Instead of altering the model itself, ReNO fine-tunes the initial noise using reward models—AI systems trained to evaluate image quality and prompt adherence. This iterative optimization process refines compositions, improves object rendering, and enhances overall image coherence while maintaining computational efficiency.
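
To make this concrete, below is a minimal sketch of inference-time noise optimization in PyTorch. The callables `generator` and `reward_model` are placeholders for a frozen one-step T2I model and a differentiable reward model; the optimizer choice, learning rate, and latent shape are illustrative assumptions rather than the exact settings of the official implementation.

```python
import torch

def reno_optimize_noise(generator, reward_model, prompt,
                        latent_shape=(1, 4, 64, 64), steps=50, lr=5.0):
    """Sketch of reward-based noise optimization at inference time.

    The model weights stay frozen; only the initial latent noise is
    updated by gradient ascent on the reward score.
    """
    noise = torch.randn(latent_shape, requires_grad=True)
    optimizer = torch.optim.SGD([noise], lr=lr)

    for _ in range(steps):
        optimizer.zero_grad()
        image = generator(noise, prompt)       # differentiable one-step generation
        reward = reward_model(image, prompt)   # scalar score: higher = better
        (-reward).backward()                   # minimize negative reward = gradient ascent
        optimizer.step()

    return generator(noise.detach(), prompt)
```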

To mitigate reward hacking, ReNO combines multiple reward objectives, preventing the system from over-optimizing for any single metric. This ensures a balanced approach where images are both aesthetically pleasing and faithful to their prompts.
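
In such a sketch, mitigating reward hacking amounts to optimizing a weighted combination of several reward scores instead of a single one; the reward models and weights here are purely illustrative, not the exact configuration used in the paper.

```python
def combined_reward(image, prompt, reward_models, weights):
    # Weighted sum over several reward models; balancing multiple
    # objectives makes it harder to exploit any single metric.
    return sum(w * rm(image, prompt) for rm, w in zip(reward_models, weights))
```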

Qualitative results of two different one-step T2I models with and without ReNO for the prompt: “A curious, orange fox and a fluffy, white rabbit, playing together in a lush, green meadow filled with yellow dandelions”

PixArt-α DMD

PixArt-α DMD + ReNO

SD-Turbo

SD-Turbo + ReNO


Key Benefits of ReNO

  • Improved Prompt Adherence – Ensures better alignment between text descriptions and generated images.
  • Efficient Performance – Works within a 20-50 second processing window, making it practical for real-world use.
  • Competitive Results – ReNO-enhanced one-step models consistently surpass current open-source T2I models on benchmarks such as T2I-CompBench and GenEval.

«Our results highlight the importance of the noise distribution in T2I models and encourage further research into understanding and adapting it.»


Luca Eyring et al.

MCML Junior Members

The Future of AI-Driven Creativity

ReNO demonstrates that enhancing AI-generated images doesn’t require extensive model retraining. By focusing on noise optimization, it provides a practical, efficient, and scalable solution for improving T2I performance. This innovation could lead to more accessible, high-quality AI art tools.


Read More

Curious to explore ReNO? The code is open-source—check out the GitHub repository with over 100 ⭐ and experience the future of AI-driven image generation. You can also try it out directly with the Hugging Face Demo.

ReNO at GitHub
Hugging Face Demo

Here’s ReNO in action

We have tried ReNO ourselves and were impressed. The first image was generated with the SDXL-Turbo model without ReNO, while the second was generated with ReNO via the Hugging Face demo. Both used the prompt: “An orange chair to the right of a black airplane”. For the other two pictures we used the same model with the prompt: “A BMW GS 1250 with a driver drinking coffee” - and not with a coffee as co-rider 🙃.

SDXL Turbo: An orange chair to the right of a black airplane

SDXL Turbo + ReNO: An orange chair to the right of a black airplane

SDXL Turbo: A BMW GS 1250 with a driver drinking coffee

SDXL Turbo + ReNO: A BMW GS 1250 with a driver drinking coffee


Explore the full paper published at NeurIPS 2024, one of the most prestigious conferences in the field of Machine Learning and Artificial Intelligence.

L. Eyring, S. Karthik, K. Roth, A. Dosovitskiy and Z. Akata.
ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization.
NeurIPS 2024 - 38th Conference on Neural Information Processing Systems. Vancouver, Canada, Dec 10-15, 2024.
Abstract

Text-to-Image (T2I) models have made significant advancements in recent years, but they still struggle to accurately capture intricate details specified in complex compositional prompts. While fine-tuning T2I models with reward objectives has shown promise, it suffers from ‘reward hacking’ and may not generalize well to unseen prompt distributions. In this work, we propose Reward-based Noise Optimization (ReNO), a novel approach that enhances T2I models at inference by optimizing the initial noise based on the signal from one or multiple human preference reward models. Remarkably, solving this optimization problem with gradient ascent for 50 iterations yields impressive results on four different one-step models across two competitive benchmarks, T2I-CompBench and GenEval. Within a computational budget of 20-50 seconds, ReNO-enhanced one-step models consistently surpass the performance of all current open-source Text-to-Image models. Extensive user studies demonstrate that our model is preferred nearly twice as often compared to the popular SDXL model and is on par with the proprietary Stable Diffusion 3 with 8B parameters. Moreover, given the same computational resources, a ReNO-optimized one-step model outperforms widely-used open-source models such as SDXL and PixArt-α, highlighting the efficiency and effectiveness of ReNO in enhancing T2I model performance at inference time.

MCML Authors

Luca Eyring

Interpretable and Reliable Machine Learning


Shyamgopal Karthik

Interpretable and Reliable Machine Learning


Karsten Roth

Interpretable and Reliable Machine Learning


Zeynep Akata

Prof. Dr.

Interpretable and Reliable Machine Learning

Poster Session at NeurIPS 2024

If you are interested in learning more about the research of Zeynep Akata’s group, please visit EML Munich.

EML Munich

Share Your Research!


Get in touch with us!

Are you an MCML Junior Member and interested in showcasing your research on our blog?

We’re happy to feature your work—get in touch with us to present your paper.
