Aligning discrete diffusion models with downstream rewards remains challenging: step-wise guidance is myopic and degrades sample quality, while fine-tuning is expensive and task-specific. We introduce Discrete Diffusion Noise Optimization (DDNO), a training-free method that instead optimizes the initial discrete noise to maximize terminal rewards while keeping the generator frozen. DDNO parameterizes the noise distribution with continuous logits and propagates gradients through the reverse process via a straight-through surrogate combined with soft mixing, enabling stable optimization over long denoising trajectories. On compositional text-to-image synthesis and controllable text generation, DDNO consistently outperforms inference-time baselines like guidance and Best-of-N while exhibiting favorable scaling. This positions DDNO as a promising axis for test-time scaling in discrete generative models, complementing advances in continuous diffusion.
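The mechanism described above (logits over the initial noise, a frozen generator, and a straight-through surrogate with soft mixing) can be illustrated with a small toy sketch. This is not the authors' implementation. The frozen "denoiser" `T`, the log-probability reward, the mixing weight `ALPHA`, and the specific form of the mix (a convex combination of the sampled one-hot token and the softmax probabilities) are all assumptions made for illustration, and every name in the block is hypothetical.

```python
import math
import random

V, L, K = 4, 3, 3   # toy vocabulary size, sequence length, denoising steps
ALPHA = 0.5         # soft-mixing weight (assumed form of the mix)

# Hypothetical frozen "denoiser": a fixed column-stochastic matrix T[out][inp]
# that mostly shifts token v to (v + 1) % V. It is never updated.
T = [[0.7 if out == (inp + 1) % V else 0.1 for inp in range(V)]
     for out in range(V)]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def denoise(z):
    """Run K frozen steps on one position's token vector z."""
    for _ in range(K):
        z = [sum(T[o][i] * z[i] for i in range(V)) for o in range(V)]
    return z

def backprop(grad_out):
    """Pull a gradient back through the K frozen linear steps (T transposed)."""
    g = grad_out
    for _ in range(K):
        g = [sum(T[o][i] * g[o] for o in range(V)) for i in range(V)]
    return g

def optimize_noise(targets, steps=200, lr=0.5, seed=0):
    """Gradient ascent on noise logits; returns an L x V list of logits."""
    rng = random.Random(seed)
    theta = [[0.0] * V for _ in range(L)]  # uniform initial noise distribution
    for _ in range(steps):
        for pos, target in enumerate(targets):
            p = softmax(theta[pos])
            x = rng.choices(range(V), weights=p)[0]  # hard noise sample
            # Soft mixing: blend the hard one-hot sample with the probabilities.
            z = [(1 - ALPHA) * (1.0 if v == x else 0.0) + ALPHA * p[v]
                 for v in range(V)]
            y = denoise(z)
            # Toy terminal reward: log-probability of the target token.
            grad_y = [0.0] * V
            grad_y[target] = 1.0 / (y[target] + 1e-8)
            grad_z = backprop(grad_y)
            # Straight-through surrogate: treat d(one-hot)/dp as the identity,
            # so dz/dp = I and only the softmax Jacobian remains.
            mean_g = sum(p[v] * grad_z[v] for v in range(V))
            for v in range(V):
                theta[pos][v] += lr * p[v] * (grad_z[v] - mean_g)
    return theta

if __name__ == "__main__":
    logits = optimize_noise(targets=[0, 1, 2])
    for row in logits:
        print([round(q, 3) for q in softmax(row)])
```

In this toy, the frozen chain shifts each token by one position per step, so the optimized logits at each position should favor the initial token `(target - K) % V`. The generator weights are never touched; only the noise logits receive gradient updates.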
Entry type: inproceedings
BibTeXKey: EPB+26