
CAGE: Unsupervised Visual Composition and Animation for Controllable Video Generation

MCML Authors

Björn Ommer

Prof. Dr.

Principal Investigator

Abstract

In this work, we propose a novel method for unsupervised controllable video generation. Once trained on a dataset of unannotated videos, our model can, at inference time, both compose scenes from predefined object parts and animate them in a plausible and controlled way. This is achieved by conditioning video generation on a randomly selected subset of local pre-trained self-supervised features during training. We call our model CAGE, for visual Composition and Animation for video GEneration. We conduct a series of experiments to demonstrate the capabilities of CAGE in various settings.
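
The conditioning idea in the abstract (keeping only a randomly selected subset of local features during training) might be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the function name `sample_feature_subset`, the flat list of per-patch vectors, and the use of `None` for dropped positions are all hypothetical, and the features here are dummy numbers rather than outputs of a real pre-trained self-supervised encoder.

```python
import random


def sample_feature_subset(features, keep, seed=None):
    """Keep `keep` randomly chosen local features; replace the rest with None.

    `features` is a flat list of per-patch feature vectors (hypothetical
    stand-ins for the outputs of a pre-trained self-supervised encoder).
    Returns (sparse_features, kept_indices).
    """
    if not 0 <= keep <= len(features):
        raise ValueError("keep must be between 0 and len(features)")
    rng = random.Random(seed)
    # Sample patch positions without replacement.
    kept = sorted(rng.sample(range(len(features)), keep))
    kept_set = set(kept)
    # Dropped positions carry no conditioning signal.
    sparse = [f if i in kept_set else None for i, f in enumerate(features)]
    return sparse, kept


# Toy example: a 4x4 grid of patches, each with a 3-dimensional dummy feature.
grid = [[float(i), float(i) + 0.5, float(i) + 1.0] for i in range(16)]
sparse, kept = sample_feature_subset(grid, keep=5, seed=0)
```

In a setup like this, a generator trained on such sparse conditioning would have to fill in everything not covered by the kept features, which is one plausible reading of how a small set of user-placed features could control composition and motion at inference.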

inproceedings


AAAI 2025

39th AAAI Conference on Artificial Intelligence. Philadelphia, PA, USA, Feb 25-Mar 04, 2025.
A* Conference

Authors

A. Davtyan • S. Sameni • B. Ommer • P. Favaro

Links

DOI GitHub

Research Area

 B1 | Computer Vision

BibTeXKey: DSO+25
