
Variational Learning for Large Deep Networks

Thomas Möllenhoff, RIKEN, Tokyo

   10.07.2024

   3:15 pm - 4:45 pm

   LMU Department of Statistics and via zoom

Thomas Möllenhoff presents extensive evidence against the common belief that variational Bayesian learning is ineffective for large neural networks.

First, he shows that a recent deep learning method called sharpness-aware minimization (SAM) solves an optimal convex relaxation of the variational Bayesian objective.
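The announcement only names sharpness-aware minimization; the convex-relaxation result itself is not spelled out here. As a reminder of what the basic SAM update does, here is a minimal one-dimensional sketch. The function names and hyperparameter values (`lr`, `rho`) are illustrative choices, not material from the talk:

```python
def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One sharpness-aware minimization (SAM) step for a scalar weight.

    SAM first perturbs the weight in the ascent direction (toward higher
    loss), then takes an ordinary gradient step using the gradient
    evaluated at that perturbed point.
    """
    g = grad_fn(w)
    eps = rho * g / (abs(g) + 1e-12)   # normalized ascent perturbation (1-D case)
    g_sharp = grad_fn(w + eps)         # gradient at the worst-case neighbor
    return w - lr * g_sharp

# Toy loss f(w) = (w - 3)^2 with gradient 2*(w - 3).
grad = lambda w: 2.0 * (w - 3.0)
w = 0.0
for _ in range(200):
    w = sam_step(w, grad)
# w settles close to the minimum at w = 3
```

In higher dimensions the perturbation is normalized by the gradient's Euclidean norm rather than its absolute value, but the two-step structure is the same.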

Then, he demonstrates that directly optimizing the variational objective with the Improved Variational Online Newton method (IVON) consistently matches or outperforms Adam for training large networks such as GPT-2 and ResNets from scratch. IVON's computational cost is nearly identical to Adam's, but its predictive uncertainty is better.
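The announcement does not describe the IVON update itself. The following is a simplified, variational-online-Newton-style sketch for a single scalar weight under a Gaussian posterior; all names and hyperparameters are hypothetical, and it deliberately omits details of the actual IVON algorithm (such as its principled Hessian correction, momentum, and weight-decay handling):

```python
import random

def von_style_step(m, h, grad_fn, lr=0.1, beta2=0.99, delta=1e-3, lam=10.0):
    """One illustrative variational-online-Newton-style update for a scalar
    weight with posterior N(m, sigma^2), where sigma^2 = 1/(lam*(h+delta)).

    Simplified sketch only: the actual IVON method differs in how it keeps
    the Hessian estimate positive and in its momentum/weight-decay terms.
    """
    sigma2 = 1.0 / (lam * (h + delta))
    w = random.gauss(m, sigma2 ** 0.5)   # sample a weight from the posterior
    g = grad_fn(w)                       # stochastic gradient at the sample
    h_hat = g * (w - m) / sigma2         # reparameterization-based Hessian estimate
    # Crude floor to keep the precision estimate positive and steps bounded;
    # the real method uses a principled correction instead.
    h = max(beta2 * h + (1.0 - beta2) * h_hat, 0.5)
    m = m - lr * (g + delta * m) / (h + delta)   # Newton-like step on the mean
    return m, h

# Toy loss f(w) = (w - 3)^2: the posterior mean m drifts toward the minimum
# at 3 while h approaches the curvature of the loss.
random.seed(0)
grad = lambda w: 2.0 * (w - 3.0)
m, h = 0.0, 1.0
for _ in range(2000):
    m, h = von_style_step(m, h, grad)
```

The point of the sketch is that a second-order method of this family maintains a curvature (precision) estimate `h` as a cheap running average, which is why the per-step cost can stay close to Adam's while also yielding a posterior variance for uncertainty estimates.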

He shows several new use cases of variational learning: improving fine-tuning and model merging in large language models, accurately predicting generalization error, and faithfully estimating sensitivity to data.

Organized by:

Department of Statistics LMU Munich

