10
Jul
Colloquium
Variational Learning for Large Deep Networks
Thomas Möllenhoff, RIKEN, Tokyo
10.07.2024
3:15 pm - 4:45 pm
LMU Department of Statistics and via zoom
Thomas Möllenhoff presents extensive evidence against the common belief that variational Bayesian learning is ineffective for large neural networks.
First, he shows that a recent deep learning method called sharpness-aware minimization (SAM) solves an optimal convex relaxation of the variational Bayesian objective.
Then, he demonstrates that directly optimizing the variational objective with the Improved Variational Online Newton method (IVON) can consistently match or outperform Adam when training large networks such as GPT-2 and ResNets from scratch. IVON’s computational costs are nearly identical to Adam’s, but its predictive uncertainty is better.
He also shows several new use cases of variational learning, where he improves fine-tuning and model merging in large language models, accurately predicts generalization error, and faithfully estimates sensitivity to data.
Organized by:
Department of Statistics LMU Munich
Related
Colloquium • 15.01.2025 • LMU Department of Statistics and via zoom
Additive Density-on-Scalar Regression in Bayes Hilbert Spaces With an Application to Gender Economics
15.01.25, 4-6 pm: LMU Statistics Colloquium with Sonja Greven (HU Berlin).
AI Keynote Series • 09.01.2025 • Online via Zoom
Experimental Designs for A/B Testing in Marketplaces
09.01.25, 12-1:30 pm: AI Keynote Series with Chengchun Shi from London School of Economics and Political Science.