Colloquium

Large-Scale Pretraining: The Nitty-Gritty Details

Robert Baldock, Aleph Alpha

   21.02.2024

   2:15 pm - 3:45 pm

   LMU Department of Statistics and via Zoom

This talk offers a rare close-up of the nitty-gritty details that go into training large language models (LLMs) at scale. In autumn 2023, the Aleph Alpha Research Lab prepared to train their next generation of LLMs, which are now in training.

In this talk, Robert Baldock will chronicle what the team learned from this process. In particular, he will describe their experiments to optimise the architecture and pretraining, their optimal scaling study, insights into efficient and numerically stable parallel training, tokenizer construction, and the preparation of the large-scale web-crawl dataset.


Related

AI Keynote Series  •  13.02.2025  •  Online via Zoom

Simplifying Debiased Inference via Automatic Differentiation and Probabilistic Programming

13.02.25, 10-11:30 am: AI Keynote Series with Alex Luedtke from the University of Washington.


Colloquium  •  29.01.2025  •  LMU Department of Statistics and via Zoom

A Novel Statistical Approach to Analyze Image Classification

29.01.25, 4-6 pm: LMU Statistics Colloquium with Sophie Langer (U Twente) on faster, structured CNN-based image classification.


Colloquium  •  15.01.2025  •  LMU Department of Statistics and via Zoom

Additive Density-on-Scalar Regression in Bayes Hilbert Spaces With an Application to Gender Economics

15.01.25, 4-6 pm: LMU Statistics Colloquium with Sonja Greven (HU Berlin) introducing a novel approach to modeling densities.