Colloquium
Large-scale pretraining: the nitty-gritty details
Robert Baldock, Aleph Alpha
21.02.2024
2:15 pm - 3:45 pm
LMU Department of Statistics and via Zoom
This talk will give a rare close-up of the nitty-gritty details that go into training LLMs at scale. In the autumn of 2023, Aleph Alpha Research Lab prepared to train its next generation of large language models, which are currently in training.
In this talk, Robert Baldock will recount the lessons learned from this process. In particular, he will describe the team's experiments to optimise the architecture and pretraining, their optimal scaling study, insights into efficient and numerically stable parallel training, tokenizer construction, and the preparation of the large-scale web-crawl dataset.