Attention, Transformers, and LLMs: a hands-on introduction in PyTorch
In this 3-hour workshop, we will build a large language model from the ground up using PyTorch. The workshop focuses on the fundamentals of attention and the transformer architecture. We assume a foundation in basic calculus, linear algebra, optimization, and Python programming. Experience with PyTorch will be helpful but is not strictly required.
Tentative agenda:
- Overview of the language modeling task
- Attention: Query, Key, and Value (previewed in the code sketch after this list)
- Self-Attention
- Positional Encodings
- The Transformer Architecture
- Autoregressive Models
- Fitting our LLM
- Hugging Face: a high-level API for LLMs
- Conclusions
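As a small preview of the hands-on portion, here is a minimal sketch of scaled dot-product attention, the operation at the heart of the Query/Key/Value material above. The tensor shapes and variable names are illustrative choices for this example, not taken from the workshop materials:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    """Minimal attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = q.size(-1)
    # Compare each query with every key, scaled to keep scores well-behaved
    scores = q @ k.transpose(-2, -1) / d_k**0.5
    # Normalize scores into attention weights over the keys
    weights = F.softmax(scores, dim=-1)
    # Return a weighted average of the values
    return weights @ v

# Illustrative shapes: a batch of 2 sequences, 5 tokens, 8-dim embeddings.
# Using the same tensor for q, k, and v is the self-attention setting
# (before adding the learned projections we will cover in the workshop).
q = k = v = torch.randn(2, 5, 8)
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([2, 5, 8])
```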
Live Workshop
Session #1 for Fall 2024
Date: Tuesday, November 12, 2024
Time: 9:00 AM–12:00 PM (3 hours)
Location: WFIC 106