1. Introductory notes and discussion on large language models
Feb 22
Slides
Covered topics: aims of the course, passing requirements. We discussed
what are (large) language models, what are they for, what are their benefits
and downsides. We concluded with a rough analysis of ChatGPT performance in
different languages.
Mar 7
Lecture notes
Slides
After the class, you should be able to:
- Explain the building blocks of the Transformer architecture to a non-technical person
- Describe the Transformer architecture using equations, especially the self-attention block
- Implement the Transformer architecture (in PyTorch or another framework with automated differentiation)
Class outline:
Additional materials:
3. LLM Training
Mar 14
Slides
Recording
After the class, you should be able to:
- Give a high-level description of how neural networks are trained
- Read and understand a neural training library documentation
- Explain the differences between various training techniques used in LLMs today
Class outline:
- Rest of the discussion on Transformers, see above
- General introduction into neural network & transformer model training, pretrained models, RLHF, DPO
Additional materials:
4. LLM Inference
Mar 21
Slides
Code
Recording
After the class, you should be able to:
- Give a high-level description of how a transformer predicts a probability distribution for the next token in the sequence
- Select the appropriate decoding algorithm for your use-case and understand its parameters
- Write a Python code snippet for generating text with an open language model using the
transformers
library
Class outline:
- Discussion, LLM zoo
- 3D visualization of transformer inference
- Decoding algorithms - exact inference (MAP), greedy search, beam search, top-k, top-p, Mirostat, locally typical sampling
- Hands-on demonstration of text generation with the
transformers
library
- Bonus: non-autoregressive decoding, reverse-engineering decoding algorithms
Additional materials:
5. Generating Weather Reports
Mar 28
Assignment
Assignment #1
After the class, you should be able to:
- Write a basic Python code querying a LLM through an OpenAI-like API.
- Set up a suitable prompt and parameters to get the expected output.
- Describe what are the opportunities and limits of recent open LLMs.
Class outline:
- Introduction
- Working on the assignment
Additional materials:
6. Data and Evaluation
Apr 4
Lecture notes
After the class, you should be able to:
- Look for a dataset for a specified NLP task and find one (given the task is reasonably common)
- Roughly assess the usefulness of the dataset based on its statistics
- Pick an evaluation method that suits the task
- Have a sense of what a "reasonable" score in that task might look like
Class outline:
- Data for language modeling
- NLP tasks and data (introduction + team work)
- Evaluation (introduction + team work)
Additional materials:
7. Evaluation, Working with the Models
Apr 11
MCQA Evaluation
Speech Translation
LLMs for Machine Translation
Chain-of-thought Prompting; RAG
Generation; Evaluation; Web navigation
Experience with LLMs
Recording
Class outline:
- Remarks on LLM evaluation on multiple-choice question answering task
- Speech translation challenges
- Using LLMs for machine translation
- Chain-of-thought prompting, retrieval-augmented generation
- Generation, evaluation and Web navigation using LLMs
- Experience with using LLMs within the EDU-AI project, Task-oriented Dialogue
8. LLM Efficiency
Apr 18
Assignment review
Efficiency
Recording
After the class, you should be able to:
- Identify technical bottlenecks constraining inference and training with LLMs
- Know methods enabling the usage LLMs under computational restrictions:
- parameter efficient fine-tuning,
- quantization,
- picking the right model scale for your data.
Class outline:
- Assignment 1 review
- Time and space requirements of LLMs
- Low-rank adaptation
- Quantization
- Scaling
9. Multilinguality
Apr 25
Slides
Recording
Assignment
Assignment #2
After the class, you should be able to:
- Name benefits of multilingual language models and cross-lingual transfer.
- Pick the multilingual model suitable for a specific language based on training data, similar languages covered and tokenizer properties.
Class outline:
- Guided discussions: why do we train multilingual LMs? How to train multilingual LMs?
- Availability of data throughout languages, resourcefulness levels.
- Variability of languages: typology and writing systems
- Multilingual tokenization
- Application of LLMs for machine translation
10. LLMs for Speech-to-Text
May 2
Slides
After the class, you should know:
- Motivation for speech in LLMs
- The basic and example speech-to-text methods
- Real-time methods
Class outline:
- Speech NLP tasks (ASR, translation, emotion recognition, …)
- Speech in NNs (sound representation, MFCC, raw audio) and in LLMs (Wav2vec,
HuBERT, Whisper)
- Simultaneous methods: re-translation vs. incremental
- Streaming policies wait-k and LocalAgreement
- Whisper-Streaming and ELITR demo
Active participation
There will be two or three tasks during the semester; we will work on them mainly
during classes but they might turn into a (small) homework.
Reading assignments
You will be asked at least once to read a paper before the class.
Final written test
You need to take part in a final written test that will not be graded.