LSML 26: Large Scale Machine Learning
PSL week Spring Course 2026
C2MINES-11 Large-Scale Machine Learning
March 2-6, 2026
Mines Paris, 60 boulevard Saint-Michel, 75006 Paris, Room 108B
This course is co-organized by Gabriel Victorino Cardoso and Fabien Moutarde.
Outline
Machine learning is a fast-growing field at the interface of mathematics, computer science and engineering, which provides computers with the ability to learn without being explicitly programmed, in order to make predictions or take rational actions. From cancer research to finance, natural language processing, marketing or self-driving cars, many fields are nowadays impacted by recent progress in machine learning algorithms that benefit from the ability to collect huge amounts of data and "learn" from them.
The goal of this intensive 5-day advanced course is to present the theoretical foundations and practical algorithms to implement and solve modern machine learning problems, and to expose the students to current applications and challenges of Machine Learning in science and industry.
Prerequisites:
- Numerical Python (i.e., familiarity with programming in Python and the numpy, scipy, matplotlib libraries).
- Basics of machine learning (such as the content of the Apprentissage Artificiel course for Mines Paris – PSL students).
- Familiarity with scikit-learn and PyTorch.
Schedule
Monday, March 2, 2026
- 09:00 – 09:15 Course presentation [slides (pdf)]
- 09:15 – 12:15 Lecture: Introduction to large-scale ML & optimization (Katia Antonenko) [slides (pdf)]
- 13:45 – 17:00 Practical session: ML on large data with scikit-learn. This session will also contain an introduction to scikit-learn for those who have not used the library before.
Tuesday, March 3, 2026
- 09:00 – 12:15 Lecture: Deep unsupervised learning and generative models (Gabriel Victorino Cardoso) [slides (pdf)]
- 13:45 – 17:00 Practical session: Deep learning, autoencoders and diffusion models with Python.
Wednesday, March 4, 2026
- 09:00 – 12:15 Lecture: Deep Reinforcement Learning (Fabien Moutarde) [slides (pdf)]
- 13:45 – 17:00 Practical session: Deep Reinforcement Learning with Python.
Thursday, March 5, 2026
- 09:00 – 12:15 Lecture: LLM and Post-training (Pierre Clavier, Cohere) [slides (pdf)]
- 13:45 – 17:00 Practical session: Post-training of LLMs with Python.
Friday, March 6, 2026
- 09:00 – 12:15 Lecture: Stochastic optimization and Double descent with Neural networks (Claire Boyer, Université Paris-Saclay)
- 13:45 – 17:00 Practical session: Double descent with PyTorch.
All course materials will be in English but some lectures might be given in French.
Grade
If you are taking this class for credit, you will be asked to turn in the notebooks from all your practical sessions.
Total credits: 2 ECTS.
Practical sessions
Practical sessions will take the form of Jupyter notebooks on the github repository.
Please follow the instructions there to install Python3 and all the relevant packages. An alternative (sometimes preferable for deep learning notebooks) is to use Google Colab, for which you will need a Google account.
TAs: Victor Gertner (CBIO), Victor Laigle (CBIO), Mihaela Dimitrova (STIM), Waël Doulazmi (CAOR).
Textbook
There is no single textbook for this course, but the following resources are relevant:
- Mining of massive datasets by Leskovec, Rajaraman and Ullman;
- Deep learning by Goodfellow, Bengio and Courville;
- Large-Scale Optimization: Beyond Stochastic Gradient Descent and Convexity by Sra and Bach.
This course is not an introductory course to machine learning! If you want to learn the basics, or need a refresher, we recommend:
- In French, Introduction au Machine Learning by Azencott;
- In French, Apprentissage statistique supervisé by Moutarde in Techniques de l'Ingénieur;
- In English, Machine learning by Andrew Ng on Coursera;
- In English, Introduction to statistical learning by James, Witten, Hastie, Tibshirani and Taylor;
- In English, Pattern recognition and machine learning by Bishop.