Term | Spring 2021 | Instructor | Christopher De Sa
Course website | www.cs.cornell.edu/courses/cs4787/2021sp/ | Email | cdesa@cs.cornell.edu
Schedule | MW 7:30–8:45PM | Office hours | Wednesdays 2PM
Room | Zoom | Office | Zoom
[Canvas] [Discussion] [CMS]
Description: CS4787 explores the principles behind scalable machine learning systems. The course covers the algorithmic and implementation principles that power the current generation of machine learning on big data. We will cover training and inference both for traditional ML algorithms, such as linear and logistic regression, and for deep models. Topics include: estimating statistics of data quickly with subsampling, stochastic gradient descent and other scalable optimization methods, mini-batch training, accelerated methods, adaptive learning rates, methods for scalable deep learning, hyperparameter optimization, parallel and distributed training, and quantization and model compression.
Prerequisites: CS4780 or equivalent, and CS 2110 or equivalent.
Format: Lectures during the scheduled lecture period will cover the course content. Problem sets will be used to encourage familiarity with the content and to develop competence with the more mathematical aspects of the course. Programming assignments will help build intuition and familiarity with how machine learning algorithms run. There will be one midterm exam and one final exam, each of which will test both theoretical knowledge and programming implementation of concepts.
Material: The course is based on books, papers, and other texts in machine learning, scalable optimization, and systems. Texts will be provided ahead of time on the website on a per-lecture basis. You aren't required to read the texts, but they provide useful background for the material we are discussing.
Grading: Students will be evaluated on the following basis.
20% | Problem sets
40% | Programming assignments
15% | Prelim exam
25% | Final exam
Inclusiveness: You should expect and demand to be treated by your classmates and the course staff with respect. You belong here, and we are here to help you learn—and enjoy—this course. If any incident occurs that challenges this commitment to a supportive and inclusive environment, please let the instructor know so that we can address the issue. We are personally committed to this, and subscribe to the Computer Science Department's Values of Inclusion.
The course calendar is subject to change.
Monday, February 8
Lecture 1. Introduction and course overview. [Notes]
Problem Set 1 Released.
Wednesday, February 10
Lecture 2. Linear algebra done efficiently: Mapping mathematics to numpy. [Slides Notebook] [Slides HTML]
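As a quick unofficial illustration of this lecture's theme (synthetic data; timings will vary by machine), the snippet below computes the same quantity with an interpreted Python loop and with one vectorized numpy call:

```python
import numpy as np
import time

# Same computation two ways: an elementwise Python loop vs. one
# vectorized numpy expression over synthetic data.
rng = np.random.default_rng(0)
x = rng.normal(size=1_000_000)
y = rng.normal(size=1_000_000)

t0 = time.perf_counter()
s_loop = sum((a - b) ** 2 for a, b in zip(x, y))   # interpreted loop
t1 = time.perf_counter()
s_vec = np.sum((x - y) ** 2)                       # single vectorized call
t2 = time.perf_counter()
print(f"loop: {t1 - t0:.2f}s   vectorized: {t2 - t1:.4f}s")
print(np.isclose(s_loop, s_vec))                   # same result
```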
Monday, February 15
Lecture 3. Scaling to complex models by learning with optimization algorithms. Gradient descent, convex optimization and conditioning. [Notes] [Slides Notebook] [Slides HTML] [Demo Notebook] [Demo HTML]
Programming Assignment 1 Released. Background reading material: posted on the course website.
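For a flavor of the algorithm, here is a minimal sketch of gradient descent on a least-squares objective; the data, step size, and iteration count are invented for illustration, and this is not the assignment's code.

```python
import numpy as np

# Gradient descent on f(w) = (1/2n) ||Xw - y||^2, whose gradient is
# (1/n) X^T (Xw - y). Data and step size are illustrative only.
rng = np.random.default_rng(0)
n, d = 100, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

w = np.zeros(d)
alpha = 0.1                                # step size (learning rate)
for _ in range(500):
    w -= alpha * X.T @ (X @ w - y) / n     # full-gradient step
print(0.5 * np.mean((X @ w - y) ** 2))     # training loss after descent
```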
Wednesday, February 17
Lecture 4. Gradient descent continued. Stochastic gradient descent. [Notes] [Slides Notebook] [Slides HTML] [Demo Notebook] [Demo HTML]
Background reading material: posted on the course website.
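In the same toy setting, plain SGD replaces the full gradient with the gradient of a single randomly chosen example; a hedged sketch:

```python
import numpy as np

# One epoch of SGD: each step uses the gradient of a single example.
rng = np.random.default_rng(0)
n, d = 1000, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

w = np.zeros(d)
alpha = 0.01                              # constant step size, chosen by hand
for i in rng.permutation(n):              # shuffle, then one pass over the data
    w -= alpha * (X[i] @ w - y[i]) * X[i]  # stochastic gradient from example i
print(0.5 * np.mean((X @ w - y) ** 2))
```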
Monday, February 22
Lecture 5. Stochastic gradient descent continued. Scaling to huge datasets with subsampling. [Notes] [Slides Notebook] [Slides HTML] [Demo Notebook] [Demo HTML]
Problem Set 1 Due. Background reading material: posted on the course website.
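A tiny unofficial demo of the subsampling idea: estimating the mean of a million stand-in per-example losses from a random sample of a thousand.

```python
import numpy as np

# Estimate the empirical risk over n examples from a subsample of size m.
rng = np.random.default_rng(0)
losses = rng.exponential(size=1_000_000)           # stand-in per-example losses
sample = rng.choice(losses, size=1_000, replace=False)
print(losses.mean(), sample.mean())                # the estimate lands close
```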
Wednesday, February 24
Lecture 6. Adapting algorithms to hardware. Minibatching and the effect of the learning rate. Our first hyperparameters. [Notes] [Slides Notebook] [Slides HTML] [Demo Notebook] [Demo HTML]
Problem Set 2 Released. Note that this is a half-length problem set, designed to be done in one week rather than two so that it can be finished before the prelim exam. Background reading material: posted on the course website.
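A sketch of minibatch SGD in the same invented setting; the batch size B and step size are arbitrary choices for illustration.

```python
import numpy as np

# Minibatch SGD: average the gradient over B examples per step, so each
# update is one matrix multiply and vectorizes well on hardware.
rng = np.random.default_rng(0)
n, d, B = 1000, 5, 32
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

w = np.zeros(d)
alpha = 0.1
for _ in range(300):
    idx = rng.choice(n, size=B, replace=False)           # sample a minibatch
    w -= alpha * X[idx].T @ (X[idx] @ w - y[idx]) / B    # averaged batch gradient
print(0.5 * np.mean((X @ w - y) ** 2))
```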
Monday, March 1
Lecture 7. The mathematical hammers behind subsampling. Estimating large sums with samples, e.g. the empirical risk. Concentration inequalities. [Notes] [Slides Notebook] [Slides HTML] [Demo Notebook] [Demo HTML]
Background reading material: posted on the course website.
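For intuition, an informal numpy check of Hoeffding's inequality, which for the mean of m i.i.d. samples in [0, 1] bounds P(|estimate - mean| >= t) <= 2 exp(-2 m t^2):

```python
import numpy as np

# Empirically checking Hoeffding's bound for means of m samples in [0, 1].
rng = np.random.default_rng(0)
m, t, trials = 200, 0.1, 10_000
data = rng.uniform(size=(trials, m))              # true mean is 0.5
deviation = np.abs(data.mean(axis=1) - 0.5)
print("empirical:", (deviation >= t).mean())      # observed failure rate
print("bound:    ", 2 * np.exp(-2 * m * t**2))    # Hoeffding upper bound
```

The bound holds with room to spare here; concentration inequalities are worst-case guarantees, so the observed failure rate is typically far below them.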
Wednesday, March 3
Lecture 8. Optimization techniques for efficient ML. Accelerating SGD with momentum. [Notes] [Slides Notebook] [Slides HTML] [Demo Notebook] [Demo HTML]
Problem Set 2 Due. So that late days do not conflict with the prelim, this problem set may be submitted late, until Monday, March 8, with no penalty. Background reading material: posted on the course website.
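A hedged sketch of SGD with heavy-ball momentum on the same toy model; the momentum coefficient beta = 0.9 is a conventional but arbitrary choice.

```python
import numpy as np

# SGD with (Polyak) heavy-ball momentum: keep a running velocity v and
# step with it instead of the raw gradient.
rng = np.random.default_rng(0)
n, d, B = 1000, 5, 32
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

w, v = np.zeros(d), np.zeros(d)
alpha, beta = 0.05, 0.9               # step size and momentum coefficient
for _ in range(300):
    idx = rng.choice(n, size=B, replace=False)
    g = X[idx].T @ (X[idx] @ w - y[idx]) / B
    v = beta * v - alpha * g          # accumulate velocity
    w += v
print(0.5 * np.mean((X @ w - y) ** 2))
```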
Thursday, March 4
Prelim Exam. 8:30PM. Exam released on Gradescope and on Canvas. The exam may cover topics up to Lecture 8, including scalability in ML, gradient descent, stochastic gradient descent, convexity and strong convexity, the computational cost of learning algorithms, concentration inequalities, momentum, and writing learning algorithms in numpy.
Monday, March 8
Lecture 9. Optimization techniques for efficient ML, continued. Accelerating SGD with preconditioning and adaptive learning rates. [Notes] [Slides Notebook] [Slides HTML]
Programming Assignment 2 Released. Background reading material: posted on the course website.
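AdaGrad is one concrete example of an adaptive learning rate; a toy sketch (constants invented for illustration):

```python
import numpy as np

# AdaGrad: scale each coordinate's step by the inverse root of its
# accumulated squared gradients, so frequently-large coordinates slow down.
rng = np.random.default_rng(0)
n, d, B = 1000, 5, 32
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

w, r = np.zeros(d), np.zeros(d)
alpha, eps = 0.5, 1e-8
for _ in range(300):
    idx = rng.choice(n, size=B, replace=False)
    g = X[idx].T @ (X[idx] @ w - y[idx]) / B
    r += g ** 2                              # accumulate squared gradients
    w -= alpha * g / (np.sqrt(r) + eps)      # per-coordinate step size
print(0.5 * np.mean((X @ w - y) ** 2))
```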
Wednesday, March 10
Wellness Day. No classes; no lecture.
Monday, March 15
Lecture 10. Optimization techniques for efficient ML, continued. Accelerating SGD with variance reduction and averaging. [Notes] [Slides Notebook] [Slides HTML]
Problem Set 3 Released. Background reading material: posted on the course website.
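SVRG is one well-known variance-reduction method; the sketch below shows its control-variate update on toy least squares, and is an illustration rather than the lecture's exact presentation.

```python
import numpy as np

# SVRG sketch: an outer loop computes a full gradient at a snapshot; inner
# steps use g_i(w) - g_i(w_snap) + full_grad, which has lower variance.
rng = np.random.default_rng(0)
n, d = 1000, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

def gi(w, i):                                # gradient at single example i
    return (X[i] @ w - y[i]) * X[i]

w = np.zeros(d)
alpha = 0.05
for epoch in range(10):
    w_snap = w.copy()
    full = X.T @ (X @ w_snap - y) / n        # full gradient at the snapshot
    for _ in range(n):
        i = rng.integers(n)
        w -= alpha * (gi(w, i) - gi(w_snap, i) + full)
print(0.5 * np.mean((X @ w - y) ** 2))
```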
Wednesday, March 17
Lecture 11. Dimensionality reduction and sparsity. [Notes] [Slides Notebook] [Slides HTML] [Demo Notebook] [Demo HTML]
Background reading material: posted on the course website.
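A minimal sketch of one dimensionality-reduction tool, Gaussian random projection, which roughly preserves distances in the Johnson–Lindenstrauss sense (sizes invented for illustration):

```python
import numpy as np

# Project d-dimensional points down to k << d dimensions with a scaled
# Gaussian matrix; pairwise distances are approximately preserved.
rng = np.random.default_rng(0)
n, d, k = 100, 10_000, 400
X = rng.normal(size=(n, d))
P = rng.normal(size=(d, k)) / np.sqrt(k)   # scaled random projection
Z = X @ P

print(np.linalg.norm(X[0] - X[1]))         # original distance
print(np.linalg.norm(Z[0] - Z[1]))         # distance after projection
```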
Monday, March 22
Lecture 12. Deep neural networks. Matrix multiply as computational core of learning. [Notes] [Demo Notebook] [Demo HTML]
Programming Assignment 3 Released. Background reading material: posted on the course website.
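To see why matrix multiply is the computational core: the forward pass of a toy two-layer fully connected network is just two matmuls with an elementwise nonlinearity in between (all shapes and weights invented).

```python
import numpy as np

# Forward pass of a tiny MLP on a minibatch: matmul, ReLU, matmul.
rng = np.random.default_rng(0)
B, d_in, d_h, d_out = 64, 784, 256, 10
x = rng.normal(size=(B, d_in))             # a minibatch of inputs
W1 = rng.normal(size=(d_in, d_h)) * 0.01
W2 = rng.normal(size=(d_h, d_out)) * 0.01

h = np.maximum(x @ W1, 0.0)                # hidden layer: one matrix multiply
logits = h @ W2                            # output layer: another one
print(logits.shape)                        # (64, 10)
```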
Wednesday, March 24
Lecture 13. Automatic differentiation and ML frameworks. [Notes] [Demo Notebook] [Demo HTML]
Background reading material: posted on the course website.
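A toy sketch of reverse-mode automatic differentiation; real frameworks use tapes and topological ordering, but the chain-rule bookkeeping looks like this:

```python
# Each Scalar records the local derivatives of the op that produced it;
# backward() recursively applies the chain rule along every path.
class Scalar:
    def __init__(self, value, parents=()):
        self.value, self.parents, self.grad = value, parents, 0.0

    def __add__(self, other):
        return Scalar(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return Scalar(self.value * other.value,
                      [(self, other.value), (other, self.value)])

    def backward(self, seed=1.0):
        self.grad += seed
        for parent, local_deriv in self.parents:
            parent.backward(seed * local_deriv)   # chain rule, recursively

x, y = Scalar(2.0), Scalar(3.0)
z = x * y + x          # z = xy + x, so dz/dx = y + 1 and dz/dy = x
z.backward()
print(x.grad, y.grad)  # 4.0 2.0
```

Frameworks like PyTorch and TensorFlow apply the same idea at the granularity of tensor operations rather than scalars.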
Monday, March 29
Lecture 14. Accelerating DNN training: early stopping and batch normalization. [Notes] [Demo Notebook] [Demo HTML]
Problem Set 3 Due. Problem Set 4 Released. Background reading material: posted on the course website.
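A minimal sketch of the batch-normalization forward pass (inference-time running statistics and the backward pass are omitted; data is synthetic):

```python
import numpy as np

# Batch norm, forward only: normalize each feature over the minibatch,
# then apply a learnable scale (gamma) and shift (beta).
rng = np.random.default_rng(0)
h = rng.normal(loc=3.0, scale=2.0, size=(64, 256))   # pre-activation batch

gamma, beta, eps = np.ones(256), np.zeros(256), 1e-5
mu = h.mean(axis=0)                       # per-feature batch mean
var = h.var(axis=0)                       # per-feature batch variance
out = gamma * (h - mu) / np.sqrt(var + eps) + beta
print(out.mean(), out.std())              # ~0 and ~1 after normalization
```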
Wednesday, March 31
Lecture 15. Hyperparameter optimization. Grid search. Random search. [Notes] [Slides Notebook] [Slides HTML]
Background reading material: posted on the course website.
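A toy comparison of the two methods on an invented validation-loss surface: with a budget of 16 trials, a 4x4 grid tries only 4 distinct values per hyperparameter, while random search tries 16.

```python
import numpy as np

# Grid search vs. random search with the same trial budget.
rng = np.random.default_rng(0)

def val_loss(lr, reg):        # invented stand-in for a validation loss
    return (np.log10(lr) + 2) ** 2 + 0.1 * (np.log10(reg) + 3) ** 2

grid = [(lr, reg)
        for lr in 10.0 ** np.linspace(-4, 0, 4)
        for reg in 10.0 ** np.linspace(-5, -1, 4)]
rand = [(10.0 ** rng.uniform(-4, 0), 10.0 ** rng.uniform(-5, -1))
        for _ in range(16)]

print("grid:  ", min(val_loss(lr, reg) for lr, reg in grid))
print("random:", min(val_loss(lr, reg) for lr, reg in rand))
```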
Monday, April 5
Lecture 16. Kernels and kernel feature extraction. [Notes] [Slides Notebook] [Slides HTML]
Programming Assignment 4 Released. Background reading material: posted on the course website.
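One classic kernel feature-extraction technique is random Fourier features; a hedged numpy sketch approximating an RBF kernel (dimensions arbitrary):

```python
import numpy as np

# Random Fourier features: approximate the RBF kernel
# k(x, y) = exp(-||x - y||^2 / 2) by an explicit random feature map.
rng = np.random.default_rng(0)
d, D = 5, 2000                         # input dim, number of random features
W = rng.normal(size=(D, d))            # frequencies ~ N(0, I) for this kernel
b = rng.uniform(0, 2 * np.pi, size=D)  # random phases

def phi(x):
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

x, y = rng.normal(size=d), rng.normal(size=d)
print(np.exp(-np.sum((x - y) ** 2) / 2))   # exact kernel value
print(phi(x) @ phi(y))                     # random-feature approximation
```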
Wednesday, April 7
Lecture 17. Bayesian optimization 1. [Notes]
Background reading material: posted on the course website.
Monday, April 12
Lecture 18. Bayesian optimization 2. [Notes]
Problem Set 4 Due. Problem Set 5 Released. Background reading material: same as Bayesian optimization 1.
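A compact unofficial sketch of one Bayesian-optimization step on a 1-D toy problem: fit a Gaussian-process posterior to the points evaluated so far, then pick the next point by expected improvement. The kernel, lengthscale, and objective are all invented.

```python
import numpy as np
from math import erf

def k(a, b, ell=0.3):                     # squared-exponential kernel
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

f = lambda x: np.sin(3 * x) + 0.5 * x     # "unknown" objective to minimize
X = np.array([0.1, 0.5, 0.9])             # points evaluated so far
y = f(X)

Xs = np.linspace(0, 1, 200)               # candidate grid
K = k(X, X) + 1e-8 * np.eye(len(X))       # jitter for numerical stability
Ks = k(X, Xs)
mu = Ks.T @ np.linalg.solve(K, y)                        # GP posterior mean
var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)  # posterior variance
sd = np.sqrt(np.maximum(var, 1e-12))

Phi = np.vectorize(lambda t: 0.5 * (1.0 + erf(t / 2 ** 0.5)))  # normal CDF
z = (y.min() - mu) / sd
ei = sd * (z * Phi(z) + np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi))
print(Xs[np.argmax(ei)])                  # next point to evaluate
```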
Wednesday, April 14
Lecture 19. Parallelism. [Notes] [Slides Notebook] [Slides HTML] [Demo Notebook] [Demo HTML]
Background reading material: posted on the course website.
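A single-machine sketch of data-parallel gradient computation using Python's multiprocessing; shard count, worker count, and data are illustrative, and real distributed training replaces the pool with workers on separate machines.

```python
import numpy as np
from multiprocessing import Pool

# Shard the examples, compute partial gradients in worker processes,
# then average the equal-size shards' results.
n, d = 10_000, 10
rng = np.random.default_rng(0)
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d)
w = np.zeros(d)

def partial_grad(shard):
    Xs, ys = shard
    return Xs.T @ (Xs @ w - ys) / len(ys)

if __name__ == "__main__":
    shards = list(zip(np.array_split(X, 4), np.array_split(y, 4)))
    with Pool(4) as pool:
        grads = pool.map(partial_grad, shards)  # one shard per worker
    print(np.mean(grads, axis=0))               # averaged full-batch gradient
```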
Monday, April 19
Lecture 20. Memory locality and memory bandwidth. [Notes] [Slides Notebook] [Slides HTML]
Programming Assignment 5 Released. Background reading material: same as Parallelism (Lecture 19).
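A small unofficial demo of memory-bandwidth effects: summing the same number of elements contiguously versus with a large stride (absolute timings will vary by machine).

```python
import numpy as np
import time

# Identical arithmetic, different memory traffic: the strided sum touches
# a new cache line per element, while the contiguous sum streams 8 doubles
# per cache line.
x = np.ones(8_000_000)
m = len(x) // 16

t0 = time.perf_counter(); s1 = x[:m].sum();  t1 = time.perf_counter()
s2 = x[::16].sum();                          t2 = time.perf_counter()
print(f"contiguous: {t1 - t0:.4f}s   strided: {t2 - t1:.4f}s")
print(s1 == s2)   # same answer either way
```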
Wednesday, April 21
Lecture 21. Machine learning on GPUs; matrix multiply returns. [Notes] [Slides Notebook] [Slides HTML]
Background reading material: posted on the course website.
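Assuming a CUDA GPU and the CuPy package (an assumption, not a course requirement), moving a matrix multiply to the GPU mirrors the numpy code:

```python
import numpy as np
import cupy as cp   # GPU-backed drop-in for much of numpy

# Copy host -> device, multiply on the device, copy the result back.
A = np.random.default_rng(0).normal(size=(4096, 4096)).astype(np.float32)
A_gpu = cp.asarray(A)       # host -> device copy
B_gpu = A_gpu @ A_gpu       # runs as a GPU kernel
B = cp.asnumpy(B_gpu)       # device -> host copy
print(B.shape)
```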
Monday, April 26
Wellness Day. No classes; no lecture.
Wednesday, April 28
Lecture 22. Quantized, low-precision machine learning. [Notes] [Slides] [Demo Notebook] [Demo HTML]
Problem Set 5 Due. Problem Set 6 Released. Background reading material: posted on the course website.
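A minimal sketch of symmetric uniform 8-bit quantization of a weight vector (the scale is chosen from the max magnitude; all numbers synthetic):

```python
import numpy as np

# Map float weights to int8 in [-127, 127] with one scale factor, then
# dequantize and check the rounding error.
rng = np.random.default_rng(0)
w = rng.normal(size=1000).astype(np.float32)

scale = np.abs(w).max() / 127.0
q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)  # int8 storage
w_hat = q.astype(np.float32) * scale                         # dequantized
print(scale / 2, np.max(np.abs(w - w_hat)))   # max error is about scale / 2
```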
Monday, May 3
Lecture 23. Distributed learning and the parameter server. [Notes] [Slides]
Programming Assignment 6 Released. Background reading material: posted on the course website.
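A toy in-process simulation of the parameter-server pattern; real systems run workers asynchronously over a network, which this round-robin loop only imitates.

```python
import numpy as np

# Workers pull the current parameters, compute a gradient on their shard,
# and push it back; the "server" applies each pushed gradient.
rng = np.random.default_rng(0)
n, d = 1000, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d)
shards = list(zip(np.array_split(X, 4), np.array_split(y, 4)))

server_w = np.zeros(d)          # parameters held by the server
alpha = 0.1
for step in range(200):
    Xs, ys = shards[step % 4]   # round-robin stand-in for async workers
    w_local = server_w.copy()   # worker pulls parameters
    g = Xs.T @ (Xs @ w_local - ys) / len(ys)
    server_w -= alpha * g       # server applies the pushed gradient
print(0.5 * np.mean((X @ server_w - y) ** 2))
```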
Wednesday, May 5
Lecture 24. Deployment and low-latency inference. Deep neural network compression and pruning. [Notes] [Slides] [Demo Notebook] [Demo HTML]
Background reading material: posted on the course website.
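A hedged sketch of magnitude pruning: zero out the smallest-magnitude 90% of a weight matrix and keep the survivors in sparse form (the sparsity level is arbitrary).

```python
import numpy as np

# Keep only the largest 10% of weights by absolute value, then store the
# survivors as (indices, values).
rng = np.random.default_rng(0)
W = rng.normal(size=(256, 256)).astype(np.float32)

threshold = np.quantile(np.abs(W), 0.9)     # cutoff for the top 10%
mask = np.abs(W) >= threshold
W_pruned = W * mask                         # dense matrix with zeros

idx = np.nonzero(mask)                      # sparse storage of survivors
vals = W[idx]
print(mask.mean(), vals.size)               # ~0.1 density, ~6554 weights kept
```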
Monday, May 10
Lecture 25. Online Learning and Realtime Learning. [Notes] [Slides Notebook] [Slides HTML]
Background reading material: posted on the course website.
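The classic perceptron is a simple example of an online learner: it processes a stream one example at a time and updates only on mistakes (the stream here is synthetic and linearly separable).

```python
import numpy as np

# Mistake-driven online learning: no stored dataset, one example per step.
rng = np.random.default_rng(0)
d = 10
w_true = rng.normal(size=d)      # defines the (hidden) separating hyperplane
w = np.zeros(d)
mistakes = 0
for t in range(10_000):          # examples arrive as a stream
    x = rng.normal(size=d)
    label = np.sign(w_true @ x)
    if np.sign(w @ x) != label:  # update only when the prediction is wrong
        w += label * x
        mistakes += 1
print(mistakes)
```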
Wednesday, May 12
Lecture 26. Machine learning accelerators, and Course Summary. [Notes] [Slides Notebook] [Slides HTML]
Problem Set 6 Due. Background reading material: posted on the course website.
Friday, May 14
(No lecture.)
Tuesday, May 18
Final Exam. 9:30AM. Exam released on Gradescope and on Canvas. The exam may include any topics covered in the course.