Foundations of Modern Machine Learning
This course will cover fundamental topics in the theory of machine learning for modern use, including statistical, computational, and social considerations. We start with the basic statistical and computational toolset required for understanding machine learning. We then explore a number of modern perspectives on machine learning, including connections between game theory and machine learning, robustness of machine learning to adversaries, a beyond-the-worst-case-analysis perspective on learning, and ethics in machine learning. In addressing these, the course makes connections to statistics, algorithms, complexity theory, optimization, game theory, and more.
The list of topics is subject to change but will likely include the following:
Offline and online learning, including VC theory, online learning, mistake bounds, etc.
Computational complexity of learning.
Boosting.
Connections between game theory and learning theory.
Generative Adversarial Networks.
Beyond the worst-case analysis of machine learning.
Learning with Statistical queries.
Differential privacy.
Fairness.
There are no formal prerequisites for graduate students. Undergraduate students need to have taken CS 4820. Students need mathematical maturity and familiarity with writing and understanding theorems and proofs. Familiarity with probability theory and the basics of algorithms is required. No programming skills are required. Please see the instructor if you are unsure whether your background is suitable for the course.
There is no required textbook for this course. The following resources will be helpful for additional reading.
Name | Email | Hours | Location |
---|---|---|---|
Nika Haghtalab | nika@cs.cornell.edu | Fridays 10am-11am | Gates 315 |
Wilson Yoo | sy536@cornell.edu | Mondays 5:30pm-6:30pm | Rhodes 402 |
Abhishek Shetty | avs88@cornell.edu | Thursdays 4:30pm-5:30pm | Rhodes 412 |
You may also reach out for discussion/questions on our course Piazza page.
Date | Topic | Slides/Notes | Readings |
---|---|---|---|
01/21 | Introductions and Logistics | Slides | Chapter 1, UML |
01/23 | Consistency Model and Intro to PAC | Notes | Chapter 2, UML |
01/28 | Sample Complexity (finite hypothesis classes) | Notes | Chapter 2.2-3.1, UML |
01/30 | Combinatorial dimensions for learning | Notes | Chapter 6, UML |
02/04 | Sample Complexity (infinite hypothesis classes) | Notes | Chapter 6, UML |
02/06 | Sample complexity lower bounds | Notes | None |
02/11 | Finish PAC lower bound, intro to agnostic learning | Notes | Chapter 3.2, 4 |
02/13 | Agnostic Learning upper and lower bound | Notes | Chapter 28.2, UML |
02/18 | Introduction to hardness of learning | Notes | 6820 Notes on NP-Completeness |
02/20 | Representation Independent hardness | Notes | Daniely'16 |
02/25 | No Class -- Feb. Break | ||
02/27 | Mistake Bound and Weighted Majority | Notes | Chapter 21-21.2, UML |
03/03 | Learning from Experts | Notes | Blum and Mansour chapter |
03/05 | Online optimization I | Notes | |
03/10 | Follow The Regularized/Perturbed Leader | Notes | |
03/12 | Sequential Experimentation (Bobby Kleinberg) | Bobby's Slides, Notes | |
03/17-04/06 | Classes Suspended due to COVID-19 | ||
04/07 | FTPL Recap and Partial Information | Merged with 03/10 and 04/09 | |
04/09 | Multi-Armed Bandits | Notes | |
04/14 | Introduction to Boosting | Notes | |
04/16 | AdaBoost Error Analysis | Notes | |
04/21 | Boosting, Online Learning, and Games | Notes | |
04/23 | Oracle Efficient Online Learning I | Notes | |
04/28 | Generalized FTPL | Notes | |
04/30 | Learning in Presence of Noise | Notes | |
05/05 | Statistical Queries and Noise | Notes | |
05/07 | Differential Privacy | Notes | |
05/12 | Fairness in ML | Notes | |
Every student will be responsible for scribing 1-2 lectures, depending on the number of students enrolled in the class. Scribing is worth 5% of your final grade.
During the add/drop period, i.e., until and including Feb 4, no scribing is needed. A form will be posted to sign up students for scribing after that period.
A lecture can be jointly scribed by two students. We highly encourage non-Ph.D. students to team up with a CS Ph.D. student for scribing. Please use the template and style file posted on the Piazza resource page.
The first draft of the scribed notes is due 2 work days after the corresponding lecture, i.e., Tuesday lecture notes are due on Thursday and Thursday lecture notes are due on the following Monday. Within these two days, the student must also schedule a short (15-30 min) meeting with one of the TAs to go over the draft of the scribed notes and receive feedback. The final scribed notes should incorporate the TA's feedback and are due within 2 work days after the initial draft.
The project and homework schedules have been altered due to the COVID-19 class suspension. HW3 has been cancelled and its weight is spread across the other homeworks. The due dates for HW4, HW5, and the final project have been adjusted.
There will be five written homeworks, one project proposal, and one final project. Written homeworks will involve deriving and proving mathematical results and critically analyzing the material presented in class.
Please submit your assignments on CMS here. We highly encourage you to typeset your submissions. You can use the provided TeX source as a basis for your submission. Any part of the submitted work that is not readable by the TAs will be ignored.
Solutions will be released about 3-5 days after the deadline via Box.
Homework | File | Posted Dates | Due Dates |
---|---|---|---|
Homework 1 | Piazza Link, TeX Template, solutions on CMS | 01/28 | 02/06 |
Homework 2 | Piazza Link, TeX Template | 02/11 | 02/20 |
Project Proposal | Piazza Link | 03/03 | 03/12 (last day to submit: 04/06) |
Homework 3 | Cancelled due to COVID-19 | | |
Homework 4 | Piazza Link | 04/14 | 04/27 |
Homework 5 | Piazza Link | 04/30 | 05/12 |
Final Project | | | 05/12 (5 days free extension) |
Homework is due on CMS by the posted deadline.
Each homework is worth 10% of the final grade.
You have a budget of 5 late days (i.e., 24-hour periods after the time the assignment was due) throughout the semester for which there is no late penalty. Beyond this 5-day budget, assignments turned in late will be charged a 1 percentage point reduction of the cumulative final homework grade for each 24-hour period for which the assignment is late. No assignment will be accepted after the solution is made public, which is typically 3-5 days after the time it was due.
Regrade requests can be submitted on CMS up to 7 days after the solutions are released.
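As an illustration of the arithmetic only, and not an official grading tool, the late-day rule could be sketched as below. The function name `late_penalty`, the rounding of any started 24-hour period up to a full late day, and the pooling of late days across all assignments are assumptions made for this sketch; the course staff's actual accounting may differ.

```python
import math

FREE_LATE_DAYS = 5      # semester-wide budget stated in the policy
PENALTY_PER_DAY = 1.0   # percentage points per late day beyond the budget


def late_penalty(hours_late_per_assignment):
    """Return the total percentage-point reduction for the semester.

    Assumption: any started 24-hour period counts as one full late day,
    and late days are pooled across all assignments.
    """
    days_late = [math.ceil(h / 24) for h in hours_late_per_assignment if h > 0]
    charged_days = max(0, sum(days_late) - FREE_LATE_DAYS)
    return charged_days * PENALTY_PER_DAY


# Hypothetical semester: three late submissions (30, 50, and 60 hours late).
# That is 2 + 3 + 3 = 8 late days, of which 5 are free and 3 are charged.
print(late_penalty([30, 0, 50, 60]))  # 3.0
```

Under these assumptions, a student who stays within the 5-day budget sees no reduction at all.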
You can discuss the homework with other students, but all final submitted work must be done entirely on your own, without looking at any notes or pictures from the work you did during group discussions. Be sure to mention your collaborators' names and netIDs in your writeup.
The project proposal and final report are due on CMS by the posted deadline.
The project proposal is worth 5% and the final report is worth 20% of the final grade.
You can do the project in teams of at most 2 students.
The final report will be in the style of a conference submission, with an abstract, introduction, main body, and conclusions. The project needs to be typeset with LaTeX, using 11pt font and 1-inch margins on letter-size paper. The main body of the paper can be at most 8 pages, not counting references and appendices.
We will have a poster session as well.
The class does not have a midterm. The class has a take-home final exam that is to be done individually by the students with no outside help. Details of the exam will be announced at a later date.
Academic integrity is strictly enforced. You are allowed to discuss the homeworks with other students, but do not take any notes, pictures, recordings, etc. from your discussions. Your submission must be entirely your own work. You are allowed to consult online and textbook resources to achieve a deeper understanding of the topic, but do not look up answers to homework problems and exams. Cite all resources, including online sources, on your submissions. Acknowledge the names of those you have discussed the problems with on your submissions.
Be careful of what you share on Piazza. Do not share your answers or provide hints on Piazza. If your questions may reveal part of the answer to a posted problem, then post your questions privately.
The final exam is to be done individually by each student with no help from others. You may not give or receive any assistance from anyone during the exam. You may consult the resources linked on this page, but you may not use any other material during the exam.
Additional academic integrity resources: