CS3780/5780 - Introduction to Machine Learning

Fall 2024
Prof. Sarah Dean & Prof. Thorsten Joachims & Prof. John Thickstun
Cornell University, Department of Computer Science


Time and Place

First lecture: August 27, 2024
Time: Tuesday/Thursday, 1:25pm - 2:40pm
Room: Uris Hall G01

Mid-term Exam: October 10, 7:30pm
Final Exam: TBD

Link to Canvas Page

Course Description

Machine learning is concerned with the question of how to make computers learn from experience. The ability to learn is not only central to most aspects of intelligent behavior, but machine learning techniques have become key components of many software systems. For example, machine learning techniques are used to build search engines, to recommend movies, to understand natural language and images, and to build autonomous robots. This course will introduce the fundamental set of techniques and algorithms that constitute machine learning today. The course will not only discuss individual algorithms and methods, but also tie principles and approaches together from a theoretical perspective. In particular, the course will cover the following topics:

  • Supervised Batch Learning: model, decision theoretic foundation, model selection, model assessment, empirical risk minimization
  • Instance-based Learning: K-Nearest Neighbors, collaborative filtering
  • Decision Trees: TDIDT, attribute selection, pruning and overfitting, boosting, bagging
  • Linear Rules: Perceptron, logistic regression, linear regression, duality
  • Support Vector Machines: Optimal hyperplane, margin, kernels, stability
  • Deep Learning: multi-layer perceptrons, deep networks, stochastic gradient, transformers, large language models
  • Generative Models: generative vs. discriminative, naive Bayes, linear discriminant analysis
  • Structured Output Prediction: predicting sequences, hidden Markov models, rankings
  • Statistical Learning Theory: generalization error bounds, VC dimension
  • Unsupervised Learning: k-means clustering, hierarchical agglomerative clustering, principal component analysis

The prerequisites for the class are: probability theory (e.g., BTRY 3080, ECON 3130, MATH 4710, ENGRD 2700, CS 2800), linear algebra (e.g., MATH 2210, MATH 2940, MATH 2310), single-variable calculus (e.g., MATH 1910, MATH 1110), and programming proficiency (e.g., CS 2110).

Forbidden overlaps: ECE 3200 (previously ECE 4200), ORIE 3741 (previously ORIE 4741), STSCI 3740 (previously STSCI 4740).


Reference Material

The main textbook for the class is:

  • Shai Shalev-Shwartz, Shai Ben-David, "Understanding Machine Learning - From Theory to Algorithms", Cambridge University Press, 2014. (online)

For additional reading, here is a list of other sources:

  • Deisenroth, Faisal, Ong, "Mathematics for Machine Learning", Cambridge University Press, 2020. (online)
  • Tom Mitchell, "Machine Learning", McGraw Hill, 1997.
  • Kevin Murphy, "Machine Learning - a Probabilistic Perspective", MIT Press, 2012. (online via Cornell Library)
  • Cristianini, Shawe-Taylor, "Introduction to Support Vector Machines", Cambridge University Press, 2000. (online via Cornell Library)
  • Schoelkopf, Smola, "Learning with Kernels", MIT Press, 2001. (online)
  • Bishop, "Pattern Recognition and Machine Learning", Springer, 2006.
  • Ethem Alpaydin, "Introduction to Machine Learning", MIT Press, 2004.
  • Duda, Hart, Stork, "Pattern Classification", Wiley, 2000.
  • Hastie, Tibshirani, Friedman, "The Elements of Statistical Learning", Springer, 2001.
  • Imbens, Rubin, "Causal Inference for Statistics, Social, and Biomedical Sciences", Cambridge University Press, 2015. (online via Cornell Library)
  • Leeds Tutorial on HMMs (online)
  • Manning, Schuetze, "Foundations of Statistical Natural Language Processing", MIT Press, 1999. (online via Cornell Library)
  • Manning, Raghavan, Schuetze, "Introduction to Information Retrieval", Cambridge, 2008. (online)
  • Vapnik, "Statistical Learning Theory", Wiley, 1998.