CS3780/5780 - Introduction to Machine Learning

Fall 2024
Prof. Sarah Dean & Prof. Thorsten Joachims & Prof. John Thickstun
Cornell University, Department of Computer Science


Time and Place

First lecture: August 27, 2024
Time: Tuesday/Thursday, 1:25pm - 2:40pm
Room: Uris Hall G01

Mid-term Exam: October 10, 7:30pm
Final Exam: TBD

Link to Canvas Page

Course Description

Machine learning is concerned with the question of how to make computers learn from experience. The ability to learn is not only central to most aspects of intelligent behavior, but machine learning techniques have become key components of many software systems. For example, machine learning techniques are used to build search engines, to recommend movies, to understand natural language and images, and to build autonomous robots. This course will introduce the fundamental set of techniques and algorithms that constitute machine learning today. The course will not only discuss individual algorithms and methods, but also tie principles and approaches together from a theoretical perspective. In particular, the course will cover the following topics:

  • Supervised Batch Learning: model, decision theoretic foundation, model selection, model assessment, empirical risk minimization
  • Instance-based Learning: K-Nearest Neighbors, collaborative filtering
  • Decision Trees: TDIDT, attribute selection, pruning and overfitting, boosting, bagging
  • Linear Rules: Perceptron, logistic regression, linear regression, duality
  • Support Vector Machines: Optimal hyperplane, margin, kernels, stability
  • Deep Learning: multi-layer perceptrons, deep networks, stochastic gradient, transformers, large language models
  • Generative Models: generative vs. discriminative, naive Bayes, linear discriminant analysis
  • Structured Output Prediction: predicting sequences, hidden Markov models, rankings
  • Statistical Learning Theory: generalization error bounds, VC dimension
  • Unsupervised Learning: k-means clustering, hierarchical agglomerative clustering, principal component analysis

The prerequisites for the class are: probability theory (e.g., BTRY 3080, ECON 3130, MATH 4710, ENGRD 2700, CS 2800), linear algebra (e.g., MATH 2210, MATH 2940, MATH 2310), single-variable calculus (e.g., MATH 1910, MATH 1110), and programming proficiency (e.g., CS 2110).

Forbidden overlaps: ECE 3200 (previously ECE 4200), ORIE 3741 (previously ORIE 4741), STSCI 3740 (previously STSCI 4740).


Reference Material

The main textbook for the class is:

  • Shai Shalev-Shwartz, Shai Ben-David, "Understanding Machine Learning - From Theory to Algorithms", Cambridge University Press, 2014. (online)

For additional reading, here is a list of other sources:

  • Deisenroth, Faisal, Ong, "Mathematics for Machine Learning", Cambridge University Press, 2020. (online)
  • Tom Mitchell, "Machine Learning", McGraw Hill, 1997.
  • Kevin Murphy, "Machine Learning - a Probabilistic Perspective", MIT Press, 2012. (online via Cornell Library)
  • Cristianini, Shawe-Taylor, "Introduction to Support Vector Machines", Cambridge University Press, 2000. (online via Cornell Library)
  • Schoelkopf, Smola, "Learning with Kernels", MIT Press, 2001. (online)
  • Bishop, "Pattern Recognition and Machine Learning", Springer, 2006.
  • Ethem Alpaydin, "Introduction to Machine Learning", MIT Press, 2004.
  • Duda, Hart, Stork, "Pattern Classification", Wiley, 2000.
  • Hastie, Tibshirani, Friedman, "The Elements of Statistical Learning", Springer, 2001.
  • Imbens, Rubin, "Causal Inference for Statistics, Social, and Biomedical Sciences", Cambridge University Press, 2015. (online via Cornell Library)
  • Leeds Tutorial on HMMs (online)
  • Manning, Schuetze, "Foundations of Statistical Natural Language Processing", MIT Press, 1999. (online via Cornell Library)
  • Manning, Raghavan, Schuetze, "Introduction to Information Retrieval", Cambridge, 2008. (online)
  • Vapnik, "Statistical Learning Theory", Wiley, 1998.