Course Description
An introductory course in machine learning, with a focus on data modeling and related methods and learning algorithms for data sciences. Tentative topic list:
- Clustering, such as k-means, Gaussian mixture models, the expectation-maximization (EM) algorithm, link-based clustering. (We do not expect to cover hierarchical or spectral clustering.)
- Dimensionality reduction, such as principal component analysis (PCA) and the singular value decomposition (SVD), canonical correlation analysis (CCA), independent component analysis (ICA), compressed sensing, random projection, the information bottleneck. (We expect to cover some, but probably not all, of these topics.)
- Probabilistic-modeling topics such as graphical models, latent-variable models, inference (e.g., belief propagation), parameter learning.
This course can be taken independently of CS4780/5780 (Machine Learning for Intelligent Systems), or together with it in either order.
Prerequisites: probability theory (BTRY 3080, ECON 3130, MATH 4710, or strong performance in ENGRD 2700 or equivalent); linear algebra (MATH 2940 or equivalent); CS2110 or equivalent programming proficiency.
News (see also announcements on lecture handouts)
- Tuesday, August 22nd: The diagnostic assignment is out! Due on August 29th. Submit solutions here.
- Thursday, August 31st: Assignment 1 is out on CMS. It's due on September 7th at 11:59pm. Submit on CMS.