Schedule

This schedule should be considered tentative and subject to change, at least until it actually takes place!

Week	Date	Notes, Readings, and HW
1	Tue, Feb 09	Introduction NO, sec 2.1 Getting-to-know you survey on Canvas Meeting notes
	Thu, Feb 11	Optimization and linear algebra refresher ESL, sec 3.1-3.2 ALA, sec 3.2-3.2 Meeting notes
2	Tue, Feb 16	Regularized linear least squares ESL, sec 3.4 ALA, sec 3.5 NO, sec 3.3 Rank Revealing QR Factorizations, T.F. Chan, LAA, 1987. Meeting notes Julia notebook
	Thu, Feb 18	Sparse least squares and iterations Ch 2 of Templates for the Solution of Linear Systems, Barrett et al. LSQR: An algorithm for sparse linear equations and sparse least squares, Paige and Saunders, ACM TOMS, 1982. LSMR: An iterative algorithm for sparse least-squares problems, Fong and Saunders, SISC, 2011. Meeting notes
3	Tue, Feb 23	Stochastic gradients, scaling, and Newton Optimization Methods for Large-Scale Machine Learning, Bottou, Curtis, and Nocedal, SIREV Meeting notes
	Thu, Feb 25	Randomized numerical linear algebra Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions, Halko, Martinsson, and Tropp, SIREV, 2011. LSRN: A Parallel Iterative Solver for Strongly Over- or Under-Determined Systems, Meng, Saunders, Mahoney, SISC 2014 Sec 5, Lectures on Randomized NLA, Drineas and Mahoney
4	Tue, Mar 02	Latent factor models The Advanced Matrix Factorization Jungle, I. Carron A two-stage linear discriminant analysis via QR decomposition, Ye and Li ESL, 3.5.1 and 14.5 Meeting notes
	Thu, Mar 04	SVD and other low-rank decompositions On the relationships between SVD, KLT, and PCA, Gerbrands, Pattern Recognition, 1981 Trace optimization and eigenproblems in dimension reduction methods, Kokiopoulou, Chen, and Saad, NLAA 2010 On the compression of low rank matrices, Cheng, Gimbutas, Martinsson, and Rokhlin, SISC 2005 CUR matrix decompositions for improved data analysis, Mahoney and Drineas, PNAS 2009 Meeting notes
5	Tue, Mar 09	Wellness day
	Thu, Mar 11	Non-negative matrix factorization Nonnegative Matrix Factorization (Gillis), Chapter 1 The Whys and Hows of NMF, Gillis Learning the parts of objects by non-negative matrix factorization, Lee and Seung, Nature, 1999 Computing a nonnegative matrix factorization – provably, Arora, Ge, Kannan, and Moitra, SICOMP, 2016 When Does NMF Give a Correct Decomposition into Parts?, Donoho and Stodden, NeurIPS, 2003 Algorithms for NMF and NTFs: a unified view based on block coordinate descent framework, Kim, He, and Park, J. Glob. Optim, 2013 Meeting notes
6	Tue, Mar 16	Tensor basics, HOSVD, Tucker, and ALS Tensor Decompositions and Applications, Kolda and Bader, SIREV, 2009 Tensor Computations and Applications in Data Mining, Elden, slides from SIAM AM 2008 From Matrix to Tensor, Van Loan, slides from Cornell CS colloquium Tensors for Data Mining and Data Fusion, Papalexakis, Faloutsos, and Sidiropoulos, ACM TIST, 2016 Meeting notes
	Thu, Mar 18	CP decomposition and algorithms, CUR and tensor trains Tensor Decompositions and Applications, Kolda and Bader, SIREV, 2009 Tensor Decompositions: A Mathematical Tool for Data Analysis, Kolda, slides from JMM 2018 Epsilon-ALS for Orthogonal Low-Rank Tensor Approximation, Yang, SIMAX 2020 Low Multilinear Rank Approximations of Tensors, Che, Wei, and Yan, SIMAX 2020 Low-Rank Approximation in the Frobenius Norm by Column and Row Subset Selection, Cortinovis and Kressner, SIMAX 2020 Stochastic Gradients for Large-Scale Tensor Decomposition, Kolda and Hong, SIMODS 2020 Exercise notebook
7	Tue, Mar 23	Nonlinear dimensionality reduction A global geometric framework for nonlinear dimensionality reduction, Tenenbaum, de Silva, and Langford, Science 2000 Nonlinear dimensionality reduction by locally linear embedding, Roweis and Saul, Science 2000 Visualizing Data using t-SNE, van der Maaten and Hinton, JMLR 2008 Dimensionality Reduction: A Comparative Review, van der Maaten, Postma, and van den Herik, Tech report 2009 Dimension Reduction: A Guided Tour, Burges, FTML 2009 Global versus local methods in nonlinear dimensionality reduction, de Silva and Tenenbaum, NeurIPS 2003 Large-scale SVD and manifold learning, Talwalkar, Kumar, Mohri, and Rowley, JMLR 2013 Accelerating t-SNE using tree-based algorithms, van der Maaten, JMLR 2014
	Thu, Mar 25	Function approximation fundamentals Nonlinear Approximation, DeVore, Acta Numerica 1998 - long, but please do read sections 1 and 9 at least Approximation Theory and Approximation Practice, Trefethen, SIAM 2019 - a beautiful text, focused on polynomial and rational approximation in 1D; useful to skim, don’t consider it assigned reading A Course in Approximation Theory, Cheney and Light, AMS 2009 - again, not considered assigned reading (unless you want to do DNN approximation, in which case please read ch 23-25) Class notebook
8	Tue, Mar 30	Low-dim structure in function approximation Active Subspaces: Emerging Ideas for Dimension Reduction in Parameter Studies, Constantine, SIAM 2015 Active Subspace Methods in Theory and Practice: Applications to Kriging Surfaces, Constantine, Dow, and Wang, SISC 2014 Active Manifolds: a non-linear analogue to Active Subspaces, Bridges, Gruber, Felder, Verma, Hoff, ICML 2019 Constrained global optimization of functions with low effective dimensionality using multiple random embeddings, Cartis, Massart, Otemissov, arXiv 2020
	Thu, Apr 01	Low-dim structure in function approximation Approximation of high-dimensional parametric PDEs, Cohen, DeVore, Acta Numerica 2015 Model reduction via proper orthogonal decomposition, Pinnau, in Model Order Reduction: Theory, Research Aspects and Applications, Springer 2008 Nonlinear model reduction via discrete empirical interpolation, Chaturantabut, Sorensen, SISC 2010 Class notebook
9	Tue, Apr 06	Many interpretations of kernels ESL, sec 14.5.4 Kernel techniques: From machine learning to meshless methods, Schaback and Wendland, Acta Numerica 2006 Gaussian Processes for Machine Learning, Rasmussen and Williams, 2006 - read Ch 1 Kernel Methods in ML, Hofmann, Scholkopf, Smola, Annals of Statistics, 2008 Spline Models for Observational Data, Wahba, SIAM 1990 - read the foreword in particular
	Thu, Apr 08	Approaches to kernel selection Spline Models for Observational Data, Wahba, SIAM 1990 - Ch 4 Gaussian Processes for Machine Learning, Rasmussen and Williams, 2006 - read Ch 5 Automatic Model Construction with Gaussian Processes, Duvenaud, Cambridge PhD dissertation, 2014 - read Ch 2 Class notebook
10	Tue, Apr 13	Computing with kernels Spline Models for Observational Data, Wahba, SIAM 1990 - Ch 11 Gaussian Processes for Machine Learning, Rasmussen and Williams, 2006 - read Ch 8
	Thu, Apr 15	Scalable kernel methods Kernel Interpolation for Scalable Structured GPs, Wilson and Nickisch, ICML 2015 Scalable Log Determinants for GP Kernel Learning, Eriksson et al, NeurIPS 2017 Scaling GP Regression with Derivatives, Dong et al, NeurIPS 2018 Exact GPs on a Million Data Points, Wang et al, NeurIPS 2019 Fast estimation of tr(f(A)) via stochastic Lanczos quadrature, Ubaru, Chen, and Saad, SIMAX 2017 Meeting notes
11	Tue, Apr 20	Matrices associated with graphs Graph Algorithms in the Language of Linear Algebra, Kepner and Gilbert, eds, SIAM 2011 - Ch 1-2 Graph spectral techniques in computer sciences, Arsic et al, Appl Anal Discrete Math, 2012 Mining Large Graphs, Gleich and Mahoney, Handbook of modern statistical methods, 2016
	Thu, Apr 22	Function approximation on graphs Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions, Zhu, Ghahramani, and Lafferty, ICML 2003 Learning with Local and Global Consistency, Zhou, NeurIPS 2004 Empirical stationary correlations for semi-supervised learning on graphs, Xu, Dyer, and Owen, Ann Appl Stat, 2010 Using Local Spectral Methods to Robustify Graph-Based Learning Algorithms, Gleich and Mahoney, KDD 2015
12	Tue, Apr 27	Graph clustering and partitioning A tutorial on spectral clustering, von Luxburg, Statistics and Computing 2007 Communities in networks, Porter, Onnela, and Mucha, Notices of the AMS, 2009 Community detection in networks: A user guide, Fortunato and Hric, Physics Reports, 2016 Trace optimization and eigenproblems in dimension reduction methods, Kokiopoulou, Chen, and Saad, NLAA, 2011
	Thu, Apr 29	Centrality measures On the Limiting Behavior of Parameter-Dependent Network Centrality Measures, Benzi and Klymko, SIMAX 2015 PageRank Beyond the Web, Gleich, SIREV 2015 Class notebook Class notebook PDF
13	Tue, May 04	Learning linear system dynamics Dynamic mode decomposition of numerical and experimental data, Schmid, JFM 2010 System Identification, Ljung, 2007
	Thu, May 06	Learned dynamics and extrapolation Data-Driven Science and Engineering, Brunton and Kutz, 2019 - read Ch 7 Class notebook
14	Tue, May 11	Koopman theory and lifting Dynamic mode decomposition with control, Proctor, Brunton, and Kutz, SIAM Applied Dyn Sys 2016 An eigensystem realization algorithm for modal parameter identification and model reduction, Juang and Pappa, J. Guidance 1985 Hamiltonian Systems and Transformations in Hilbert Space, Koopman, PNAS 1931
	Thu, May 13	Learning nonlinear dynamics Discovering governing equations from data by sparse identification of nonlinear dynamical systems, Brunton, Proctor, Kutz, PNAS 2016 A Data-Driven Approximation of the Koopman Operator: Extending Dynamic Mode Decomposition, Williams, Kevrekidis, Rowley, J Nonlinear Science 2015 A Kernel-Based Method for Data-Driven Koopman Spectral Analysis, Williams, Rowley, Kevrekidis, J Comp Dynamics 2015 Class notebook