|
CS6784 is an advanced machine learning
course for students who have already taken CS 4780, CS 6780, or an
equivalent machine learning class, giving in-depth coverage of currently
active research areas in machine learning. The course will connect to
open research questions in machine learning, giving starting points for
future work. In particular, the course will focus on recent work in the
following areas:
- Structured Output Prediction: In conventional
classification and regression, the prediction is a single number. Many
application problems, however, require the prediction of complex
multi-part objects like trees (e.g. natural language parsing),
alignments (e.g. protein threading), rankings (e.g. search engines),
and paths (e.g. navigation assistants). How can one tractably model and
learn to make such complex predictions?
- Humans in the Loop: Much of the data used for machine
learning is gathered by observing human behavior (e.g. search engine
logs, purchase data, fraud detection). However, it is known that this
data is biased (e.g. users can click only on results that were
presented). How can one learn despite these biases? Or how can the
learning algorithm gather unbiased data by not being a passive
observer, but by actively interacting with the human?
- Understanding Archives: We are capturing and archiving more
and more data (e.g. email, blogs, photos). While search engines give
good microscopic access to individual data items, much work is needed
to get a more macroscopic view of the content of an archive. How can
machine learning help understand and summarize content, trends,
dependencies, and idea flows in such archives?
The content of the course will reflect a balance of learning methods,
algorithms, and their theoretical understanding, putting an emphasis on
approaches with practical relevance. |
|
Structured Output Prediction
- 02/09: I. Tsochantaridis, T. Hofmann, T. Joachims, and Y. Altun,
Support Vector Machine Learning for Interdependent and Structured
Output Spaces, ICML, 2004. (paper)
- 02/11: Chun-Nam John Yu, T. Joachims, R. Elber, J. Pillardy. Support
Vector Training of Protein Alignment Models. Journal of
Computational Biology, 15(7): 867-880, September 2008. (paper)
- 02/16: Ben Taskar, Carlos Guestrin and Daphne Koller. Max-Margin Markov
Networks. NIPS, 2004. (paper)
[Lu] (30 min)
- 02/18: D. Anguelov, B. Taskar, V. Chatalbashev, D. Koller, D. Gupta, G.
Heitz, A. Ng. Discriminative Learning of Markov Random Fields for
Segmentation of 3D Scan Data. CVPR, 2005. (paper)
[Sarah] (20 min)
- 02/18: J. Weston, O. Chapelle, A. Elisseeff, B. Schoelkopf and V.
Vapnik, Kernel Dependency Estimation, NIPS, 2002. (paper)
[Alex] (20 min)
- 02/23: Andrew McCallum, Dayne Freitag, and Fernando Pereira. Maximum
entropy Markov models for information extraction and segmentation. ICML, 2000. (paper)
[Ruogu] (20 min)
- 02/23: John Lafferty, Andrew McCallum, Fernando Pereira, Conditional
Random Fields: Probabilistic Models for Segmenting and Labeling
Sequence Data. ICML, 2001. (paper)
[Guozhang,Rohit] (20 min)
- 02/25: Ulf Brefeld, Tobias Scheffer, Semi-Supervised Learning for
Structured Output Variables, ICML, 2006. (paper)
[Jean-Baptiste] (20 min)
- 03/02: Linli Xu, Dana Wilkinson, Finnegan Southey, Dale Schuurmans.
Discriminative Unsupervised Learning of Structured Predictors. ICML,
2006. (paper)
[Kent,Mark] (20 min)
- 03/02: Brooke Cowan, Ivona Kucerova, and Michael Collins, A
Discriminative Model for Tree-to-Tree Translation, EMNLP 2006. (paper)
[Martin,Ruben] (20 min)
- 03/04: Yisong Yue, T. Finley, F. Radlinski, T. Joachims. A Support
Vector Method for Optimizing Average Precision. SIGIR, 2007. (paper)
- 03/04: Yisong Yue, T. Joachims. Predicting Diverse Subsets Using
Structural SVMs. ICML, 2008. (paper)
- 03/09: Matthew Blaschko, Christoph Lampert. Learning to Localize
Objects with Structured Output Regression. ECCV, 2008. (paper)
[Adarsh,Yimeng] (20 min)
- 03/09: Pieter Abbeel and Andrew Y. Ng., Apprenticeship Learning via
Inverse Reinforcement Learning, ICML, 2004. (paper)
[Vasu,Dane] (20 min)
- 03/11: Hal Daume, J. Langford, and Daniel Marcu, Search-based
Structured Prediction, Machine Learning, 2009. (paper)
[Michael,Sudip] (45 min)
- 03/16: Matthew Richardson, Pedro Domingos, Markov Logic Networks,
Machine Learning, Vol. 62, Number 1-2, pp. 107-136, 2006. (paper)
[Yue,Joel] (45 min)
Learning with Humans in the Loop
- 03/30: T. Joachims, L. Granka, Bing Pan, H. Hembrooke, F. Radlinski, G.
Gay. Evaluating the Accuracy of Implicit Feedback from Clicks and
Query Reformulations in Web Search, ACM Transactions on Information
Systems (TOIS), Vol. 25, No. 2 (April), 2007. (paper)
- 04/01: Ben Carterette, Rosie Jones. Evaluating Search Engines by
Modeling the Relationship Between Relevance and Clicks. NIPS, 2007.
(paper)
[CongCong] (20 min)
- 04/01: F. Radlinski, M. Kurup, T. Joachims. How Does Clickthrough Data
Reflect Retrieval Quality? CIKM, 2008. (paper)
- 04/06: O. Chapelle and Y. Zhang. A dynamic Bayesian network click model
for web search ranking. WWW Conference, 2009. (paper)
[Michaela,Vikram] (20 min)
- 04/06: E. Agichtein, E. Brill, S. T. Dumais and R. Ragno. Learning user
interaction models for predicting web search preferences. SIGIR,
2006. (paper)
[Christie,Jacob] (20 min)
- 04/08: D. Beeferman, A. Berger. Agglomerative clustering of search
engine query logs. KDD, 2000. (paper)
[Cangmin,Ronan] (20 min)
- 04/08: John Langford, Alexander Strehl, and Jennifer Wortman.
Exploration Scavenging, ICML, 2008. (paper)
[Nikos,Devin] (20 min)
- 04/13: Yisong Yue, J. Broder, R. Kleinberg, T. Joachims. The K-armed
Dueling Bandits Problem. [COLT, 2009], preprint of journal version. (paper)
Understanding Archives
- 04/20: S. Pohl, F. Radlinski, T. Joachims. Recommending Related Papers
Based on Digital Library Access Records. JCDL, 2007. (paper)
- 04/20: S. Knoll, A. Hoff, D. Fischer, S. Dumais and E. Cutrell (2009).
Viewing personal data over time. In Proceedings of CHI 2009 Workshop
on Interacting with Temporal Data, along with the references
therein. (paper)
[Jimmy] (20 min)
- 04/22: B. Shaparenko, T. Joachims, Information Genealogy:
Uncovering the Flow of Ideas in Non-Hyperlinked Document Databases,
KDD, 2007. (paper)
- 04/27: D. Blei, A. Ng, M. Jordan. Latent Dirichlet Allocation. Journal
of Machine Learning Research (JMLR), 3(5):993–1022, 2003. (paper)
[Zhaoyin,Ainur] (45 min)
- 04/29: J. Kleinberg. Bursty and Hierarchical Structure in Streams. KDD,
2002. (paper)
[Amir] (20 min)
|
|
- T. Mitchell, "Machine Learning", McGraw Hill, 1997.
- B. Schoelkopf, A. Smola, "Learning with Kernels", MIT Press, 2001.
(online)
- C. Bishop, "Pattern Recognition and Machine Learning", Springer,
2006.
- R. Duda, P. Hart, D. Stork, "Pattern Classification", Wiley, 2001.
- T. Hastie, R. Tibshirani, and J. Friedman, "The Elements of
Statistical Learning", Springer, 2001.
- N. Cristianini, J. Shawe-Taylor, "Introduction to Support Vector
Machines", Cambridge University Press, 2000. (online)
- C. Manning, H. Schuetze, "Foundations of Statistical Natural
Language Processing", MIT Press, 1999. (online)
- E. Alpaydin, "Introduction to Machine Learning", MIT Press, 2004.
|