Background References on Machine Learning
Ø
Author: Tom Mitchell
Title: "Machine Learning"
Publisher: McGraw Hill, 1997
Ø Author: Ethem Alpaydin
Title: "Introduction to Machine Learning"
Publisher: MIT Press, 2004
Ø
Authors: Christopher Manning and Hinrich Schutze
Title: "Foundations of Statistical NLP"
Publisher: MIT Press, 1999
Ø
Authors: R. Duda, P. Hart, D. Stork
Title: "Pattern Classification"
Publisher: Wiley, 2001
Ø
Authors: T. Hastie, R. Tibshirani, and J. Friedman
Title: "The Elements of Statistical Learning"
Publisher: Springer, 2001
Ø Authors: B. Schoelkopf and A. Smola
Title: "Learning with Kernels"
Publisher: MIT Press, 2001
Overviews and Tutorials
Ø
Author: Thomas G. Dietterich
Title: Machine Learning for Sequential Data: A Review
Book: Structural, Syntactic, and Statistical Pattern Recognition; Lecture Notes in Computer Science, Vol. 2396, T. Caelli (Ed.), pp. 15–30
Publisher: Springer-Verlag, 2002
Ø
Author: Ben Taskar
Title: Max-Margin Methods for NLP: Estimation,
Structure, and Applications
Proceedings: Tutorial at the Conference of
the Association for Computational Linguistics (ACL05),
Ann Arbor, MI, June 2005
Ø
Author: Thorsten Joachims
Title:
Learning to Predict Trees, Sequences, and
other Structured Outputs
Proceedings: Tutorial at NESCAI, 2006
Ø
Author: H. Wallach
Title:
Conditional Random Fields: An Introduction.
Predicting Sequences
Ø
***Authors: Andrew McCallum, Dayne
Freitag, and Fernando Pereira
Title:
Maximum entropy Markov models for information
extraction and segmentation
Proceedings: International Conference on Machine Learning (ICML 2000),
pages 591-598, Stanford, California, 2000
Ø Authors: Yasemin Altun, Thomas Hofmann,
Mark Johnson
Title:
Loss Functions
and Optimization Methods for Discriminative Learning of Label Sequences
Proceedings:
Empirical Methods in Natural Language Processing (EMNLP), 2003
Ø Authors: Yasemin Altun, Thomas Hofmann &
Mark Johnson
Title: Discriminative
Learning for Label Sequences via Boosting
Proceedings:
Advances in Neural Information Processing Systems (NIPS*15), 2003
Ø Authors: Joseph Bockhorst and Mark
Craven
Title:
Markov Networks for Detecting Overlapping
Elements in Sequence Data.
Proceedings: Advances in Neural
Information Processing Systems 17 (NIPS 2004), 2005.
Neural Networks
Ø ***Authors:
Y. LeCun, L.
Bottou, Y. Bengio, and P. Haffner
Title: Gradient-based learning applied to
document recognition.
Journal: Proceedings of the IEEE, 86(11):2278-2324, 1998
Ø
***Authors: LeCun and Huang
Title: Loss
Functions for Discriminative Training of Energy-Based Models
Proceedings: AI-Stats, 2005
Structural SVMs and Maximum Margin Markov
Networks
Ø
Authors: Yasemin Altun and Thomas
Hofmann
Title:
Large Margin Methods for Label Sequence
Learning.
Proceedings: 8th European Conference on
Speech Communication and Technology (EuroSpeech), 2003.
Ø Authors: Yasemin Altun, Ioannis Tsochantaridis and Thomas Hofmann
Title: Hidden Markov Support Vector Machines
Proceedings: 20th International
Conference on Machine Learning (ICML), 2003
Ø
Author: T. Joachims
Title:
Learning to Align Sequences: A Maximum-Margin
Approach
Technical Report, 2003
Ø
***Authors: Ben Taskar, Carlos Guestrin
and Daphne Koller
Title:
Max-Margin Markov Networks.
Proceedings: Advances in Neural Information Processing Systems 16 (NIPS 2003), 2004.
Ø
Author: Simon Lacoste-Julien
Title:
Combining SVM with graphical models for
supervised classification: an introduction to Max-Margin Markov Networks
CS281A Project Report, UC Berkeley, 2003
Ø Authors: I. Tsochantaridis, T.
Hofmann, T. Joachims, and Y. Altun
Title:
Support Vector Machine Learning for
Interdependent & Structured Output Spaces,
Proceedings: International Conference on Machine Learning (ICML), 2004.
Ø
*** Authors: Ben Taskar, Dan Klein,
Michael Collins, Daphne Koller, and Christopher Manning
Title:
Max-Margin Parsing.
Proceedings:
EMNLP 2004.
Ø
Author: Ben Taskar
Title:
Learning Structured Prediction Models: A Large
Margin Approach.
Publisher:
Stanford University, CA, December 2004.
Ø Authors: Peter Bartlett, Michael Collins,
Ben Taskar, and David McAllester
Title: Exponentiated
gradient algorithms for large-margin structured classification.
Proceedings:
NIPS04, 2005.
Ø Authors: T. Finley and T.
Joachims
Title:
Supervised Clustering with Support Vector
Machines,
Proceedings:
International Conference on
Machine Learning (ICML), 2005.
Ø
Authors: T. Joachims, T.
Galor, and R. Elber
Title:
Learning to Align Sequences: A Maximum-Margin
Approach,
Book: New
Algorithms for Macromolecular Simulation,
B. Leimkuhler, LNCS Vol. 49,
Publisher: Springer, 2005.
Ø
Authors: I. Tsochantaridis, T. Joachims, T. Hofmann, and Y. Altun
Title:
Large Margin Methods for Structured and
Interdependent Output Variables,
Journal: Journal of Machine Learning Research (JMLR),
6(Sep):1453-1484, 2005.
Ø
Authors: B. Taskar, S. Lacoste-Julien, and D. Klein
Title:
A Discriminative Matching Approach to Word
Alignment,
Proceedings: Empirical Methods in Natural Language Processing (EMNLP05), 2005.
Ø Authors: B. Taskar, V. Chatalbashev, D. Koller and C. Guestrin.
Title:
Learning Structured Prediction Models: A Large
Margin Approach.
Proceedings:
International Conference on Machine Learning (ICML05), 2005.
Ø
***Authors: D. Anguelov, B. Taskar, V. Chatalbashev, D. Koller, D. Gupta, G. Heitz, A. Ng.
Title:
Discriminative Learning of Markov Random
Fields for Segmentation of 3D Scan Data.
Proceedings: International Conference on
Computer Vision and Pattern Recognition (CVPR05),
San Diego, CA, June 2005.
Ø Authors: Ryan McDonald, Koby Crammer, and
Fernando Pereira
Title:
Online large-margin training of dependency
parsers.
Proceedings: Annual Meeting of the Association for
Computational Linguistics (ACL 2005), July 2005.
Ø Authors: B. Taskar, S. Lacoste-Julien, and M. Jordan
Title: Structured Prediction via the Extragradient
Method,
Proceedings:
Neural Information
Processing Systems Conference (NIPS05),
Vancouver, British Columbia, 2006.
Perceptron and Reranking
Ø Authors: Brian Roark, Murat Saraclar,
Michael Collins, and Mark Johnson
Title:
Discriminative language modeling with
conditional random fields and the perceptron algorithm.
Proceedings:
ACL
2004.
Ø
*** Author: Michael Collins
Title:
Discriminative Training Methods for Hidden
Markov Models: Theory and Experiments with Perceptron Algorithms.
Proceedings:
EMNLP 2002.
(This paper includes theorems and proofs which apply to the algorithms
in the ACL 2002 papers.)
Ø Authors: Michael Collins and Terry Koo
Title:
Discriminative Reranking for Natural Language
Parsing.
Journal: Computational Linguistics 31(1):25-69.
Ø Authors: Michael Collins and Nigel Duffy
Title:
New Ranking Algorithms for Parsing and
Tagging: Kernels over Discrete Structures, and the Voted Perceptron
Proceedings:
ACL 2002.
Ø
Author: Michael Collins
Title:
Ranking Algorithms for Named-Entity
Extraction: Boosting and the Voted Perceptron.
Proceedings: ACL 2002.
Ø Author: Michael Collins
Title: Discriminative Reranking for Natural Language
Parsing.
Proceedings:
ICML 2000.
Gaussian Process
Ø
Authors: Yasemin Altun, Thomas Hofmann
and Alexander J. Smola
Title:
Gaussian process classification for segmenting
and annotating sequences.
Proceedings: International Conference on Machine Learning (ICML
2004), 2004.
Conditional Random Fields
Ø
***Authors: John Lafferty, Andrew
McCallum, Fernando Pereira
Title:
Conditional Random Fields: Probabilistic
Models for Segmenting and Labeling Sequence Data.
Proceedings:
International Conference on Machine
Learning (ICML-2001), 2001.
Ø
Authors: Fei Sha and Fernando Pereira
Title:
Shallow Parsing with Conditional Random
Fields.
Proceedings: Human Language
Technology Conference and North American Chapter of the Association for
Computational Linguistics (HLT/NAACL-03), 2003.
Ø Author: Andrew McCallum
Title: Efficiently Inducing Features of Conditional
Random Fields.
Proceedings: Conference in
Uncertainty in Artificial Intelligence (UAI-2003), 2003.
Ø Authors: David Pinto, Andrew McCallum,
Xing Wei and W. Bruce Croft
Title:
Table Extraction Using Conditional Random
Fields.
Proceedings: Annual International
ACM SIGIR Conference on Research and Development in Information
Retrieval (SIGIR 2003), 2003.
Ø Authors: Andrew McCallum and Wei Li
Title:
Early Results for Named Entity Recognition
with Conditional Random Fields, Feature Induction and Web-Enhanced
Lexicons.
Proceedings: Conference on
Natural Language Learning (CoNLL), 2003.
Ø Authors: Andrew McCallum, Khashayar
Rohanimanesh and Charles Sutton
Title:
Dynamic Conditional Random Fields for Jointly
Labeling Multiple Sequences.
Proceedings: Workshop on Syntax, Semantics,
Statistics; 16th Annual Conference on Neural Information Processing
Systems (NIPS 2003), 2004.
Ø Authors: Kevin Murphy, Antonio Torralba
and William T. Freeman
Title:
Using the forest to see the trees: a graphical
model relating features, objects and scenes.
Proceedings: Advances
in Neural Information Processing Systems 16 (NIPS 2003), 2004.
Ø
Authors: Sanjiv Kumar and Martial Hebert
Title:
Discriminative Fields for Modeling Spatial
Dependencies in Natural Images.
Proceedings: Advances in Neural
Information Processing Systems 16 (NIPS 2003), 2004.
Ø
Author: Burr Settles
Title: Biomedical Named Entity Recognition Using
Conditional Random Fields and Rich Feature Sets.
Proceedings: International Joint Workshop on Natural Language
Processing in Biomedicine and its Applications (NLPBA), 2004.
Ø Authors: Charles Sutton, Khashayar
Rohanimanesh and Andrew McCallum
Title:
Dynamic Conditional Random Fields: Factorized
Probabilistic Models for Labeling and Segmenting Sequence Data.
Proceedings: International Conference on
Machine Learning (ICML 2004), 2004.
Ø Authors: John Lafferty, Xiaojin Zhu and
Yan Liu
Title:
Kernel conditional random fields:
representation and clique selection.
Proceedings: International Conference on Machine Learning (ICML
2004), 2004.
Ø Authors: Xuming He, Richard Zemel, and
Miguel Á. Carreira-Perpiñán
Title:
Multiscale conditional random fields for image
labelling.
Proceedings: IEEE Computer
Society Conference on Computer Vision and Pattern Recognition (CVPR
2004), 2004.
Ø Authors: Yasemin Altun, Alex
J. Smola, Thomas Hofmann
Title:
Exponential Families for Conditional Random
Fields.
Proceedings: Conference on
Uncertainty in Artificial Intelligence (UAI-2004), 2004.
Ø Authors: Michelle L. Gregory and Yasemin
Altun
Title:
Using Conditional Random Fields to Predict
Pitch Accents in Conversational Speech.
Proceedings: Annual Meeting of the Association for Computational
Linguistics (ACL 2004), 2004.
Ø
Authors: Brian Roark, Murat Saraclar,
Michael Collins and Mark Johnson
Title:
Discriminative Language Modeling with
Conditional Random Fields and the Perceptron Algorithm.
Proceedings:
Annual Meeting of the Association for
Computational Linguistics (ACL 2004), 2004.
Ø Authors: Trausti T. Kristjansson, Aron
Culotta, Paul Viola and Andrew McCallum
Title: Interactive Information Extraction with
Constrained Conditional Random Fields.
Proceedings: National Conference on Artificial Intelligence (AAAI
2004), 2004.
Ø Authors: Thomas G. Dietterich, Adam
Ashenfelter and Yaroslav Bulatov
Title:
Training Conditional Random Fields via
Gradient Tree Boosting.
Proceedings: International Conference on Machine Learning
(ICML 2004), 2004.
Ø
***Authors: Fuchun Peng and Andrew
McCallum
Title:
Accurate Information Extraction from Research
Papers using Conditional Random Fields.
Proceedings: Human Language Technology Conference and North American Chapter of the
Association for Computational Linguistics (HLT/NAACL-04), 2004.
Ø
Authors: Ariadna Quattoni, Michael
Collins and Trevor Darrell
Title: Conditional Random Fields for Object
Recognition.
Proceedings: Advances in Neural Information Processing
Systems 17 (NIPS 2004), 2005.
Ø
Authors: Yuan Qi, Martin Szummer and
Thomas P. Minka
Title: Bayesian Conditional Random Fields.
Proceedings: International Workshop on
Artificial Intelligence and Statistics (AISTATS 2005), 2005.
Ø
Authors: Aron Culotta, David Kulp and
Andrew McCallum
Title:
Gene Prediction with Conditional Random
Fields.
Technical Report UM-CS-2005-028. University of
Massachusetts, Amherst, 2005.
Ø
Authors: Yang Wang and Qiang Ji
Title:
A Dynamic Conditional Random Field Model for
Object Segmentation in Image Sequences.
Proceedings: IEEE
Computer Society Conference on Computer Vision and Pattern Recognition
(CVPR 2005), Volume 1, 2005.
Ø Authors: Ryan McDonald and Fernando
Pereira
Title:
Identifying gene and protein mentions in text
using conditional random fields
Journal: BMC Bioinformatics,
6(Suppl 1):S6, May 2005.
Ø Authors:
Yejin Choi, Claire Cardie, Ellen Riloff, and Siddharth Patwardhan
Title:
Identifying
Sources of Opinions with Conditional Random Fields and Extraction Patterns.
Proceedings:
HLT-EMNLP, 2005.
Ø Authors: S.V.N. Vishwanathan, Nicol N.
Schraudolph, Mark W. Schmidt, Kevin P. Murphy,
Title: Accelerated Training of
Conditional Random Fields with Stochastic Gradient Methods,
Proceedings: ICML, 2006.
Post-Processing to Enforce Dependencies
Ø
***Authors: V. Punyakanok, D. Roth, W. Yih, and D. Zimak
Title:
Semantic role labeling via integer linear
programming inference.
Proceedings: COLING-2004.
Ø
Authors: D. Roth and
W. Yih
Title: Integer Linear
Programming Inference for Conditional Random Fields.
Proceedings: International Conference on Machine Learning (ICML), 2005, pp. 737--744
Ø
Authors: V.
Punyakanok, D. Roth, W. Yih, and D. Zimak
Title:
Learning and Inference
over Constrained Output.
Proceedings: International Joint Conference on Artificial Intelligence (IJCAI), 2005, pp. 1124--1129
Learning a Search Heuristic
Ø
***Authors: Hal Daume and
Daniel Marcu
Title:
Learning as Search Optimization: Approximate
Large Margin Methods for Structured Prediction
Proceedings: International Conference on Machine Learning (ICML), 2005.
Ø
Authors: Hal Daume, John Langford, and
Daniel Marcu
Title: Search-based Structured
Prediction
Unpublished.
Kernel Dependency Estimation
Ø
***Authors: J. Weston, O. Chapelle, A. Elisseeff, B. Schoelkopf and V. Vapnik
Title: "Kernel Dependency Estimation"
Proceedings: NIPS 2002.
Ø
Authors: G. Bakir, J. Weston and B. Schölkopf
Title: "Learning to find Pre-Images"
Proceedings: NIPS, 2003
Ø
Authors: J. Kwok, I. Tsang
Title: "The Pre-Image Problem in Kernel Methods"
Journal: IEEE Transactions on Neural Networks, Vol. 15, No. 6, 1517-1525, 2004.
Ø
Authors: J. Weston, B. Schoelkopf and O. Bousquet
Title: "Joint Kernel Maps"
Proceedings: 8th International Work-Conference on Artificial Neural Networks, LNCS 3512, 176-191.
Ø
Authors: Corinna Cortes, Mehryar Mohri and Jason Weston
Title: "A General Regression Technique for Learning Transductions"
Proceedings: ICML, 2005
Ø
Author: G. Bakir
Title: "Extensions to Kernel Dependency Estimation"
Thesis, 2006.
Semi-Supervised Learning and Transduction
Ø
Authors: John Lafferty, Yan Liu and
Xiaojin Zhu.
Title:
Kernel Conditional Random Fields:
Representation, Clique Selection, and Semi-Supervised Learning.
Technical Report CMU-CS-04-115, Carnegie Mellon University, 2004.
Ø Authors: Terry Koo and Michael Collins
Title:
Hidden-Variable Models for Discriminative
Reranking.
Proceedings: EMNLP 2005.
Ø ***Authors: Ulf Brefeld, Tobias
Scheffer.
Title:
Semi-Supervised Learning for Structured Output Variables,
Proceedings:
ICML, 2006.
Ø
***Authors: Linli Xu, Dana Wilkinson,
Finnegan Southey, Dale Schuurmans
Title:
Discriminative Unsupervised Learning
of Structured Predictors
Proceedings:
ICML, 2006.
Reinforcement Learning
Ø
***Authors: Pieter Abbeel and Andrew Y. Ng.
Title:
Apprenticeship learning via
inverse reinforcement learning
Proceedings: International Conference on Machine
Learning, 2004.
Ø Authors: Nathan D. Ratliff, J. Andrew
Bagnell, Martin A. Zinkevich,
Title:
Maximum Margin Planning,
Proceedings:
ICML, 2006.
Loss Functions
Ø
Authors: Lijuan Cai, Thomas Hofmann,
Title:
Hierarchical Document Categorization with Support Vector Machines,
Proceedings:
ACM
13th Conference on Information and Knowledge Management, 2004.
Ø Author: T. Joachims
Title:
A Support Vector Method for Multivariate
Performance Measures,
Proceedings: International
Conference on Machine Learning (ICML), 2005.
Ranking
Ø Authors: William W. Cohen, Rob Schapire,
Yoram Singer
Title:
Learning to Order Things
Journal: J. Artif. Intell. Res. (JAIR) 10: 243-270 (1999). (Originally published as: William W. Cohen, Robert E. Schapire, Yoram Singer (1997): Learning to Order Things, NIPS 1997.)
Ø Authors: Ralf Herbrich, Thore Graepel,
Klaus Obermayer
Title:
Support Vector Learning for Ordinal
Regression
Proceedings: International Conference on Artificial Neural Networks, pp. 97--102
Ø Authors: Ralf Herbrich, Thore Graepel,
Klaus Obermayer
Title:
Large Margin Rank Boundaries for Ordinal
Regression
Book: Advances in Large Margin Classifiers, pp. 115--132, 2000.
Ø
Authors: Koby
Crammer and Yoram Singer,
Title:
PRanking with Ranking
Proceedings: Neural Information Processing Systems (NIPS), 2001.
Ø Author: T. Joachims
Title:
Optimizing Search Engines Using Clickthrough
Data,
Proceedings: ACM Conference on Knowledge
Discovery and Data Mining (KDD), ACM, 2002.
Ø Authors: Yoav Freund, Raj D. Iyer, Robert E. Schapire, Yoram Singer
Title:
An Efficient Boosting
Algorithm for Combining Preferences
Journal:
Journal of Machine Learning Research 4:
933-969 (2003).
Ø
***Authors: Marie desJardins, Eric
Eaton, Kiri L. Wagstaff,
Title:
Learning User Preferences for Sets of Objects,
Proceedings:
ICML, 2006.
Alignment and Translation
Ø
Author: T. Joachims
Title:
Learning to Align Sequences: A Maximum-Margin
Approach
Technical Report, 2003
Ø
Authors: T. Joachims,
T. Galor, and
R. Elber
Title:
Learning to Align Sequences: A Maximum-Margin
Approach
Book: New Algorithms for Macromolecular Simulation, B. Leimkuhler, LNCS
Vol. 49, Springer, 2005.
Ø Authors: Luke S. Zettlemoyer and
Michael Collins
Title:
Learning to Map Sentences to Logical Form:
Structured Classification with Probabilistic Categorial Grammars.
Proceedings: UAI 2005.
Ø
***Authors: Brooke Cowan, Ivona Kucerova,
and Michael Collins
Title: A Discriminative Model for Tree-to-Tree
Translation.
Proceedings: EMNLP 2006.
Ø
***Authors:
P. Liang, Alexandre Bouchard-Cote, D. Klein and B. Taskar
Title: An End-to-End Discriminative Approach to
Machine Translation,
Proceedings:
Association
for Computational Linguistics (ACL06),
Sydney, Australia, July 2006.
Ø
***Authors: P. Liang, B. Taskar, and D. Klein.
Title:
Alignment by Agreement,
Proceedings:
Human Language Technology
conference and North American chapter of the Association for Computational
Linguistics (HLT-NAACL06),
New York, June 2006.
Ø
***Authors: Andrew McCallum, Kedar Bellare, and Fernando
Pereira
Title:
A conditional random field for discriminatively-trained finite-state
string edit distance.
Proceedings: Conference on Uncertainty in Artificial Intelligence
(UAI 2005), July 2005.
Ø ***Authors: Chuong B. Do, Samuel S.
Gross, and Serafim Batzoglou
Title:
CONTRAlign: Discriminative Training for
Protein Sequence Alignment,
Proceedings:
RECOMB, 2006.
Ø Authors:
Chun-Nam Yu, T. Joachims and R. Elber
Title:
Training Protein Threading Models Using
Structural SVMs,
Proceedings:
ICML Workshop on Learning in Structured
Output Spaces, 2006.