Tuesdays and Thursdays 1:25-2:40, Stimson G01 (Zoom link available on request)
This course covers selected advanced topics in natural language processing (NLP) and/or
information retrieval, with a conscious attempt to avoid topics covered by other Cornell courses.
Hence:
Students seeking a general introduction to NLP should take CS 4740 ("Introduction
to Natural Language Processing") or CS 4744 ("Computational Linguistics")
instead.
Students interested in language purely as an application domain for machine learning
should consider other courses instead: significant portions of CS6740/IS6300
will be devoted to modeling language phenomena formally in ways that (to date)
are not machine-learning oriented.
If you're looking for something other than lecture content and have JavaScript enabled, click on the appropriate tab above.
The tabs may take a little time to come up.
Prerequisites, enrollment, related classes
Prerequisites All of the following: CS 2110
or equivalent programming experience;
a course in artificial intelligence or any relevant subfield (e.g., machine learning, NLP, information retrieval,
Cornell CS courses numbered 47xx or 67xx);
proficiency in using machine learning tools
(e.g., fluency with training a classifier and assessing its performance using cross-validation).
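As a rough self-check on that last item, something along these lines (a generic scikit-learn sketch of my own, using a stand-in dataset, not anything tied to the course) should feel routine:

    # Generic self-check sketch: training a classifier and assessing it with
    # cross-validation at roughly this level of comfort is what's expected.
    from sklearn.datasets import load_digits
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    X, y = load_digits(return_X_y=True)        # stand-in dataset, not course data
    clf = LogisticRegression(max_iter=1000)
    scores = cross_val_score(clf, X, y, cv=5)  # 5-fold cross-validation
    print(f"mean accuracy: {scores.mean():.3f} (std {scores.std():.3f})")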
Enrollment Enrollment is open on Student Center to PhD and MS students (although those who do not meet the prerequisites should not take this class).
Other students interested in gaining permission to enroll: please contact
Prof. Lee after lecture
on Tuesday, September 3rd. (Before that date, I won't have enough information
on the number of students to be able to make enrollment allowances.)
Try to attend the first two lectures if you can, but if you are shopping other
courses meeting at the same time, it's OK to miss one or both of the first
two CS6740 lecture times. You will be responsible for making up the material
on your own, but some form of notes or slides will be posted.
Auditing is an option for those permitted to enroll: the only requirement
is to sign up on Student Center for the "Audit" option as Grade Basis, and
there is no coursework or attendance requirement to earn the audit credit.
Students already actively engaged in thesis research should thus choose the
"Audit" grade basis.
Remote attendance is possible; please contact me for a Zoom link
(contact information listed on the "Administrative info" tab).
Formal models of language, parsing complexity: Tree-adjoining grammar, and perhaps also combinatory categorial grammar
Joshi, Aravind K., Leon S. Levy, and Masako Takahashi. Tree adjunct grammar. Journal of Computer and System Sciences 10(1):136–163 (1975).
Joshi, Aravind K., K. Vijay-Shanker, and David Weir. 1991. The convergence of mildly context-sensitive grammar formalisms. In: Wasow, T., Sells, P. and Shieber, S. (eds.), Foundational Issues in Natural Language Processing. [Technical report version]
CMS page: https://cmsx.cs.cornell.edu.
Site for submitting assignments, unless otherwise noted.
You may find this graphically-oriented guide to common operations useful: see how to replace a prior submission (point 1), how to tell if CMS successfully received your files (point 2), how to form a group (point 4).
Office hours and contact info
See Prof. Lee's homepage and scroll to the section on "Contact and availability info".
Coursework
One administrative-info fill-in assignment
Roughly 6 assignments (about 1 per topic) probably involving some implementation
Potentially some in-class presentations/discussion, possibly but not necessarily in conjunction with the
assignments
Possibly an (in-class or take-home) preliminary exam, depending on whether there seems to be a need for such assessment
Take-home final exam (see lecture schedule for due date)
Resources
Cornell's Passkey
for your web browser: "If you find yourself on a web page that has access
restrictions, click on the bookmarklet icon and you will be redirected to
the Cornell Web log-in screen to check for your valid Cornell affiliation.
You will be automatically led to the page you were trying to read, this
time recognized for your right to gain access to the library's licensed
resources."
#2 Sep 3: Motivation for Tree Adjoining Grammars: introduction to sentential structure
Assignments/announcements
Those who wish to enroll but need a PIN: please email Prof. Lee with your name and netID by noon on Thursday if you can (by Tuesday evening is preferable)
#3 Sep 5: CFGs and long-distance dependencies; tree substitution grammars as a way to lexicalize CFGs
Assignments/announcements
Everyone (including auditors and those not yet enrolled): please complete the CS 6740 "administrative matters" quiz on CMS, https://cmsx.cs.cornell.edu, deadline Mon Sept 9, 11:59pm. Enrollment permissions will be decided in
part by the information furnished as quiz answers.
So, being on CMS does not mean you have been enrolled in the class!
If you don't see "CS 6740" when you log in to CMS or can't log in, please
email Prof. Lee with your name and netID.
Section 8 "Linguistic relevance" of Aravind K. Joshi and Yves Schabes, 1996, "Tree adjoining grammars", which is chapter 2 of Handbook of Formal Languages: Vol 3, Beyond Words, ed. G. Rozenberg and A. Salomaa. (link requires logging in with your Cornell NetID)
Tentative sketch of first "real" assignment, due sometime between Sep 19 and 24: spend X hours (where I will specify X) implementing a representation of tree-adjoining grammars, allowing one to specify a TAG (that is, you should not hard-code a specific TAG), and, given a partial derivation tree (which you'll need to represent) and an elementary tree, determine whether the elementary tree can legally be substituted or adjoined into the corresponding derived tree. Write a description of your ideas and any challenges you faced. Be prepared to discuss your efforts in class.
You may not arrive at a really functional implementation; I'm just looking for a good-faith effort.
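To make the flavor concrete, here is one possible (purely illustrative) Python starting point for representing elementary trees and checking where substitution or adjunction is allowed in a derived tree. The names and design are not required, derivation-tree bookkeeping and adjoining constraints are omitted, and you are free to structure things entirely differently:

    # Illustrative sketch only, not a required interface: nodes carry a label
    # plus flags for substitution sites (NP↓) and foot nodes (NP*); an
    # elementary tree is either initial or auxiliary.
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Node:
        label: str                      # e.g., "S", "NP", "VP", or a lexical anchor
        children: List["Node"] = field(default_factory=list)
        subst_site: bool = False        # frontier node marked for substitution
        foot: bool = False              # foot node of an auxiliary tree

    @dataclass
    class ElementaryTree:
        name: str
        root: Node
        auxiliary: bool = False         # True => has a foot node labeled like the root

    def frontier(node: Node) -> List[Node]:
        """Leaves of a (derived) tree, left to right."""
        if not node.children:
            return [node]
        leaves: List[Node] = []
        for child in node.children:
            leaves.extend(frontier(child))
        return leaves

    def interior(node: Node) -> List[Node]:
        """Non-leaf nodes of a (derived) tree."""
        nodes = [node] if node.children else []
        for child in node.children:
            nodes.extend(interior(child))
        return nodes

    def can_substitute(tree: ElementaryTree, derived: Node) -> bool:
        """An initial tree may substitute at a frontier substitution site
        whose label matches the tree's root label."""
        if tree.auxiliary:
            return False
        return any(n.subst_site and n.label == tree.root.label for n in frontier(derived))

    def can_adjoin(tree: ElementaryTree, derived: Node) -> bool:
        """An auxiliary tree may adjoin at an interior node whose label matches
        its root/foot label (adjoining constraints are ignored here)."""
        if not tree.auxiliary:
            return False
        return any(n.label == tree.root.label for n in interior(derived))

A fuller version would also maintain the derivation tree itself, i.e., a record of which elementary tree was attached, by which operation, at which node of which other tree.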
#4 Sep 10: Tree grammars: tree substitution grammars and tree adjoining grammars
Assignments/announcements
Assignment 1 is due September 19 12:00 P.M. (noon), but you can continue resubmitting on CMS
(Lillian will set up CMS by the night of September 11th)
until noon Monday the 23rd. You should spend a minimum of 10 hours and a maximum of
13 hours coding by the September 19 deadline; you're not obligated to do any more
coding after that. Along with a zip file of your code,
submit an informal writeup (PDF) describing your design decisions.
We'll discuss our experiences together during the Sep 24th lecture.
Please work by yourselves until the September 19th deadline; after that I'll open
up some sort of discussion site to allow for collaboration.
Geoff Pullum, 1986. Footloose and context-free. Natural Language and Linguistic Theory 4, pp. 283--289. Reprinted in The Great Eskimo Vocabulary Hoax, U. of Chicago Press, 1991.
Stuart Shieber, 1985. Evidence against the context-freeness of natural language. Linguistics and Philosophy, 8:333-343.
#6 Sep 17: More linguistic modeling with TAGs: modeling feature constraints
Anne Abeillé and Yves Schabes, 1989. Parsing idioms in lexicalized TAGs. Fourth Conference of the European Chapter of the Association for Computational Linguistics (EACL '89).
Assignment 1 addendum: post to CampusWire (join code given in class) some short description
of and/or motivation for your test cases for assignment 1.
Optional but encouraged: mention any questions you have for me or your fellow students.
Reading for next lecture or two: Mark Steedman (draft of November 1, 1996), A Very Short Introduction to CCG.
Also, skim Steedman's 2018 lifetime achievement award address, The Lost Combinator,
printed in Computational Linguistics 44(4).
You may find section 6, "CCG in the age of deep learning", an interesting reflection.
I am tentatively planning a small CCG-based assignment to be released either next Tuesday or next Thursday (on which, recall, there is no lecture). You would have a week to complete it once it is released.
Eigner, Fabienne Sophie. 2007. Section 2.1 of Combinatory Categorial Grammar contains a description of a CCG for
the copy language. [slides (in German)]
#13 Oct 10: Concluding discussion on syntactic (and a bit of semantic) modeling
Assignments/announcements
A2 due tomorrow at noon.
Reading for next time (light stuff for Fall Break ...):
Alon Halevy, Fernando Pereira, Peter Norvig. 2009.
The unreasonable effectiveness of data.
IEEE Intelligent Systems 24:8--12. (The usage of the word "deep"
reads ironically these days.)
For the "hit the nail on the head" origins: Dmitrij Dobrovol’skij and Elisabeth Piirainen. 2010. Idioms: Motivation and etymology. Yearbook of Phraseology 1(1):73-96.
Oct 15: No class — Fall Break
#14 Oct 17: The dataset landscape:
today and how we got here.
Tentative plan for third assignment: try out "inoculating by fine-tuning",
perhaps in a domain of your own choice. Time span: a week or a week and a half after the assignment is formalized (probably next Thursday)
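To make the plan concrete, here is a rough sketch (mine, not the eventual assignment spec) of the inoculation-by-fine-tuning protocol: train on the original data, continue training on a small sample of challenge-set examples, and compare accuracy on the original vs. challenge test sets. A linear classifier stands in for whatever model you would actually fine-tune, and all array names are placeholders:

    # Hypothetical sketch of the inoculation-by-fine-tuning protocol; names and
    # setup are placeholders.  All X_*/y_* arguments are assumed to be NumPy
    # feature matrices / label vectors; partial_fit stands in for fine-tuning.
    import numpy as np
    from sklearn.linear_model import SGDClassifier
    from sklearn.metrics import accuracy_score

    def inoculate(X_orig_train, y_orig_train, X_chal_train, y_chal_train,
                  X_orig_test, y_orig_test, X_chal_test, y_chal_test,
                  n_inoculation_examples=50, n_passes=5, seed=0):
        rng = np.random.RandomState(seed)
        classes = np.unique(np.concatenate([y_orig_train, y_chal_train]))

        clf = SGDClassifier(random_state=seed)
        clf.partial_fit(X_orig_train, y_orig_train, classes=classes)  # train on original data

        # "Inoculate": a few passes over a small sample of challenge examples.
        idx = rng.choice(len(X_chal_train), size=n_inoculation_examples, replace=False)
        for _ in range(n_passes):
            clf.partial_fit(X_chal_train[idx], y_chal_train[idx])

        # The interesting comparison: does challenge performance improve, and
        # does original-test performance hold up?
        return {"original_test_acc": accuracy_score(y_orig_test, clf.predict(X_orig_test)),
                "challenge_test_acc": accuracy_score(y_chal_test, clf.predict(X_chal_test))}

In the actual protocol you would typically sweep the number of inoculation examples and track both accuracies as a function of that number.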
What is the purpose of data? One reason is evaluation, as the Penn Treebank paper said. I mention the PTB because in the "old days", it was in some sense the canonical dataset. Here's an example results table (EMNLP 2011):
Recall the landscape: new data introduced, then "solved" or "broken".
There are now even papers about algorithms that are meant to
withstand
certain kinds of "breaks" in datasets, e.g.,
Robin Jia, Aditi Raghunathan, Kerem Göksel, Percy Liang, 2019. Certified robustness to adversarial word substitutions. EMNLP.
Demos for two NLP tasks (we can try to break the algorithm in class)
Sentiment analysis demo at AllenNLP. This task is considered in the "Build it Break it" data, and so will probably be an option for A3, since there are "regular" training data and "challenge" test sets.
Textual Entailment
demo at AllenNLP. A simple version of this task is considered in the "breaking" Levy et al. 2015 paper.
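For intuition about the kind of "break" at issue, here is a toy sketch of mine (the naive substitution attack itself, not the certified-defense method from the Jia et al. paper): enumerate near-synonym substitutions of a sentence and report any variant on which a classifier's prediction flips.

    # Toy sketch of a word-substitution "break"; `predict` is assumed to be any
    # function mapping a sentence string to a label.  The synonym table is a
    # hand-made placeholder, not from any of the papers above.
    from itertools import product
    from typing import Callable, Dict, List

    SYNONYMS: Dict[str, List[str]] = {
        "great": ["fine", "decent"],
        "movie": ["film", "picture"],
        "terrible": ["awful", "dreadful"],
    }

    def variants(sentence: str) -> List[str]:
        """All sentences obtainable by optionally swapping words for listed near-synonyms."""
        options = [[w] + SYNONYMS.get(w.lower(), []) for w in sentence.split()]
        return [" ".join(choice) for choice in product(*options)]

    def find_breaks(sentence: str, predict: Callable[[str], str]) -> List[str]:
        """Return substituted variants on which the classifier's prediction flips."""
        original = predict(sentence)
        return [v for v in variants(sentence) if v != sentence and predict(v) != original]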
Shai Ben-David, John Blitzer, Koby Crammer, Alex Kulesza, Fernando Pereira,
Jennifer Wortman Vaughan. 2010. A theory of learning from different domains. Machine Learning 79(1-2):151-175.
If you cannot attend the colloquium, please see the video, which can
be accessed via NetID login here, and which should be posted within a few days after the talk.
Speaker abstract, and bio: It is common to hear that certain natural language processing (NLP) tasks have been "solved". These claims are often misconstrued as being about general human capabilities (e.g., to answer questions, to reason with language), but they are always actually about how systems performed on narrowly defined evaluations. Recently, adversarial testing methods have begun to expose just how narrow many of these successes are. This is extremely productive, but we should insist that these evaluations be *fair*. Has the model been shown data sufficient to support the kind of generalization we are asking of it? Unless we can say "yes" with complete certainty, we can't be sure whether a failed evaluation traces to a model limitation or a data limitation that no model could overcome. In this talk, I will present a formally precise, widely applicable notion of fairness in this sense. I will then apply these ideas to natural language inference by constructing challenging but provably fair artificial datasets and showing that standard neural models fail to generalize in the required ways; only task-specific models are able to achieve high performance, and even these models do not solve the task perfectly. I'll close with discussion of what properties I suspect general-purpose architectures will need to have to truly solve deep semantic tasks.
(joint work with Atticus Geiger, Stanford Linguistics) Bio: Christopher Potts is Professor of Linguistics and, by courtesy, of Computer Science, at Stanford, and Director of the Center for the Study of Language and Information (CSLI) at Stanford. In his research, he develops computational models of linguistic reasoning, emotional expression, and dialogue. He is the author of the 2005 book The Logic of Conventional Implicatures as well as numerous scholarly papers in linguistics and natural language processing.
#23 Nov 19: Evaluation by/of textual inference (aka entailment)
Zaenen, Annie, Lauri Karttunen, and Richard Crouch. 2005. Local textual inference: Can it be defined or circumscribed? In Proceedings of the ACL Workshop on Empirical Modeling of Semantic Equivalence and Entailment, 31–36. Ann Arbor, Michigan: Association for Computational Linguistics.
#24 Nov 21: No class — LL traveling to NDS Symposium in NY
#25 Nov 26: Explicit semantic representations; intro to AMR
Assignments/announcements
Final exam: take-home, to be worked on individually, released Tuesday Dec 10th (watch your mail).
No class Tuesday Dec 10th (ACL deadline recovery)
Tentative plan for assignment A4: light; released sometime Tuesday Dec 3, due Dec 10th.
Banarescu, Laura, Claire Bonial, Shu Cai, Madalina Georgescu, Kira Griffitt, Ulf Hermjakob, Kevin Knight, Philipp Koehn, Martha Palmer, and Nathan Schneider. 2013. Abstract Meaning Representation for sembanking. In Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, 178–186. Sofia, Bulgaria: Association for Computational Linguistics.
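If AMR is new to you, the commonly used "The boy wants to go" example gives the flavor; here it is in PENMAN notation, wrapped in a Python string purely for display (no AMR-specific library assumed):

    # A commonly used introductory AMR, "The boy wants to go", in PENMAN notation
    # (stored as a plain string here).  Note the reentrancy: variable b is the
    # ARG0 of both want-01 and go-01.
    AMR_BOY_WANTS_TO_GO = """\
    (w / want-01
       :ARG0 (b / boy)
       :ARG1 (g / go-01
                :ARG0 b))"""
    print(AMR_BOY_WANTS_TO_GO)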
#27 Dec 5: AMR parsing: Zhang, Ma, Duh and van Durme, EMNLP 2019
Assignments/announcements
Final take-home due date of Thursday Dec 19, 4:30pm.
A4 due time moved to Tuesday, Dec 10, 11:59 PM (extra 12 hours), with the usual lecture time on the 10th converted to optional office hours, in the usual classroom.
Groschwitz, Jonas, Matthias Lindemann, Meaghan Fowlie, Mark Johnson, and Alexander Koller. 2018. AMR dependency parsing with a typed semantic algebra. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 1831–1841. Melbourne, Australia: Association for Computational Linguistics.