Imitating Experts with Privileged Information (via Zoom)
Abstract: Imitation learning is a flexible paradigm for programming robots implicitly through demonstrations, interventions, or preferences. However, the expert often has access to context that is hidden from the learner. For instance, in self-driving, the human expert has richer context about the scene than the car's limited perception system. While a common solution is to add a history of past states and actions to the model, practitioners have often noted that off-policy methods lead to a “latching effect” where the learner simply repeats the past action. On the other hand, on-policy approaches that leverage interaction with the demonstrator or the environment are able to match expert performance in the limit. We study this setting and show a sharp phase transition in the performance of off-policy approaches, in contrast to the uniformly good performance of on-policy approaches. We believe that this strong separation helps explain the variable performance of behavior cloning, even in regimes with large data and powerful model classes, and the consistent success of on-policy methods across domains like search, self-driving, and mobile manipulation.
Bio: Sanjiban Choudhury is an Assistant Professor in the Department of Computer Science at Cornell University and a Research Scientist at Aurora Innovation. His research goal is to enable robots to work seamlessly alongside human partners in the wild. To this end, his work focuses on imitation learning, decision making, and human-robot interaction. He received his Ph.D. in Robotics from Carnegie Mellon University and was a postdoctoral fellow at the University of Washington. His research has received best paper awards at ICAPS 2019, finalist for IJRR 2018, and AHS 2014, and winner of the 2018 Howard Hughes award. He is a Siebel Scholar, class of 2013.