Small-loss bounds for online learning with partial information
Abstract: I will discuss the problem of online learning with graph-based feedback in the adversarial (non-stochastic) setting. At each round, a decision maker selects an action from a finite set of alternatives and receives feedback according to a combinatorial graph-based feedback model introduced by Mannor and Shamir. This model captures important partial-information paradigms, such as bandits, online routing, and contextual bandits, as special cases. An important challenge in such partial-feedback settings is that, to keep up with the non-stochasticity of the environment, the learner needs to explore all actions frequently, including suboptimal ones. I will tackle this hurdle by providing a general black-box reduction that attains effective regret guarantees while avoiding this over-exploration.