Abstract:
At the dawn of the computer age in the 1960s, Bellman and his collaborators found it beneficial to use what is now called linear function approximation to address certain multistage stochastic planning problems. Their approach was straightforward: use linear value function approximation to avoid state-space discretization, thereby maintaining polynomial-time computation while also controlling accuracy. However, an answer to the question of when and how this approach is feasible has eluded researchers for over 50 years, even as the prospect of using function approximation to overcome the curse of dimensionality has continued to fuel much of the excitement around reinforcement learning. Early results focused on connecting the approximation spaces with the structure of the underlying problem, and some indicated that it might not be enough for the target function (such as the optimal value function) to simply lie within this space. As it turns out, the emerging picture of when function approximation can help in multistage problems is intricate. In this talk, we will explore recent results, primarily from my group, that contribute to this complex understanding. I will conclude with an outlook on current research directions.
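To make the idea concrete, here is a minimal sketch (not from the talk) of approximate value iteration with linear features on a randomly generated finite MDP: each Bellman backup is projected onto the span of a small feature matrix by least squares, so the work scales with the number of features rather than the number of states. All names (`P`, `R`, `Phi`, `gamma`, the sizes) and the random MDP are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch of linear value function approximation on a random MDP.
# Everything here (the MDP, the features, the sizes) is a made-up assumption.

rng = np.random.default_rng(0)

n_states, n_actions, n_features = 20, 2, 4
gamma = 0.9  # discount factor

# Random MDP: P[a, s, s'] transition probabilities, R[a, s] expected rewards.
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))
R = rng.uniform(0.0, 1.0, size=(n_actions, n_states))

# Feature map Phi: each state is described by n_features numbers, so the
# value function is represented by n_features weights, not n_states values.
Phi = rng.normal(size=(n_states, n_features))

w = np.zeros(n_features)
for _ in range(200):
    V = Phi @ w                         # current value estimates, one per state
    Q = R + gamma * (P @ V)             # one-step Bellman lookahead, per action
    target = Q.max(axis=0)              # Bellman optimality backup
    # Project the backup onto the span of the features via least squares.
    w, *_ = np.linalg.lstsq(Phi, target, rcond=None)

print("fitted weights:", w)
print("approximate optimal values:", Phi @ w)
```

Note that this projected iteration is not guaranteed to converge, and even when it does, the fixed point need not be a good approximation of the optimal value function: characterizing when such schemes succeed is exactly the kind of question the abstract alludes to.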
Bio:
Csaba Szepesvári (Ph.D. '99) leads DeepMind's Foundations team, in addition to serving as a Professor of Computing Science at the University of Alberta and as a Principal Investigator of the Alberta Machine Intelligence Institute. Prof. Szepesvári is best known for his theoretical work on reinforcement learning and as co-inventor of the Monte-Carlo tree search method UCT, which inspired much subsequent work in the AI community. He has published three books: one on control theory, one on reinforcement learning, and the most recent on the theory of bandit algorithms. He serves on the editorial boards of the Journal of Machine Learning Research, Mathematics of Operations Research, and Foundations and Trends in Machine Learning. He co-chaired ICML in 2022 and earlier co-chaired COLT and ALT. Since 2021, he has enthusiastically co-organized and co-hosted the weekly "Reinforcement Learning Theory" virtual seminar, aimed at anyone who wants to learn about the latest advances in this fast-moving field.