- About
- Events
- Calendar
- Graduation Information
- Cornell Learning Machines Seminar
- Student Colloquium
- BOOM
- Spring 2025 Colloquium
- Conway-Walker Lecture Series
- Salton 2024 Lecture Series
- Seminars / Lectures
- Big Red Hacks
- Cornell University / Cornell Tech - High School Programming Workshop and Contest 2025
- Game Design Initiative
- CSMore: The Rising Sophomore Summer Program in Computer Science
- Explore CS Research
- ACSU Research Night
- Cornell Junior Theorists' Workshop 2024
- People
- Courses
- Research
- Undergraduate
- M Eng
- MS
- PhD
- Admissions
- Current Students
- Computer Science Graduate Office Hours
- Advising Guide for Research Students
- Business Card Policy
- Cornell Tech
- Curricular Practical Training
- A & B Exam Scheduling Guidelines
- Fellowship Opportunities
- Field of Computer Science Ph.D. Student Handbook
- Graduate TA Handbook
- Field A Exam Summary Form
- Graduate School Forms
- Instructor / TA Application
- Ph.D. Requirements
- Ph.D. Student Financial Support
- Special Committee Selection
- Travel Funding Opportunities
- Travel Reimbursement Guide
- The Outside Minor Requirement
- Diversity and Inclusion
- Graduation Information
- CS Graduate Minor
- Outreach Opportunities
- Parental Accommodation Policy
- Special Masters
- Student Spotlights
- Contact PhD Office
Efficient Learning of Latent Variable Models
Abstract: We formulate a new geometric problem - of identifying a simplex K given data points obtained by perturbing latent points in K. It is a common generalization of several latent variable models including Topic Modeling (LDA), Mixed Membership Block models (MMBM), non-negative matrix factorization and others. We show that K is well approximated by a data-determined polytope K which admits a polynomial time optimization oracle. We then prove that successive vertices of K can be found by optimizing carefully chosen linear functions over K. The algorithm is simply stated and has running time matching the best previous algorithms. The proof of correctness involves new and existing tools from Numerical Analysis, Random Projections and convex geometry. In the last part of the talk, we describe how related techniques can be used to derive near-optimal sample complexity for LDA and MMBM.
Joint Work with Chiranjib Bhattacharyya.