Artificial Intelligence Seminar

Spring 2010
Friday 12:00-1:15
Upson 5130

The AI seminar will meet weekly for lectures by graduate students, faculty, and researchers emphasizing work-in-progress and recent results in AI research. Lunch will be served starting at noon, with the talks running between 12:15 and 1:15. The new format is designed to allow AI chit-chat before the talks begin. Also, we're trying to make some of the presentations less formal so that students and faculty will feel comfortable using the seminar to give presentations about work in progress or practice talks for conferences.

Date

Title/Speaker/Abstract/Host

February 5
Yisong Yue, Cornell University
Host: Thorsten Joachims

Title: New Learning Frameworks for Information Retrieval

Abstract: (This is a practice job talk.)

Information retrieval has become a central technology in managing and leveraging the ongoing explosion of digital content. Current techniques for designing retrieval models are limited by two issues. First, they have restricted representational power, and generally deal with simple settings that estimate the quality of individual results independently of other results. Second, existing methodologies for designing retrieval functions are labor intensive and cannot be efficiently applied to accommodate a growing variety of retrieval domains.

In this talk, I will describe two learning approaches for designing new retrieval models. The first is a structured prediction approach, which considers inter-dependencies between results in order to optimize for more sophisticated objectives such as information diversity. The second is an interactive learning approach, which reduces the efficiency bottleneck of relying on human experts by leveraging data gathered from online user interactions -- such data is both cheap to collect as well as naturally representative of user utilities in the target domain.

February 12
HOLD DATE
Host: Jon Kleinberg

TBA

February 19
Itai Ashlagi, Harvard
Host: Bobby Kleinberg

Title: New Incentive Issues in Kidney Exchange

Joint work with Alvin Roth

Abstract: As kidney exchange has grown, new problems and opportunities having to do with incentives have come up. First, the participation of multiple transplant centers leads to strategic behavior on the part of transplant centers, which, because they have multiple patients, have different incentives and strategic choices than individual patients and their surgeons. Second, altruistic donors make it possible to consider non-simultaneous transplant chains, which can result in a broken chain if a donor reneges once his patient has received a kidney, but which have the possibility of producing many more transplants than can be done simultaneously.

"The AI-Seminar is sponsored by Yahoo!"

February 26
CANCELLED DUE TO WEATHER

March 5
Speaker: Warren B. Powell, Director, CASTLE Laboratory, Princeton University

Seminar: AI / Computational Sustainability seminar

Room: Room Change - 117 Upson Hall

Host: Carla Gomes

Title: Opportunities for Machine Learning in Stochastic Optimization, with Applications in Energy Resource Planning

Abstract: Energy resource planning spans problems such as optimization of technology R&D portfolios, optimizing the control of storage in the presence of intermittent energy supply, encouraging market adoption using tax policies, and planning energy investments over long-range horizons. Each of these produces complex, multistage stochastic optimization problems which challenge existing algorithmic tools. Approximate dynamic programming (ADP) offers a modeling and algorithmic framework that scales to these complex problems, but robust, provably convergent algorithms are not available. In this talk, I will describe how ADP can be used to solve problems with high-dimensional state, action and outcome spaces, while highlighting three technical challenges. First is the need for general purpose machine learning tools, which may be solved using recent advances in Dirichlet-process mixture models. Second is the need for efficient recursive updating strategies in the learning process, where I will describe a new optimal stepsize formula for approximate value iteration. Finally, I will describe our new line of research in optimal learning which can be used for policy optimization and, we hope, solving the exploration vs. exploitation problem.

Bio: A faculty member at Princeton University since 1981, Professor Powell specializes in stochastic optimization problems arising in a variety of resource allocation problems, with applications encompassing energy resource modeling, transportation, military operations, health and finance. He is the director of CASTLE Laboratory, which has developed planning systems for a wide range of operational problems. He has authored or coauthored over 140 refereed publications, and he is the author of Approximate Dynamic Programming: Solving the Curses of Dimensionality, published by John Wiley and Sons. His research spans stochastic optimization and the closely related area of optimal learning, which addresses the problem of efficiently collecting information. A recipient of the Informs Fellows Award, Professor Powell has served in a variety of editorial and administrative positions for Informs, including Informs Board of Directors, Area Editor for Operations Research, President of the Transportation Science Section, and numerous prize and administrative committees.


March 12
Speaker: Yali Amit, UChicago, Departments of Statistics and Computer Science

Host: Ashutosh Saxena

Title: Generative models for scene annotation

Abstract: The goal of Computer Vision is the automatic annotation of scenes containing multiple occluded objects as well as noise and clutter. Recent work has focused on two main tasks. The first is the classification among object classes in segmented images containing only one object and the second is the detection of a particular object class in a large image. Both tasks have been primarily addressed using discriminative learning.

It is not clear, however, how these methods can extend to deal with the recognition of multiple object classes in images containing a number of objects in a wide range of configurations.

I will present an approach which starts from simple statistical models for individual objects. With these models the important notion of invariance can be clearly formulated.

Furthermore, the individual object models can be composed to define models for object configurations. Decisions are likelihood-based and do not depend on pre-trained decision boundaries.

I will briefly discuss some computational strategies for computing the scene annotation, show some applications, and describe some major difficulties we face in making further progress.


March 19
Speaker: Andreas Krause, Caltech
Host: Carla Gomes

Seminar: AI / Computational Sustainability seminar

Room: Room Change - 253 RHODES

Title: Optimizing Sensing from Water to the Web

Abstract: Where should we place sensors to quickly detect contamination in drinking water distribution networks? Which blogs should we read to learn about the biggest stories on the web? These problems share a fundamental challenge: How can we obtain the most useful information about the state of the world, at minimum cost?

Such sensing problems are typically NP-hard, and were commonly addressed using heuristics without theoretical guarantees about the solution quality. In this talk, I will present algorithms which efficiently find provably near-optimal solutions to large, complex sensing problems. Our algorithms exploit submodularity, an intuitive notion of diminishing returns, common to many sensing problems; the more sensors we have already deployed, the less we learn by placing another sensor. To quantify the uncertainty in our predictions, we use probabilistic models, such as Gaussian Processes. In addition to identifying the most informative sensing locations, our algorithms can handle more challenging settings, where sensors need to be able to reliably communicate over lossy links, where mobile robots are used for collecting data or where solutions need to be robust against adversaries, sensor failures and dynamic environments.

I will also present results applying our algorithms to several real-world sensing tasks, including environmental monitoring using robotic sensors, activity recognition using a built sensing chair, deciding which blogs to read on the web, and a sensor placement competition.

Bio: Andreas Krause is an assistant professor of Computer Science at the California Institute of Technology. He received his Ph.D. from Carnegie Mellon University in 2008. Krause is a recipient of the NSF CAREER award and the Okawa Foundation Research Grant recognizing top young researchers in telecommunications. His research on sensor placement and optimized information gathering received awards at several premier conferences, as well as the best research paper award of the ASCE Journal of Water Resources Planning and Management.


March 26
NO SEMINAR - Spring Break


April 2
NO SEMINAR

April 9
Speaker: Steven Phillips, AT&T Labs--Research

Host: Carla Gomes

Bio: Steven Phillips received his PhD in Computer Science under Rajeev Motwani at Stanford, and has spent the past 16 years at AT&T Research. His research has been focused on computational aspects of conservation biology since 2002.

Title: Voting Power and Site Prioritization

Abstract: Indices for site prioritization are widely used to address the question: which sites are most important for conservation of biodiversity? We investigate the theoretical underpinnings of target-based prioritization, which measures sites' contribution to achieving predetermined conservation targets. We show a strong connection between site prioritization and the mathematical theory of voting power. Well-known paradoxes of voting power afflict current site prioritization indices; by negating such paradoxes, we develop a set of intuitive axioms that an index should obey. We introduce a simple new index, "fraction-of-spare," that satisfies all the axioms. In an evaluation involving multi-year scheduling of site acquisitions for conservation of forest types in New South Wales under specified clearing rates, fraction-of-spare outperforms 52 existing prioritization indices. We also compute the optimal schedule of acquisitions (under the assumed clearing rates) using mathematical programming, which indicates that there is still potential for improvement in site prioritization for conservation scheduling.

Joint work with Aaron Archer, Robert L. Pressey, Desmond Torkornoo, David Applegate, David Johnson, Matthew E. Watts


April 16

April 23
Speakers: Cristian Danescu-Niculescu-Mizil & Yisong Yue, Cornell University

Host: Lillian Lee

Title for Cristian's Talk: Competing for users' attention: On the interplay between organic and sponsored search results.

Abstract: Queries on major Web search engines produce complex result pages, primarily composed of two types of information: organic results, that is, short descriptions and links to relevant Web pages, and sponsored search results, the small textual advertisements often displayed above or to the right of the organic results. Strategies for optimizing each type of result in isolation and the consequent user reaction have been extensively studied; however, the interplay between these two complementary sources of information has been ignored, a situation we aim to change. Our findings indicate that their perceived relative usefulness (as evidenced by user clicks) depends on the nature of the query. Specifically, we found that, when both sources focus on the same intent, for navigational queries there is a clear competition between ads and organic results, while for non-navigational queries this competition turns into synergy.

We also investigate the relationship between the perceived usefulness of the ads and their textual similarity to the organic results, and propose a model that formalizes this relationship. To this end, we introduce the notion of responsive ads, which directly address the user's information need, and incidental ads, which are only tangentially related to that need. Our findings support the hypothesis that in the case of navigational queries, which are usually fully satisfied by the top organic result, incidental ads are perceived as more valuable than responsive ads, which are likely to be duplicative. On the other hand, in the case of non-navigational queries, incidental ads are perceived as less beneficial, possibly because they diverge too far from the actual user need.

We hope that our findings and further research in this area will allow search engines to tune ad selection for an increased synergy between organic and sponsored results, leading to both higher user satisfaction and better monetization.

This is joint work with: Andrei Broder, Evgeniy Gabrilovich, Vanja Josifovski and Bo Pang.

Title for Yisong's Talk: Beyond Position Bias: Examining Result Attractiveness as a Source of Presentation Bias in Clickthrough Data

Abstract: Leveraging clickthrough data has become a popular approach for evaluating and optimizing information retrieval systems. Although data is plentiful, one must take care when interpreting clicks, since user behavior can be affected by various sources of presentation bias. While the issue of position bias in clickthrough data has been the topic of much study, other presentation bias effects have received comparatively little attention. For instance, since users must decide whether to click on a result based on its summary (e.g., the title, URL and abstract), one might expect clicks to favor "more attractive" results. In this study, we examine result summary attractiveness as a potential source of presentation bias. This study distinguishes itself from prior work by aiming to detect systematic biases in click behavior due to attractive summaries inflating perceived relevance. Our experiments conducted on the Google web search engine show substantial evidence of presentation bias in clicks towards results with more attractive titles.

This is joint work with Rajan Patel and Hein Roehrig.


April 30

See also the AI graduate study brochure.

Please contact any of the faculty below if you'd like to give a talk this semester. We especially encourage graduate students to sign up!

