Department of Computer Science Colloquium
Tuesday, March 26, 2002, at 4:15 PM
Upson Hall B17
Algorithms for Probabilistic Inference and Learning on Graphs
Michael I. Jordan
Division of Computer Science and Department of Statistics
University of California, Berkeley
I will give a talk in three parts.
In the first part, I describe a hierarchical, latent variable graphical model for collections of discrete data. The model posits that each "document" in the collection is generated as a mixture of "topics", where the continuous-valued mixture proportions are distributed as a latent Dirichlet variable. Inference and learning are carried out efficiently via variational algorithms. I present empirical results on applications to text modeling, collaborative filtering, text classification, and text/image modeling.
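The generative process behind this topic-mixture model can be sketched as a short simulation. This is a minimal illustration, not the speaker's implementation; the sizes (K topics, V-word vocabulary, document length) and the names alpha and beta are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the talk): K topics, V-word vocabulary.
K, V, DOC_LEN = 3, 8, 20
alpha = np.ones(K)                   # Dirichlet prior on topic proportions
beta = rng.dirichlet(np.ones(V), K)  # per-topic word distributions, shape (K, V)

def generate_document(n_words):
    """Draw one document from the topic-mixture generative process."""
    theta = rng.dirichlet(alpha)                   # latent Dirichlet mixture proportions
    topics = rng.choice(K, size=n_words, p=theta)  # per-word topic assignments
    return [int(rng.choice(V, p=beta[z])) for z in topics]

doc = generate_document(DOC_LEN)
print(len(doc), all(0 <= w < V for w in doc))
```

Inference in the model runs this process in reverse: given only the words, the variational algorithms mentioned above recover posterior distributions over theta and the per-word topic assignments.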
In the second part of the talk, I present a class of algorithms for Independent Component Analysis (ICA) that use contrast functions based on canonical correlations in a reproducing kernel Hilbert space. On the one hand, these contrast functions are related to mutual information and have desirable mathematical properties as measures of statistical dependence. On the other hand, building on recent developments in kernel methods, these criteria and their derivatives can be computed efficiently. Minimizing these criteria leads to flexible and robust algorithms for ICA. I illustrate with simulations involving a wide variety of source distributions, showing that our algorithms outperform existing ICA algorithms.
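To make the dependence measure concrete, here is a sketch of a first regularized kernel canonical correlation between two one-dimensional samples, using Gaussian kernels. This is an illustrative simplification under stated assumptions, not the speaker's algorithm: the kernel width sigma and regularizer kappa are hypothetical choices, and only the largest canonical correlation is computed.

```python
import numpy as np

def gram_rbf(x, sigma=1.0):
    """Centered Gaussian (RBF) Gram matrix of a 1-D sample."""
    d2 = (x[:, None] - x[None, :]) ** 2
    K = np.exp(-d2 / (2 * sigma ** 2))
    n = len(x)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return H @ K @ H

def first_kcca(x, y, kappa=1e-2):
    """First regularized kernel canonical correlation between samples x and y.
    A sketch of a kernel dependence measure; kappa is a hypothetical regularizer."""
    n = len(x)
    Kx, Ky = gram_rbf(x), gram_rbf(y)
    Rx = Kx @ np.linalg.inv(Kx + n * kappa * np.eye(n))
    Ry = Ky @ np.linalg.inv(Ky + n * kappa * np.eye(n))
    # Top eigenvalue of Rx @ Ry approximates the squared first canonical correlation.
    lam = np.max(np.real(np.linalg.eigvals(Rx @ Ry)))
    return float(np.sqrt(max(lam, 0.0)))

rng = np.random.default_rng(1)
x = rng.normal(size=200)
rho_dep = first_kcca(x, x + 0.1 * rng.normal(size=200))  # strongly dependent pair
rho_ind = first_kcca(x, rng.normal(size=200))            # independent pair
print(rho_dep > rho_ind)
```

The regularization keeps the correlation from saturating at 1 for finite samples; an ICA contrast function built from such quantities is then minimized over demixing matrices.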
In the final part of the talk, I present a new spectral clustering algorithm. The algorithm can be analyzed using tools from matrix perturbation theory, yielding conditions under which it can be expected to perform well. Experimental results are presented on a number of challenging clustering problems.
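A spectral clustering pipeline in this general spirit can be sketched as follows: build a Gaussian affinity matrix, take the top-k eigenvectors of the normalized affinity matrix, unit-normalize the rows of the resulting embedding, and run k-means on them. This is an illustrative sketch, not the speaker's exact algorithm; the kernel width sigma and the farthest-first k-means initialization are assumptions made for the example.

```python
import numpy as np

def spectral_clustering(X, k, sigma=1.0, iters=50):
    """Sketch of spectral clustering: normalized-affinity eigenvectors + k-means."""
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    A = np.exp(-d2 / (2 * sigma ** 2))  # Gaussian affinity matrix
    np.fill_diagonal(A, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(1))
    L = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # D^{-1/2} A D^{-1/2}
    _, vecs = np.linalg.eigh(L)                        # ascending eigenvalues
    Y = vecs[:, -k:]                                   # top-k eigenvectors
    Y /= np.linalg.norm(Y, axis=1, keepdims=True)      # unit-normalize each row
    # k-means on the rows of Y, with farthest-first initialization
    idx = [0]
    for _ in range(1, k):
        dist = np.min(((Y[:, None] - Y[idx][None]) ** 2).sum(-1), axis=1)
        idx.append(int(np.argmax(dist)))
    centers = Y[idx].copy()
    for _ in range(iters):
        labels = np.argmin(((Y[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Y[labels == j].mean(0)
    return labels

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 0.1, (30, 2)),   # tight cluster near the origin
               rng.normal(5, 0.1, (30, 2))])  # tight cluster far away
labels = spectral_clustering(X, 2)
print(labels)
```

On well-separated clusters like these, the row-normalized embedding maps each cluster to a nearly constant point, which is the regime the perturbation-theoretic analysis mentioned above characterizes.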
[Joint work with Francis Bach, David Blei, Andrew Ng, and Yair Weiss.]