Many applications in data mining and machine
learning require the analysis of large two-dimensional matrices. Depending
on the application, these matrices can have very different characteristics:
in text mining, co-occurrence matrices are large, sparse, and non-negative,
while DNA microarray analysis yields smaller, dense matrices with both
positive and negative entries. In data analysis, it is often desirable (a) to
find "low-parameter" matrix approximations, and (b) to "co-cluster" such
matrices, i.e., simultaneously cluster rows as well as columns.
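
As a concrete illustration of (a) and (b), here is a minimal sketch assuming
the simplest setting: squared-error distortion and a block-constant
approximation, where a co-clustering into k row clusters and l column
clusters summarizes an m x n matrix with only k*l block means. All names
below are illustrative, not from the talk:

    import numpy as np

    def block_approximation(A, row_labels, col_labels, k, l):
        # Mean of each (row-cluster, column-cluster) block: the co-clustering
        # reduces the m x n matrix A to k*l parameters ("low-parameter").
        means = np.zeros((k, l))
        for r in range(k):
            for c in range(l):
                block = A[np.ix_(row_labels == r, col_labels == c)]
                if block.size:
                    means[r, c] = block.mean()
        # Expand the block means back to an m x n matrix approximating A.
        return means[row_labels][:, col_labels]

    A = np.array([[1., 1., 5.],
                  [1., 1., 5.],
                  [7., 7., 3.]])
    rows = np.array([0, 0, 1])   # rows 0,1 in cluster 0; row 2 in cluster 1
    cols = np.array([0, 0, 1])   # cols 0,1 in cluster 0; col 2 in cluster 1
    A_hat = block_approximation(A, rows, cols, k=2, l=2)
    print(((A - A_hat) ** 2).sum())  # squared error of the 4-parameter summary

For fixed row and column labels, the block mean is the squared-error-optimal
reconstruction; co-clustering then searches over the label assignments
themselves.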
In this talk, I will discuss a framework that inextricably links a certain
class of matrix approximations with co-clustering. The approximation error
can be measured using a broad class of distortion measures, called
Bregman divergences, which are closely connected to the exponential family of
probability distributions (a short definition is sketched at the end of this
abstract). Our algorithms handle a wide variety of matrices and distortion
measures within a unifying framework, and efficiently construct good
co-clusterings together with the corresponding low-parameter matrix
approximations. I will conclude by presenting
experimental results on text and microarray data. Extensions to tensors will
be briefly discussed.
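
For readers unfamiliar with the term, the following is standard background
rather than material specific to the talk: a Bregman divergence is generated
by a strictly convex, differentiable function \phi via

    d_\phi(x, y) = \phi(x) - \phi(y) - \langle \nabla\phi(y), x - y \rangle

and familiar distortion measures arise as special cases, e.g.

    \phi(x) = \|x\|^2              =>  d_\phi(x, y) = \|x - y\|^2                            (squared Euclidean distance)
    \phi(x) = \sum_i x_i \log x_i  =>  d_\phi(x, y) = \sum_i x_i \log(x_i/y_i) - x_i + y_i   (I-divergence)

The first case corresponds to the Gaussian and the second to the Poisson
member of the exponential family, one concrete form of the connection
mentioned above.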