A relative-error CUR decomposition for matrices and its data applications
Michael W. Mahoney
Yahoo Research
Much recent work in the theoretical computer science, linear algebra, and
machine learning communities has considered matrix
decompositions of the following form: given an m-by-n matrix A, decompose it as
a product of three matrices, C, U, and R, where C consists of a few (typically a
constant number of) columns of A, R consists of a few (typically a constant
number of) rows of A, and U is a small, carefully constructed matrix that
guarantees that the product CUR is "close" to A. Applications of such
decompositions include the computation of compressed "sketches" for large
matrices in a pass-efficient manner, matrix reconstruction, speeding up
kernel-based statistical learning computations, sparsity preservation in
low-rank approximations, and improved interpretability of data analysis methods.
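
As a rough illustration of the shape of such a decomposition, the following is
a minimal Python sketch. The function name cur_decomposition, the uniform
sampling of columns and rows, and the pseudoinverse-based choice of U are
simplifying assumptions of this sketch; the algorithm discussed in the talk
selects columns and rows using carefully constructed sampling probabilities.

    import numpy as np

    def cur_decomposition(A, c, r, seed=None):
        # Sketch of a CUR decomposition: C holds c actual columns of A,
        # R holds r actual rows of A, and U is chosen so that C @ U @ R
        # approximates A. Columns and rows are sampled uniformly here; a
        # relative-error algorithm would use data-dependent probabilities.
        rng = np.random.default_rng(seed)
        m, n = A.shape
        cols = rng.choice(n, size=c, replace=False)
        rows = rng.choice(m, size=r, replace=False)
        C = A[:, cols]            # m-by-c: actual columns of A
        R = A[rows, :]            # r-by-n: actual rows of A
        # For fixed C and R, U = pinv(C) @ A @ pinv(R) minimizes the
        # Frobenius norm of A - C @ U @ R over all c-by-r matrices U.
        U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
        return C, U, R

    # Example on a random rank-15 matrix: with enough sampled columns
    # and rows, the CUR reconstruction error should be near zero.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((100, 15)) @ rng.standard_normal((15, 60))
    C, U, R = cur_decomposition(A, c=20, r=20, seed=1)
    print("||A - CUR||_F =", np.linalg.norm(A - C @ U @ R, "fro"))

In the sketch, the optimal U for fixed C and R follows from a standard
least-squares argument; the difficulty lies in choosing C and R so that the
resulting error is provably small.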
In this talk we shall discuss various choices for the matrices C, U, and R that
are appropriate in different application domains. The main result will be an
algorithm that computes matrices C, U, and R such that the (Frobenius) norm of
the error matrix A - CUR is almost minimal, in the relative-error sense
sketched below. Although it builds on ideas from the tutorial given earlier in
the day, this talk will be entirely self-contained.
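
For concreteness, a relative-error guarantee of the kind named in the title can
be written as follows; the rank parameter k, the number of sampled columns and
rows, and the failure probability are details of the algorithm not given in
this abstract:

    \| A - CUR \|_F \le (1 + \epsilon) \, \| A - A_k \|_F,

where A_k denotes the best rank-k approximation to A (obtainable from the
truncated SVD) and \epsilon > 0 is an accuracy parameter.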