Numerical Mathematics Challenges in Machine Learning

Robust PCA and Collaborative Filtering: Rejecting Outliers, Identifying Manipulators

author: Constantine Caramanis, Department of Electrical and Computer Engineering, University of Texas at Austin
published: Jan. 13, 2011, recorded: December 2010, views: 11607

Slides

Report a problem or upload files

If you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.

Lecture popularity: You need to login to cast your vote.

Description

Principal Component Analysis is one of the most widely used techniques for dimensionality reduction. Nevertheless, it is plagued by sensitivity to outliers; finding robust analogs, particularly for high-dimensional data, is critical. We discuss the challenges posed by the high dimensional setting, where dimensionality is of the same order, or greater, than the number of samples. We detail why existing techniques fail -- indeed, no known algorithm can provide provable bounds to any constant fraction of outliers -- and then present two very different algorithms for High Dimensional Robust PCA. Our first algorithm achieves a breakdown point of 50% -- the best possible using any algorithm, and a stark improvement from the previous best-known result of 0%. Our second algorithm is based on ideas from convex optimization, and in addition to recovering the principal components, is also able to identify the corrupted points. We extend this to the partially observed setting, significantly extending matrix completion results to the setting of corrupted rows or columns.

Link this page

Would you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !

Write your own review or comment:

Comment:
Name:
Email address:
URL:

make sure you have javascript enabled or clear this field: