Improved Testing of Low Rank Matrices
published: Oct. 7, 2014, recorded: August 2014, views: 1450
Report a problem or upload filesIf you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.
We study the problem of determining if an input matrix A εRm x n can be well-approximated by a low rank matrix. Specifically, we study the problem of quickly estimating the rank or stable rank of A, the latter often providing a more robust measure of the rank. Since we seek significantly sublinear time algorithms, we cast these problems in the property testing framework. In this framework, A either has low rank or stable rank, or is far from having this property. The algorithm should read only a small number of entries or rows of A and decide which case A is in with high probability. If neither case occurs, the output is allowed to be arbitrary. We consider two notions of being far: (1) A requires changing at least an ε-fraction of its entries, or (2) A requires changing at least an ε-fraction of its rows. We call the former the "entry model" and the latter the "row model". We show:
For testing if a matrix has rank at most d in the entry model, we improve the previous number of entries of A that need to be read from O(d2/ε2) (Krauthgamer and Sasson, SODA 2003) to O(d2/ε). Our algorithm is the first to adaptively query the entries of A, which for constant d we show is necessary to achieve O(1/ε) queries.
- For the important case of d = 1 we also give a new non-adaptive algorithm, improving the previous O(1/ε2) queries to O(log2(1/ε) / ε).
- For testing if a matrix has rank at most d in the row model, we prove an Ω(d/ε) lower bound on the number of rows that need to be read, even for adaptive algorithms. Our lower bound matches a non-adaptive upper bound of Krauthgamer and Sasson.
- For testing if a matrix has stable rank at most d in the row model or requires changing an ε/d-fraction of its rows in order to have stable rank at most d, we prove that reading θ(d/ε2) rows is necessary and sufficient.
We also give an empirical evaluation of our rank and stable rank algorithms on real and synthetic datasets.
Link this pageWould you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !