Representation, Approximation and Learning of Submodular Functions Using Low-rank Decision Trees

author: Pravesh Kothari, Department of Computer Science, University of Texas at Austin
published: May 15, 2014, recorded: June 2013, views: 3236

Description

We study the complexity of approximate representation and learning of submodular functions over the uniform distribution on the Boolean hypercube {0,1}^n. Our main result is the following structural theorem: any submodular function is ϵ-close in ℓ_2 to a real-valued decision tree (DT) of depth O(1/ϵ^2). This immediately implies that any submodular function is ϵ-close to a function of at most 2^{O(1/ϵ^2)} variables and has spectral ℓ_1 norm of 2^{O(1/ϵ^2)}. It also implies the closest previous result, which states that submodular functions can be approximated by polynomials of degree O(1/ϵ^2) (Cheraghchi et al., 2012). Our result is proved by constructing an approximation of a submodular function by a DT of rank 4/ϵ^2, together with a proof that any rank-r DT can be ϵ-approximated by a DT of depth (5/2)(r + log(1/ϵ)).
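The two notions at the heart of this theorem can be made concrete with a small sketch. The Python snippet below (illustrative only; the brute-force check and all names are ours, not from the paper) tests the diminishing-returns property of a set function on a tiny ground set, and computes the Ehrenfeucht-Haussler rank of a binary decision tree: a leaf has rank 0, and an internal node has the maximum of its children's ranks if they differ, and that common rank plus one if they are equal.

```python
from itertools import combinations

def is_submodular(f, n):
    """Brute-force check of diminishing returns:
    f(S | {i}) - f(S) >= f(T | {i}) - f(T) for all S <= T, i not in T.
    f maps a frozenset of indices in range(n) to a real; tiny n only."""
    subsets = [frozenset(c) for r in range(n + 1)
               for c in combinations(range(n), r)]
    for S in subsets:
        for T in subsets:
            if S <= T:
                for i in set(range(n)) - T:
                    if f(S | {i}) - f(S) < f(T | {i}) - f(T) - 1e-12:
                        return False
    return True

class Leaf:
    def __init__(self, value):
        self.value = value          # real-valued leaf

class Node:
    def __init__(self, var, left, right):
        self.var, self.left, self.right = var, left, right

def rank(t):
    """Ehrenfeucht-Haussler rank of a decision tree."""
    if isinstance(t, Leaf):
        return 0
    rl, rr = rank(t.left), rank(t.right)
    return max(rl, rr) if rl != rr else rl + 1

# A coverage-style function f(S) = min(|S|, 2) is monotone submodular:
print(is_submodular(lambda S: min(len(S), 2), 4))   # True
# A complete depth-2 tree has rank 2; a path-shaped tree has rank 1:
full = Node(0, Node(1, Leaf(0.0), Leaf(1.0)), Node(1, Leaf(1.0), Leaf(2.0)))
path = Node(0, Leaf(0.0), Node(1, Leaf(1.0), Node(2, Leaf(2.0), Leaf(3.0))))
print(rank(full), rank(path))                       # 2 1
```

Rank is what makes the theorem work: a deep tree can still have small rank, and the depth-reduction step above trades rank r for depth (5/2)(r + log(1/ϵ)).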

We show that these structural results can be exploited to give an attribute-efficient PAC learning algorithm for submodular functions running in time O(n^2) · 2^{O(1/ϵ^4)}. The best previous algorithm for the problem requires n^{O(1/ϵ^2)} time and examples (Cheraghchi et al., 2012), although it also works in the agnostic setting. In addition, we give improved learning algorithms for a number of related settings.
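As a rough illustration of what "attribute-efficient" means here, the sketch below first estimates which variables look relevant from random examples and then fits a shallow real-valued tree on those variables alone. This mirrors the few-relevant-variables-plus-low-depth-DT structure of the theorem but is not the paper's algorithm; the correlation-based relevance estimate, the threshold, and the use of scikit-learn's DecisionTreeRegressor are placeholder choices.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def learn_sketch(X, y, depth, threshold=0.05):
    """Toy attribute-efficient learner: select variables whose degree-1
    Fourier coefficient estimate is large, then fit a shallow DT on them."""
    signs = 2.0 * X - 1.0                      # map {0,1} -> {-1,+1}
    coeffs = np.abs(signs.T @ (y - y.mean())) / len(y)
    relevant = np.flatnonzero(coeffs > threshold)
    tree = DecisionTreeRegressor(max_depth=depth).fit(X[:, relevant], y)
    return relevant, tree

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(5000, 50))
y = np.minimum(X[:, [3, 7, 19]].sum(axis=1), 2).astype(float)  # submodular target
relevant, tree = learn_sketch(X, y, depth=4)
print(relevant)   # typically [ 3  7 19]
```

Note that a degree-1 estimate can miss relevant variables in general (e.g., for parities); it suffices only for this toy target, which is why the paper's guarantees rest on the structural theorem rather than on such a heuristic.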

We also prove that our PAC and agnostic learning algorithms are essentially optimal via two lower bounds: (1) an information-theoretic lower bound of 2^{Ω(1/ϵ^{2/3})} on the complexity of learning monotone submodular functions in any reasonable model (including learning with value queries); (2) a computational lower bound of n^{Ω(1/ϵ^{2/3})} based on a reduction to learning of sparse parities with noise, a problem widely believed to be intractable. These are the first lower bounds for learning of submodular functions over the uniform distribution.
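For context on lower bound (2): in learning sparse parities with noise, the learner sees uniform examples labeled by the parity of a small hidden index set S, with each label flipped independently with some probability, and the problem is believed to require n^{Ω(|S|)} time. A minimal generator of such examples (purely illustrative; the paper's reduction itself is not reproduced here):

```python
import numpy as np

def sparse_parity_samples(n, S, noise_rate, m, seed=0):
    """m uniform examples in {0,1}^n labeled by the parity of the hidden
    index set S, with each label flipped independently w.p. noise_rate."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 2, size=(m, n))
    labels = X[:, S].sum(axis=1) % 2           # chi_S(x), the hidden parity
    flips = rng.random(m) < noise_rate         # independent label noise
    return X, labels ^ flips

X, y = sparse_parity_samples(n=50, S=[3, 7, 19], noise_rate=0.1, m=1000)
```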

Download slides: colt2013_kothari_trees_01.pdf (4.1 MB)
