28th Annual Conference on Learning Theory (COLT), Paris 2015

From Averaging to Acceleration, There is Only a Step-size

author: Nicolas Flammarion, INRIA - SIERRA project-team
published: Aug. 20, 2015, recorded: July 2015, views: 1841

Slides

Report a problem or upload files

If you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.

Lecture popularity: You need to login to cast your vote.

Description

We show that accelerated gradient descent, averaged gradient descent and the heavy-ball method for non-strongly convex problems may be reformulated as constant parameter second-order difference equation algorithms, where stability of the system is equivalent to convergence at rate O(1/n^2) where n is the number of iterations. We provide a detailed analysis of the eigenvalues of the corresponding linear dynamical system, showing various oscillatory and non-oscillatory behaviors, together with a sharp stability result with explicit constants. We also consider the situation where noisy gradients are available, where we extend our general convergence result, which suggest an alternative algorithm (i.e., with different step sizes) that exhibits the good aspects of both averaging and acceleration.

Link this page

Would you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !

Write your own review or comment:

Comment:
Name:
Email address:
URL:

make sure you have javascript enabled or clear this field: