NIPS Workshops, Lake Tahoe 2013

Fast rates for the multi-armed bandit

author: Sébastien Bubeck, Department of Operations Research and Financial Engineering, Princeton University
published: Oct. 6, 2014, recorded: December 2013, views: 2212

Description

Since the seminal work of Lai and Robbins (1985), bandit strategies have been known that achieve normalized regret (regret divided by the horizon T) of order (i) 1/sqrt(T) for any stochastic bandit, and (ii) log(T)/T for 'benign' distributions. In Bubeck and Slivkins (2012) we designed a new strategy that extends property (i) to adversarial bandits while retaining the fast rate given in (ii). I will present this algorithm and also discuss the possibility of even faster rates, of order 1/T, when extra information is available.
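
As a rough illustration of the log(T)/T regime mentioned in the description, the sketch below simulates UCB1 (Auer, Cesa-Bianchi and Fischer, 2002), a standard strategy for stochastic bandits, on Bernoulli arms. It is not the algorithm of Bubeck and Slivkins (2012) presented in the talk. The function name, the arm means, and the horizons are illustrative assumptions, not taken from the lecture.

```python
import math
import random


def ucb1_normalized_regret(means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms; return pseudo-regret divided by the horizon.

    `means` are hypothetical true arm means, used only to simulate rewards
    and to measure regret. The strategy itself never sees them.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k   # number of pulls of each arm
    sums = [0.0] * k   # cumulative reward of each arm
    best = max(means)
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once to initialise the estimates
        else:
            # pick the arm with the largest empirical mean plus confidence bonus
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]  # pseudo-regret, computed from true means
    return regret / horizon


if __name__ == "__main__":
    # Normalized regret should shrink as the horizon T grows.
    for T in (100, 1_000, 10_000):
        print(T, ucb1_normalized_regret([0.5, 0.9], T))
```

With a fixed gap between the arm means, the printed values should decrease with T, which is the qualitative behaviour that the 'benign' case (ii) describes. A single simulated run is only indicative of the rate and does not verify it.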
