Regret Bounds for the Adaptive Control of Linear Quadratic Systems

author: Csaba Szepesvári, Department of Computing Science, University of Alberta
published: Aug. 2, 2011,   recorded: July 2011,   views: 3792

Description

We study the average-cost Linear Quadratic (LQ) problem with unknown model parameters, known in the control community as the adaptive control problem. We design an algorithm and prove that its regret up to time T is O(√T), apart from logarithmic factors. Unlike many classical approaches, which use a forced-exploration scheme to provide sufficient exploratory information for parameter estimation, we construct a high-probability confidence set around the model parameters and design an algorithm that plays optimistically with respect to this confidence set. The construction of the confidence set is based on new results from online least-squares estimation and leads to an improved worst-case regret bound for the proposed algorithm. To the best of our knowledge, this is the first time that a regret bound has been derived for the LQ problem.
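
The estimation step mentioned in the description can be illustrated with a minimal sketch. It assumes linear dynamics x_{t+1} = A x_t + B u_t + w_{t+1} with Gaussian noise and fits [A B] by regularized least squares, updated one sample at a time. The function name, parameters, and the use of random inputs to generate data are assumptions made for illustration only. This is not the lecture's algorithm: it omits the confidence-set radius and the optimistic choice of controller.

```python
import numpy as np

def simulate_and_estimate(A, B, T=500, noise_std=0.1, reg=1.0, seed=0):
    """Simulate x_{t+1} = A x_t + B u_t + w and estimate [A B] by ridge regression.

    Hypothetical helper for illustration; inputs are random here only to
    produce data, whereas the lecture's method picks inputs optimistically.
    """
    rng = np.random.default_rng(seed)
    n, d = B.shape
    V = reg * np.eye(n + d)        # regularized Gram matrix of z_t = (x_t, u_t)
    S = np.zeros((n + d, n))       # running sum of z_t x_{t+1}^T
    x = np.zeros(n)
    for _ in range(T):
        u = rng.normal(size=d)
        x_next = A @ x + B @ u + noise_std * rng.normal(size=n)
        z = np.concatenate([x, u])
        V += np.outer(z, z)        # online update of the Gram matrix
        S += np.outer(z, x_next)   # online update of the cross term
        x = x_next
    theta_hat = np.linalg.solve(V, S)  # shape (n + d, n)
    return theta_hat.T, V              # theta_hat.T approximates [A B]

A_true = np.array([[0.5, 0.1], [0.0, 0.3]])
B_true = np.array([[1.0], [0.5]])
estimate, gram = simulate_and_estimate(A_true, B_true, T=2000)
```

The matrix `gram` is the quantity that would shape an ellipsoidal confidence set around `estimate`; the radius of that set depends on the analysis and is not reproduced here.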

See Also:

Download slides: colt2011_szepesvari_regret_01.pdf (4.5 MB)

