Preconditioned Temporal Difference Learning
author: Hengshuai Yao, School of Creative Media, City University of Hong Kong
published: Aug. 12, 2008, recorded: July 2008, views: 3268
Description
This paper extends several recent popular reinforcement learning (RL) algorithms into a generalized framework that includes least-squares temporal difference (LSTD) learning, least-squares policy evaluation (LSPE), and a variant of incremental LSTD (iLSTD). The basis of this extension is a preconditioning technique applied to the solution of a stochastic model equation. The paper also studies three significant issues of the new framework: it presents a new step-size rule that can be computed online, provides an iterative way to apply preconditioning, and reduces the complexity of the related algorithms to near that of temporal difference (TD) learning.
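To make the unifying idea concrete, here is a minimal sketch of a preconditioned iteration for linear policy evaluation, assuming the familiar LSTD-style model equation A·theta = b and an update theta ← theta + alpha·C⁻¹(b − A·theta), where the choice of preconditioner C recovers LSTD, an LSPE-like step, or a plain TD-like step. The function and argument names (`preconditioned_td`, `features`, the ridge term) are illustrative, not taken from the paper.

```python
import numpy as np

def preconditioned_td(trajectory, features, gamma=0.9, alpha=0.5,
                      preconditioner="lstd", n_sweeps=1):
    """Sketch of preconditioned TD for linear policy evaluation.

    `trajectory` is a list of (state, reward, next_state) tuples and
    `features` maps a state to its feature vector (hypothetical names).
    The accumulated model is A theta = b with
        A = sum phi_t (phi_t - gamma * phi_{t+1})^T,   b = sum phi_t r_{t+1}.
    The step theta <- theta + alpha * C^{-1} (b - A theta) corresponds to
    LSTD when C = A, an LSPE-like step when C = sum phi_t phi_t^T, and a
    scaled TD-like step when C = I.
    """
    k = len(features(trajectory[0][0]))
    A = np.zeros((k, k))
    b = np.zeros(k)
    cov = np.zeros((k, k))
    theta = np.zeros(k)

    # Accumulate the stochastic model from the observed transitions.
    for s, r, s_next in trajectory:
        phi, phi_next = features(s), features(s_next)
        A += np.outer(phi, phi - gamma * phi_next)
        b += phi * r
        cov += np.outer(phi, phi)        # feature covariance (LSPE choice)

    if preconditioner == "lstd":
        C = A
    elif preconditioner == "lspe":
        C = cov
    else:                                # identity: TD/iLSTD-like variant
        C = np.eye(k)

    for _ in range(n_sweeps):
        # Preconditioned residual step; a small ridge keeps C invertible.
        residual = b - A @ theta
        theta = theta + alpha * np.linalg.solve(C + 1e-8 * np.eye(k), residual)
    return theta
```

With `preconditioner="lstd"` and `alpha=1.0`, a single sweep returns the LSTD solution of the accumulated model; other choices trade per-step cost against convergence speed, which is the trade-off the framework is designed to expose.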