Workshop on On‐lineTrading of Exploration and Exploitation 2, Washington 2011

Explorations in Computer Go, Web Search, and Online Advertising

author: Thore Graepel, Microsoft Research, Cambridge, Microsoft Research
published: July 25, 2011, recorded: July 2011, views: 4309

Slides

Report a problem or upload files

If you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.

Lecture popularity: You need to login to cast your vote.

Description

In computer go, the goal is to find a good move in a given position by exploring the associated game tree, which is far too large to enumerate and hence requires sophisticated methods for navigation. I will discuss our combined Monte-Carlo Tree Search and Bayesian Pattern Ranking approach to accomplish this task under severe resource constraints in the go engine of the Xbox Live Arcade title The Path of Go.

In web search, decisions about ranking documents can depend on explicit feedback from external judges and on implicit feedback from users through their interaction with search results. In the latter case, user feedback about a query-URL pair can only be obtained if the URL is actually shown to the user, i.e., exploring it. I will discuss bandit approaches to address this problem, and the need to present diverse search results.

Paid search advertising differs from web search in a number of ways, including a smaller, better curated set of documents (ads) and an auction driven allocation mechanism explicitly based on click-through rate. I will describe the adPredictor system used for click-through rate prediction today, and discuss the causal loop created by the fact that only ads shown can be clicked on by the user and are hence available as training examples in the future.

I will discuss possible approaches to exploration vs exploitation in practice, and mention some of the open problems arising in such large-scale closed-loop systems. People involved in this work include Filip Radlinski, Joaquin Quiñonero Candela, Ralf Herbrich, and David Stern.

Link this page

Would you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !

Write your own review or comment:

Comment:
Name:
Email address:
URL:

make sure you have javascript enabled or clear this field: