Learning from Logged Interventions

author: Thorsten Joachims, Department of Computer Science, Cornell University
published: Dec. 1, 2017,   recorded: August 2017,   views: 877
Categories

Related content

Report a problem or upload files

If you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.
Lecture popularity: You need to login to cast your vote.
  Delicious Bibliography

Description

Every time a system places an ad, presents a search ranking, or makes a recommendation, we can think about this as an intervention for which we can observe the user's response (e.g. click, dwell time, purchase). Such logged intervention data is actually one of the most plentiful types of data available, as it can be recorded from a variety of systems (e.g., search engines, recommender systems, ad placement) at little cost. However, this data provides only partial-information feedback - aka "bandit feedback" - limited to the particular intervention chosen by the system. We don't get to see how the user would have responded, if we had chosen a different intervention. This makes learning from logged bandit feedback substantially different from conventional supervised learning, where "correct" predictions together with a loss function provide full-information feedback.

In this talk, I will explore approaches and methods for batch learning from logged bandit feedback (BLBF). Unlike the well-explored problem of online learning with bandit feedback, batch learning with bandit feedback does not require interactive experimental control of the underlying system, but merely exploits logged intervention data collected in the past. The talk presents a new inductive principle for BLBF, new counterfactual risk estimators, and new methods for structured output prediction with BLBF with applications to ad placement.

Link this page

Would you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !

Write your own review or comment:

make sure you have javascript enabled or clear this field: