Dear all,

Our next AI seminar, "Estimating Long-term Rewards by Off-policy Reinforcement Learning" by Lihong Li, is scheduled for October 27th, 1-2 PM PST. It will be followed by a 30-minute Q&A session with the graduate students.

Note that this is a Zoom event.
Zoom Link:  https://oregonstate.zoom.us/j/93591935144?pwd=YjZaSjBYS0NmNUtjQzBEdzhPeDZ5UT09

Estimating Long-term Rewards by Off-policy Reinforcement Learning 
Lihong Li
Senior Principal Scientist
Amazon

Abstract: One of the core problems in reinforcement learning (RL) is estimating the long-term reward of a given policy. In many real-world applications, such as healthcare, robotics and dialogue systems, running a new policy on users or robots can be costly or risky. This gives rise to the need for off-policy, or counterfactual, estimation: estimating the long-term reward of a given policy using data previously collected by another policy (e.g., the one currently deployed). This talk will describe some recent advances on this problem, for which many standard estimators suffer from exponentially large variance (known as "the curse of horizon"). Our approach is based on a dual linear program formulation of the long-term reward, and can be extended to estimate confidence intervals.
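For those less familiar with the setting, one common way to write the problem (in generic notation not taken from the talk) is: the value of a target policy \pi is J(\pi) = E_{\tau \sim \pi}[ \sum_{t=0}^{H-1} \gamma^t r_t ], but the data consist of trajectories from a behavior policy \mu. The classic importance-sampling estimator reweights each logged trajectory,

  \hat{J}_{IS}(\pi) = \frac{1}{n} \sum_{i=1}^{n} \Big( \prod_{t=0}^{H-1} \frac{\pi(a_t^{(i)} \mid s_t^{(i)})}{\mu(a_t^{(i)} \mid s_t^{(i)})} \Big) \sum_{t=0}^{H-1} \gamma^t r_t^{(i)},

and the product of per-step ratios is what makes its variance grow exponentially with the horizon H, i.e., the "curse of horizon" mentioned in the abstract; the dual linear program approach discussed in the talk is designed to avoid this per-step reweighting.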

Bio: Lihong Li is a Senior Principal Scientist at Amazon. He obtained a PhD degree in Computer Science from Rutgers University. After that, he held research positions in Yahoo!, Microsoft and Google, before joining Amazon. His main research interests are in reinforcement learning, including contextual bandits, and other related problems in AI. His work is often inspired by applications in recommendation, advertising, Web search and conversational systems.  Homepage: http://lihongli.github.io

Please watch this space for future AI seminars:

    https://eecs.oregonstate.edu/ai-events

Video recordings of the first three AI seminars are also linked from the corresponding seminar pages at the link above.

Rajesh Mangannavar,

Graduate Student
Oregon State University  

----
AI Seminar Important Reminders:
-> The AI Seminar has a strict "no electronics" and "no recordings" policy.
-> For graduate students in the AI program, attendance is strongly encouraged.