pytorch/examples

Actor critic example not using discount rate properly

Open

#744 opened on March 27, 2020

Labels: good first issue, triaged

Description

The Actor Critic example (which is actually an implementation of REINFORCE-with-baseline, as pointed out in https://github.com/pytorch/examples/issues/573) does not use the discount rate properly.

The policy loss at time step t should include a factor of \gamma^t, as shown in the box on page 330 of Sutton & Barto:

[Image: the box from page 330 of Sutton & Barto]
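To make the point concrete, here is a rough sketch in plain Python of where the \gamma^t factor would enter the policy loss. It is not the example's actual code: the names `discounted_returns` and `policy_loss_terms` are hypothetical, and plain floats stand in for the tensors the example would use. It assumes the REINFORCE-with-baseline policy update, in which the step-t term is weighted by \gamma^t times (G_t - v(S_t)).

```python
def discounted_returns(rewards, gamma):
    """Return G_t = sum over k >= t of gamma^(k - t) * r_k, for every step t."""
    returns = []
    g = 0.0
    # Walk the episode backwards so each return reuses the next one.
    for r in reversed(rewards):
        g = r + gamma * g
        returns.insert(0, g)
    return returns


def policy_loss_terms(log_probs, values, rewards, gamma):
    """Per-step policy loss terms, each including the gamma^t factor.

    log_probs[t] is log pi(A_t | S_t) and values[t] is the baseline v(S_t).
    Hypothetical helper for illustration only.
    """
    returns = discounted_returns(rewards, gamma)
    terms = []
    for t, (log_prob, value, g) in enumerate(zip(log_probs, values, returns)):
        delta = g - value  # return minus baseline
        # The gamma ** t weight is the factor this issue says is missing.
        terms.append(-(gamma ** t) * delta * log_prob)
    return terms


if __name__ == "__main__":
    terms = policy_loss_terms(
        log_probs=[-1.0, -1.0, -1.0],
        values=[0.0, 0.0, 0.0],
        rewards=[1.0, 1.0, 1.0],
        gamma=0.5,
    )
    print(terms)
```

Dropping the `gamma ** t` factor gives a loss in which later steps are weighted as heavily as early ones, which is the behavior being reported here. In a PyTorch version, `delta` would normally be computed from a detached value estimate so that the policy term does not backpropagate into the baseline.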
