[Feature Request] independently configurable learning rates for actor and critic · DLR-RM/stable-baselines3#338

(11 comments) (1 reaction) (0 assignees) Python (1,407 forks)

enhancement, help wanted

Repository metrics

Stars: 6,550
PR merge metrics: (average time to merge: 11 days 13 hours) (3 PRs merged in the last 30 days)

Description

🚀 Feature

Independently configurable learning rates for the actor and the critic in actor-critic (AC) style algorithms.

Motivation

In the literature, the actor is often configured to learn more slowly than the critic, so that the critic's responses are more reliable. At the very least, it would be nice if I could let my hyperparameter optimizer decide which learning rates to use for the actor and the critic.

Pitch

https://github.com/DLR-RM/stable-baselines3/blob/65100a4b040201035487363a396b84ea721eb027/stable_baselines3/ddpg/ddpg.py#L12-L26
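
As a rough sketch of what this could look like at the constructor level, the two new arguments could be optional and fall back to the existing `learning_rate` when left unset. The names `actor_learning_rate`, `critic_learning_rate`, `as_schedule`, and `resolve_learning_rates` are hypothetical and are not part of stable-baselines3:

```python
from typing import Callable, Optional, Tuple, Union

# A schedule maps remaining training progress (1.0 -> 0.0) to a learning rate.
Schedule = Callable[[float], float]
LearningRate = Union[float, Schedule]


def as_schedule(value: LearningRate) -> Schedule:
    """Wrap a constant in a schedule; pass callables through unchanged."""
    if callable(value):
        return value
    constant = float(value)
    return lambda progress_remaining: constant


def resolve_learning_rates(
    learning_rate: LearningRate,
    actor_learning_rate: Optional[LearningRate] = None,
    critic_learning_rate: Optional[LearningRate] = None,
) -> Tuple[Schedule, Schedule]:
    """Return (actor_schedule, critic_schedule).

    Hypothetical helper: each override falls back to the shared
    ``learning_rate`` when it is left as None.
    """
    actor = learning_rate if actor_learning_rate is None else actor_learning_rate
    critic = learning_rate if critic_learning_rate is None else critic_learning_rate
    return as_schedule(actor), as_schedule(critic)


# Only the actor rate is overridden; the critic keeps the shared rate.
actor_lr, critic_lr = resolve_learning_rates(1e-3, actor_learning_rate=1e-4)
print(actor_lr(1.0), critic_lr(1.0))
```

Falling back to the shared rate would keep existing calls working unchanged, while a hyperparameter optimizer could sample the two overrides independently.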

Additional context

https://spinningup.openai.com/en/latest/algorithms/ddpg.html#documentation-pytorch-version
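
The Spinning Up PyTorch DDPG linked above takes separate `pi_lr` and `q_lr` arguments. A minimal PyTorch sketch of the underlying idea, using one optimizer per network, might look like the following. The toy networks, dimensions, random data, and rate values are placeholders chosen for illustration, not either library's implementation:

```python
import torch
from torch import nn

obs_dim, act_dim = 3, 1  # placeholder dimensions

# Toy actor: observation -> action in [-1, 1].
actor = nn.Sequential(
    nn.Linear(obs_dim, 32), nn.ReLU(), nn.Linear(32, act_dim), nn.Tanh()
)
# Toy critic: (observation, action) -> Q-value.
critic = nn.Sequential(
    nn.Linear(obs_dim + act_dim, 32), nn.ReLU(), nn.Linear(32, 1)
)

pi_lr, q_lr = 1e-4, 1e-3  # illustrative values: actor learns slower than critic

# One optimizer per network, each with its own learning rate.
actor_optimizer = torch.optim.Adam(actor.parameters(), lr=pi_lr)
critic_optimizer = torch.optim.Adam(critic.parameters(), lr=q_lr)

obs = torch.randn(8, obs_dim)
target_q = torch.randn(8, 1)  # stand-in for a bootstrapped TD target

# Critic step: regress Q(s, a) towards the target, using q_lr.
with torch.no_grad():
    actions = actor(obs)
critic_loss = nn.functional.mse_loss(
    critic(torch.cat([obs, actions], dim=1)), target_q
)
critic_optimizer.zero_grad()
critic_loss.backward()
critic_optimizer.step()

# Actor step: maximise Q(s, pi(s)), using pi_lr. Gradients also flow into
# the critic here, but only the actor's optimizer takes a step.
actor_loss = -critic(torch.cat([obs, actor(obs)], dim=1)).mean()
actor_optimizer.zero_grad()
actor_loss.backward()
actor_optimizer.step()
```

Because each optimizer owns only its network's parameters, the two rates can be tuned independently without touching the update logic.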

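Contributor guide

Research direction: Modify the DDPG class to accept independent learning rates for the actor and the critic, following the pattern used for existing parameters. Refer to the linked code section and the Spinning Up documentation.
Tech stack: Python, PyTorch
Domain: machine learning, AI
Issue type: Feature
Difficulty: 2
Estimated time: half a day
Activity status: Active
Clarity: Clear
Prerequisites: Python, Git, PyTorch
Beginner friendliness: 75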