[Feature Request] independently configurable learning rates for actor and critic · DLR-RM/stable-baselines3#338

(11 comments) (1 reaction) (0 assignees) Python (1,407 forks)

enhancement, help wanted

Repository metrics

Stars: 6,550
PR merge metrics: (average time to merge: 11 days 13 hours) (3 PRs merged in the last 30 days)

Description

🚀 Feature

Independently configurable learning rates for the actor and the critic in actor-critic (AC) style algorithms.

Motivation

In the literature, the actor is often configured to learn more slowly than the critic, so that the critic's estimates are more reliable. At the very least, it would be nice if I could let my hyperparameter optimizer decide which learning rates to use for the actor and the critic.

Pitch

https://github.com/DLR-RM/stable-baselines3/blob/65100a4b040201035487363a396b84ea721eb027/stable_baselines3/ddpg/ddpg.py#L12-L26
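
To make the request concrete, here is a minimal sketch in plain PyTorch of what "independent learning rates" means in practice: one optimizer per network, each with its own `lr`. This is not the stable-baselines3 API. The network sizes, the constants `ACTOR_LR` and `CRITIC_LR`, and the random batch are all assumptions made for illustration, and target networks and the replay buffer are omitted. It is similar in spirit to the separate `pi_lr` and `q_lr` arguments described in the Spinning Up documentation linked under Additional context.

```python
import torch
from torch import nn

# Hypothetical dimensions and learning rates, chosen only for illustration.
OBS_DIM, ACT_DIM = 3, 1
ACTOR_LR, CRITIC_LR = 1e-4, 1e-3  # actor learns more slowly than the critic

actor = nn.Sequential(
    nn.Linear(OBS_DIM, 64), nn.ReLU(), nn.Linear(64, ACT_DIM), nn.Tanh()
)
critic = nn.Sequential(
    nn.Linear(OBS_DIM + ACT_DIM, 64), nn.ReLU(), nn.Linear(64, 1)
)

# One optimizer per network, so each learning rate can be tuned independently.
actor_optimizer = torch.optim.Adam(actor.parameters(), lr=ACTOR_LR)
critic_optimizer = torch.optim.Adam(critic.parameters(), lr=CRITIC_LR)

# A random batch standing in for samples from a replay buffer.
torch.manual_seed(0)
obs = torch.randn(32, OBS_DIM)
actions = torch.rand(32, ACT_DIM) * 2 - 1
td_target = torch.randn(32, 1)  # placeholder for the bootstrapped TD target

# Critic step: regress Q(s, a) onto the target.
q_values = critic(torch.cat([obs, actions], dim=1))
critic_loss = nn.functional.mse_loss(q_values, td_target)
critic_optimizer.zero_grad()
critic_loss.backward()
critic_optimizer.step()

# Actor step: maximize Q(s, pi(s)). Gradients also reach the critic's
# parameters here, but only the actor optimizer takes a step; the critic's
# gradients are cleared by zero_grad() before its next update.
actor_loss = -critic(torch.cat([obs, actor(obs)], dim=1)).mean()
actor_optimizer.zero_grad()
actor_loss.backward()
actor_optimizer.step()
```

With this layout, a hyperparameter optimizer could treat `ACTOR_LR` and `CRITIC_LR` as two separate search dimensions instead of a single shared value.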

Additional context

https://spinningup.openai.com/en/latest/algorithms/ddpg.html#documentation-pytorch-version

Contributor guide

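Research direction: Modify the DDPG class to accept separate learning rates for the actor and the critic, following the pattern used for existing parameters. Refer to the linked code section and the Spinning Up documentation.
Tech stack: Python, PyTorch
Domain: machine learning, AI
Issue type: feature
Difficulty: 2
Estimated time: half a day
Activity status: active
Clarity: clear
Prerequisites: Python, Git, PyTorch
Beginner friendliness: 75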