[Feature Request] independently configurable learning rates for actor and critic · DLR-RM/stable-baselines3#338

(11 comments) (1 reaction) (0 assignees) Python (1,407 forks)

Labels: enhancement, help wanted

Repository metrics

Stars: (6,550 stars)
PR merge metrics: (average merge time 11d 13h) (3 merged PRs in 30d)

Description

🚀 Feature

Independently configurable learning rates for the actor and the critic in actor-critic (AC) style algorithms.

Motivation

In the literature, the actor is often configured to learn more slowly than the critic, so that the critic's responses are more reliable. At the very least, it would be nice if I could let my hyperparameter optimizer decide which learning rates to use for the actor and the critic.

Pitch

https://github.com/DLR-RM/stable-baselines3/blob/65100a4b040201035487363a396b84ea721eb027/stable_baselines3/ddpg/ddpg.py#L12-L26

Additional context

https://spinningup.openai.com/en/latest/algorithms/ddpg.html#documentation-pytorch-version

Contributor guide

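Approach: Modify the DDPG class so that it accepts separate learning rates for the actor and the critic. Follow the pattern of the existing parameters. Refer to the linked code section and the Spinning Up documentation.
Tech stack: Python, PyTorch
Area: machine learning, AI
Issue type: feature
Difficulty: 2
Estimated time: half a day
Activity: active
Clarity: clear
Prerequisites: Python, Git, PyTorch
Beginner friendliness: 75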
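
One way the approach in the contributor guide might look is sketched below. It keeps a shared `learning_rate` as the default and adds two optional overrides, so existing calls would behave as before. The names `actor_learning_rate`, `critic_learning_rate`, `LearningRates`, and `resolve_learning_rates` are hypothetical and are not part of stable-baselines3. The sketch also assumes plain float rates and ignores learning-rate schedules.

```python
from typing import NamedTuple, Optional


class LearningRates(NamedTuple):
    """Resolved per-network learning rates (hypothetical helper type)."""

    actor: float
    critic: float


def resolve_learning_rates(
    learning_rate: float = 1e-3,
    actor_learning_rate: Optional[float] = None,   # hypothetical parameter
    critic_learning_rate: Optional[float] = None,  # hypothetical parameter
) -> LearningRates:
    """Use the shared rate for any override that is left as None."""
    rates = LearningRates(
        actor=learning_rate if actor_learning_rate is None else actor_learning_rate,
        critic=learning_rate if critic_learning_rate is None else critic_learning_rate,
    )
    # Reject non-positive rates early instead of failing later in training.
    for name, value in rates._asdict().items():
        if value <= 0:
            raise ValueError(f"{name} learning rate must be positive, got {value}")
    return rates


# Without overrides, both networks share the same rate.
shared = resolve_learning_rates(learning_rate=1e-3)

# A hyperparameter optimizer could tune the two rates separately,
# for example a slower actor than critic.
split = resolve_learning_rates(learning_rate=1e-3, actor_learning_rate=1e-4)

print(shared)
print(split)
```

The resolved values would then be passed to the two optimizers separately, for instance `torch.optim.Adam(actor.parameters(), lr=rates.actor)` and `torch.optim.Adam(critic.parameters(), lr=rates.critic)`. This mirrors the separate `pi_lr` and `q_lr` arguments in the Spinning Up DDPG implementation linked under "Additional context".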