[Enhancement]: Wrong gains for weight initialization · DLR-RM/stable-baselines3#1559

(2 comments) (0 reactions) (1 assignee) Python (1,407 forks)

enhancement, help wanted

Repository metrics

Stars: (6,550 stars)
PR merge metrics: (Avg merge 11d 13h) (3 merged PRs in 30d)

Description

Enhancement

The recommended gain for weight initialization depends on the activation function used (see the torch docs). At the moment, however, the gains in ActorCriticPolicies are hard-coded and always the same, whatever activation function is configured. See here.
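To make the dependence concrete, here is a small dependency-free sketch of the recommended gains as listed in the PyTorch documentation for `torch.nn.init.calculate_gain`. The function name `recommended_gain` is hypothetical and only used for illustration; in real code one would call `torch.nn.init.calculate_gain` directly.

```python
import math


def recommended_gain(nonlinearity: str, negative_slope: float = 0.01) -> float:
    """Hypothetical helper mirroring the gain table in the PyTorch docs."""
    if nonlinearity in ("linear", "sigmoid"):
        return 1.0
    if nonlinearity == "tanh":
        return 5.0 / 3.0
    if nonlinearity == "relu":
        return math.sqrt(2.0)
    if nonlinearity == "leaky_relu":
        # The gain depends on the slope used for negative inputs.
        return math.sqrt(2.0 / (1.0 + negative_slope ** 2))
    if nonlinearity == "selu":
        return 3.0 / 4.0
    raise ValueError(f"No recommended gain known for {nonlinearity!r}")


for name in ("relu", "tanh", "sigmoid"):
    print(f"{name}: {recommended_gain(name):.4f}")
```

A gain chosen for ReLU is therefore not the recommended one for a tanh network, and vice versa.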

I recommend making the gains depend on the activation function used (probably mainly ReLU and tanh).
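A rough sketch of what this could look like, not the actual stable-baselines3 code. It assumes orthogonal initialization of the linear layers; the names `_GAIN_NAMES`, `gain_for` and `init_linear`, as well as the fallback gain of 1.0 for unlisted activations, are hypothetical choices made for illustration.

```python
from functools import partial

import torch
import torch.nn as nn

# Hypothetical lookup: activation class -> name understood by calculate_gain.
_GAIN_NAMES = {
    nn.ReLU: "relu",
    nn.Tanh: "tanh",
    nn.Sigmoid: "sigmoid",
    nn.LeakyReLU: "leaky_relu",
    nn.SELU: "selu",
}


def gain_for(activation_fn: type) -> float:
    """Return the recommended gain for an activation class."""
    name = _GAIN_NAMES.get(activation_fn)
    if name is None:
        return 1.0  # assumption: neutral gain for activations not listed above
    return nn.init.calculate_gain(name)


def init_linear(module: nn.Module, gain: float = 1.0) -> None:
    """Orthogonally initialize linear layers with the given gain, zero the bias."""
    if isinstance(module, nn.Linear):
        nn.init.orthogonal_(module.weight, gain=gain)
        if module.bias is not None:
            nn.init.zeros_(module.bias)


activation_fn = nn.Tanh  # the activation class chosen by the user
net = nn.Sequential(
    nn.Linear(4, 64), activation_fn(),
    nn.Linear(64, 64), activation_fn(),
)
# The gain now follows the activation instead of being a fixed constant.
net.apply(partial(init_linear, gain=gain_for(activation_fn)))
```

Output layers (for example an action or value head) may still warrant their own fixed gains, so a change like this would probably apply only to the hidden layers.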

If you agree, I would like to implement this myself and open a PR.

Thanks and a good day!

To Reproduce

Relevant log output / Error message

--

System Info

Checklist

I have checked that there is no similar issue in the repo
I have read the documentation
I have provided a minimal working example to reproduce the bug
I've used the markdown code blocks for both code and stack traces.

Contributor guide

Research direction: Investigate the current weight initialization in ActorCriticPolicies, identify where gains are statically set, and modify to use activation function specific gains as per PyTorch documentation.
Tech stack: Python, PyTorch
Domain: machine learning, AI
Issue type: Feature
Difficulty: 2
Estimated time: 1-3 hours
Activity status: Active
Clarity: Clear
Prerequisites: Python, PyTorch, Git
Newbie friendliness: 75