scikit-learn/scikit-learn
Bug in bfgs gradient computation of MLPRegressor with multiple output neurons
Open
#8349, created on February 13, 2017
Labels: Bug, help wanted, module:neural_network
Description
While implementing a specialized neural network based on MLPRegressor, I ran into the following problem when training with bfgs and multiple output neurons (I did not look into the other training methods):
- The 'squared_loss' implementation uses np.mean to compute the overall loss. It therefore divides by both the number of samples and the number of output neurons/features, since both are dimensions of y_true - y_pred.
- The gradient computation does not account for the number of output neurons. Gradients are only divided by the number of samples (_compute_loss_grad).
As a result, the gradient is off by a factor equal to the number of output neurons relative to the loss. Since the search direction is still correct, this does not cause too much trouble. Still, it should be fixed.
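The mismatch described above can be checked numerically. The following is a minimal sketch that mirrors the description in this report rather than the library source: `squared_loss`, `grad_samples_only`, and `numerical_grad` are hypothetical stand-ins written for illustration, and the loss is assumed to be the halved mean of squared errors. It compares a gradient divided by the number of samples only against a finite-difference gradient of the np.mean-based loss.

```python
import numpy as np


def squared_loss(y_true, y_pred):
    # Assumed form of the loss: mean over samples AND outputs, halved.
    return np.mean((y_true - y_pred) ** 2) / 2


def grad_samples_only(y_true, y_pred):
    # Gradient w.r.t. y_pred, divided by the number of samples only,
    # as the report describes for the gradient computation.
    return (y_pred - y_true) / y_true.shape[0]


def numerical_grad(y_true, y_pred, eps=1e-6):
    # Central finite differences of squared_loss w.r.t. each entry of y_pred.
    grad = np.zeros_like(y_pred)
    for idx in np.ndindex(*y_pred.shape):
        step = np.zeros_like(y_pred)
        step[idx] = eps
        grad[idx] = (squared_loss(y_true, y_pred + step)
                     - squared_loss(y_true, y_pred - step)) / (2 * eps)
    return grad


rng = np.random.RandomState(0)
n_samples, n_outputs = 5, 3
y_true = rng.randn(n_samples, n_outputs)
y_pred = rng.randn(n_samples, n_outputs)

analytic = grad_samples_only(y_true, y_pred)
numeric = numerical_grad(y_true, y_pred)

# Elementwise ratio; every entry is approximately n_outputs.
ratio = analytic / numeric
print(ratio)
```

With a single output neuron the two gradients agree, which is why the discrepancy only shows up for multiple outputs.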
In case this is not clear, I can put together a minimal example.
Cheers!
Versions
import platform; print(platform.platform())
Linux-3.16.0-4-amd64-x86_64-with-debian-8.5
import sys; print("Python", sys.version)
('Python', '2.7.9 (default, Mar 1 2015, 12:57:24) \n[GCC 4.9.2]')
import numpy; print("NumPy", numpy.__version__)
('NumPy', '1.10.4')
import scipy; print("SciPy", scipy.__version__)
('SciPy', '0.14.0')
import sklearn; print("Scikit-Learn", sklearn.__version__)
('Scikit-Learn', '0.18.1')