scikit-learn/scikit-learn

Bug in bfgs gradient computation of MLPRegressor with multiple output neurons

Open

#8349, created on February 13, 2017

Labels: Bug, help wanted, module:neural_network

Description

While implementing a special neural network based on MLPRegressor, I found the following problem when training with bfgs and multiple output neurons (I did not look into the other training methods):

  • The 'squared_loss' implementation uses np.mean to compute the overall loss. It therefore divides by both the number of samples and the number of output neurons/features, since both are dimensions of y_true - y_pred.
  • The gradient computation does not account for the number of output neurons: gradients are divided only by the number of samples (_compute_loss_grad).

Overall, this means the gradient is off by a factor equal to the number of output neurons. Since the search direction is still correct, this does not cause much trouble. Still, it should be fixed.
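The mismatch described above can be reproduced outside scikit-learn. The following is a minimal sketch, not the library's actual code: it assumes a single linear layer, a loss of the form 0.5 * np.mean(...), and a gradient divided by the number of samples only. The function names grad_samples_only and numerical_grad are hypothetical helpers introduced for illustration.

```python
import numpy as np


def squared_loss(y_true, y_pred):
    # Assumed form of the loss: np.mean averages over samples AND outputs.
    return 0.5 * np.mean((y_true - y_pred) ** 2)


def grad_samples_only(X, y_true, W):
    # Hypothetical gradient that divides by n_samples only,
    # mirroring the scaling described in the issue.
    n_samples = X.shape[0]
    return X.T @ (X @ W - y_true) / n_samples


def numerical_grad(X, y_true, W, eps=1e-6):
    # Central finite differences of squared_loss with respect to W.
    grad = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        step = np.zeros_like(W)
        step[idx] = eps
        loss_plus = squared_loss(y_true, X @ (W + step))
        loss_minus = squared_loss(y_true, X @ (W - step))
        grad[idx] = (loss_plus - loss_minus) / (2 * eps)
    return grad


rng = np.random.default_rng(0)
n_samples, n_features, n_outputs = 20, 4, 3
X = rng.normal(size=(n_samples, n_features))
y = rng.normal(size=(n_samples, n_outputs))
W = rng.normal(size=(n_features, n_outputs))

analytic = grad_samples_only(X, y, W)
numeric = numerical_grad(X, y, W)

# The analytic gradient should equal n_outputs times the true gradient.
print(np.allclose(analytic, n_outputs * numeric, atol=1e-6))
```

Under these assumptions the two gradients point in the same direction and differ only by the constant factor n_outputs, which is consistent with the observation that the search direction is unaffected.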

In case this is not clear, I can put together a minimal example.

Cheers!

Versions

import platform; print(platform.platform())
Linux-3.16.0-4-amd64-x86_64-with-debian-8.5
import sys; print("Python", sys.version)
('Python', '2.7.9 (default, Mar 1 2015, 12:57:24) \n[GCC 4.9.2]')
import numpy; print("NumPy", numpy.__version__)
('NumPy', '1.10.4')
import scipy; print("SciPy", scipy.__version__)
('SciPy', '0.14.0')
import sklearn; print("Scikit-Learn", sklearn.__version__)
('Scikit-Learn', '0.18.1')
