scikit-learn/scikit-learn

Allow NaNs for the target values in TransformedTargetRegressor

Open

#11339 opened on Jun 21, 2018

Labels: help wanted, module:compose

Description

One potential use case for TransformedTargetRegressor is to get rid of missing values in the target, but the initial check in the fit method currently rejects arrays that contain NaN.

Steps/Code to Reproduce

Example:

import numpy as np

from sklearn.compose import TransformedTargetRegressor
from sklearn.impute import SimpleImputer
from sklearn.tree import DecisionTreeRegressor
from sklearn import datasets

X, y = datasets.load_linnerud(return_X_y=True)

# Put a NaN in y (np.nan is the spelling that works across NumPy versions)
y[5, 1] = np.nan

estimator = TransformedTargetRegressor(
    regressor=DecisionTreeRegressor(),
    # Impute inside func because SimpleImputer doesn't have an inverse transform
    func=lambda _y: SimpleImputer().fit_transform(_y),
    inverse_func=lambda _y: _y,
    check_inverse=False,
)

estimator.fit(X, y)

This raises: ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
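
Until the check is relaxed, one possible workaround is to impute the target before it reaches any estimator. This is not something the issue proposes, only a minimal sketch of the behaviour the example is after. It assumes the default mean strategy of SimpleImputer is acceptable for y, and the names `y_imputed` and `reg` are illustrative.

```python
import numpy as np
from sklearn import datasets
from sklearn.impute import SimpleImputer
from sklearn.tree import DecisionTreeRegressor

X, y = datasets.load_linnerud(return_X_y=True)
y = y.astype(float)  # work on a float copy so NaN can be stored
y[5, 1] = np.nan

# Assumption: mean imputation (SimpleImputer's default strategy) is acceptable for y.
y_imputed = SimpleImputer().fit_transform(y)

# Fit the regressor directly on the imputed target, bypassing
# TransformedTargetRegressor and its NaN check on y.
reg = DecisionTreeRegressor(random_state=0).fit(X, y_imputed)
pred = reg.predict(X)
```

The drawback is that the imputation is no longer part of the estimator, so it has to be repeated by hand in cross-validation or model selection, which is the convenience the issue is asking for.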
