scikit-learn-contrib/category_encoders

Handle missing in one hot encoder

Open

#400 opened on Mar 12, 2023

Labels: bug, good first issue

Description

Expected Behavior

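Currently, handle_missing="value" adds a new column for missing values, although the documentation says that 'value' will encode a new value as 0 in every dummy column. The encoder should follow the documentation and produce all zeros for a missing value. We also need a test for this.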

Actual Behavior

The encoder adds an extra column for the missing value instead of encoding it as 0 in every dummy column.

Steps to Reproduce the Problem

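import pandas as pd
from category_encoders import OneHotEncoder

# handle_missing="value" should encode a missing value as 0 in every dummy column
he = OneHotEncoder(handle_missing="value")

# The last row has a missing value in the categorical column c1
data = [("foo", 1), ("bar", 2), (None, 6)]
data = pd.DataFrame(data, columns=["c1", "c2"])
print(he.fit_transform(data))  # the output has an extra dummy column for the missing value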
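
For reference, the encoding I would expect for column c1 can be sketched without the library. This is a hand-rolled illustration only, not category_encoders code; the function name one_hot_missing_as_zeros is hypothetical, and the column order (first appearance) is an assumption.

```python
# Illustration only: a hand-rolled one-hot encoding, not category_encoders code.
def one_hot_missing_as_zeros(values):
    """One-hot encode `values`, mapping None to an all-zero row."""
    # Collect categories in order of first appearance, skipping missing entries.
    categories = []
    for v in values:
        if v is not None and v not in categories:
            categories.append(v)
    # A missing entry matches no category, so its row is all zeros.
    rows = [[int(v == c) for c in categories] for v in values]
    return categories, rows


categories, rows = one_hot_missing_as_zeros(["foo", "bar", None])
print(categories)
for row in rows:
    print(row)
```

The row for the missing value is all zeros and no extra dummy column is created. A regression test for the encoder could presumably assert the same shape on the output of fit_transform.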

Specifications

  • Version: 2.6
  • Platform: linux
