scikit-learn-contrib/category_encoders

Error handling in inverse_transform is broken

Closed

#190 opened on May 7, 2019

Labels: bug, help wanted

Description

inverse_transform() should ideally handle the absence of columns that were dropped because of drop_invariant=True. If that is not possible, inverse_transform() should at least raise the correct error message instead of just crashing.

Example in the form of a parameterized unit test:

    import unittest

    import category_encoders as encoders


    class TestInverseTransform(unittest.TestCase):  # wrapper class name is illustrative

        def test_inverse_wrong_feature_count_with_drop_invariant(self):
            x = [['A', 'B', 'C'], ['D', 'E', 'C'], ['F', 'G', 'C']]  # the last column is constant
            for encoder_name in ('BaseNEncoder', 'BinaryEncoder', 'OrdinalEncoder', 'OneHotEncoder'):
                with self.subTest(encoder_name=encoder_name):
                    enc = getattr(encoders, encoder_name)(drop_invariant=True)
                    transformed = enc.fit_transform(x)

                    # run inverse_transform() and check the raised exception text
                    with self.assertRaises(ValueError) as cm:
                        enc.inverse_transform(transformed)
                    self.assertTrue(str(cm.exception).startswith('Unexpected input dimension'))
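
For illustration only, the kind of check the test above expects could look roughly like the sketch below. It is not the library's implementation: `check_inverse_input`, `n_fitted_columns`, and `dropped_columns` are hypothetical names, and the only detail taken from this issue is that the message starts with 'Unexpected input dimension'.

```python
def check_inverse_input(rows, n_fitted_columns, dropped_columns=()):
    """Hypothetical guard to run at the top of an inverse_transform().

    rows: the data passed to inverse_transform, as a list of rows.
    n_fitted_columns: number of encoded columns before any were dropped.
    dropped_columns: names of the columns removed by drop_invariant=True.
    """
    for row in rows:
        if len(row) != n_fitted_columns:
            message = 'Unexpected input dimension %d, expected %d' % (
                len(row), n_fitted_columns)
            if dropped_columns:
                # Tell the user why the column count may differ.
                message += '; columns dropped as invariant: %s' % ', '.join(dropped_columns)
            raise ValueError(message)


# Three encoded columns at fit time; the constant third column was dropped,
# so the transformed data only has two columns per row.
transformed = [[1, 1], [2, 2], [3, 3]]
try:
    check_inverse_input(transformed, n_fitted_columns=3, dropped_columns=['col_2'])
except ValueError as error:
    error_text = str(error)
```

Failing early with a ValueError keeps the mismatch visible to the caller, rather than letting it surface later as an unrelated crash deeper in the decoding logic.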
