Case Insensitivity on MLeap Models · combust/mleap#713

Repository metrics

Stars: 1,461
PR merge metrics: no PRs merged in the last 30 days

Description

By default, if you train a PySpark ML model with a dataframe that has uppercase column names, and then try to run inference with the same column names but in lowercase, the prediction will fail. Is there a parameter or way to set case insensitivity on inference?

I see this checking for a strict vs relaxed select of the leapframe which I assume is what I'm looking for. ~~How can I set that when serializing a PySpark Model to an MLeap Bundle?~~

Thanks!

Edit: I see that my second question was wrong - It comes from the transform function, not embedded into the Bundle itself. So when I call model.transform(frame) is there documentation on how to pass in the relaxedSelect option?

Edit: It seems like I'm incorrect on what relaxedSelect does. It seems to just "not throw an error on columns that don't exist" instead of being case insensitive. Is there a case insensitivity option?
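
As a possible stopgap while that question is open, the incoming column names could be mapped back to the casing used at training time before the frame reaches the model. The sketch below is plain Python and does not call any MLeap API; the helper names (`build_case_map`, `normalize_columns`) and the example column lists are hypothetical, and it assumes the training-time column names are available to the caller.

```python
from typing import Dict, Iterable, List


def build_case_map(expected: Iterable[str]) -> Dict[str, str]:
    """Map lowercased names to the exact names the model was trained with."""
    mapping: Dict[str, str] = {}
    for name in expected:
        key = name.lower()
        # Two training columns that differ only by case cannot be resolved safely.
        if key in mapping and mapping[key] != name:
            raise ValueError(
                f"ambiguous column names differ only by case: {mapping[key]!r}, {name!r}"
            )
        mapping[key] = name
    return mapping


def normalize_columns(incoming: Iterable[str], expected: Iterable[str]) -> List[str]:
    """Rename incoming columns to the expected casing; unknown names pass through unchanged."""
    mapping = build_case_map(expected)
    return [mapping.get(name.lower(), name) for name in incoming]


# Hypothetical schemas, for illustration only.
trained_columns = ["AGE", "INCOME"]
request_columns = ["age", "income", "extra"]
renamed = normalize_columns(request_columns, trained_columns)
```

With a PySpark DataFrame, the resulting list could then be applied with `df.toDF(*renamed)` before calling `transform`. This only works around the mismatch on the caller's side; it is not a case-insensitivity option in MLeap itself.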

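Contributor guide

Approach: Inspect the Transform.scala file at the referenced lines (mleap executor/src/main/scala/ml/combust/mleap/executor/Transform.scala) to understand the relaxedSelect parameter. Investigate whether a case-insensitive option for column-name matching could be added at inference time. Look for any existing properties or configuration in the MLeap bundle format. Since no maintainer has responded, contributors should propose a solution in the issue before implementing it.
Tech stack: Scala, Python
Domain: backend, machine learning
Issue type: feature
Difficulty: 3
Estimated time: 1-2 days
Activity status: active
Clarity: clear
Prerequisites: Scala, PySpark, MLeap
Beginner friendliness: 20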
