microsoft/SynapseML

NullPointerException while computing accuracy with ComputeModelStatistics

Open

#736 opened on Nov 13, 2019

Labels: area/documentation, bug, good first issue, help wanted

Description

Version

com.microsoft.ml.spark:mmlspark_2.11:jar:0.18.1
spark = 2.4.3
scala = 2.11.12

Data (CSV with header): https://gist.github.com/ttpro1995/69051647a256af912803c9a16040f43a

Download the data, save it as a CSV file, and put it in the folder /data/public/HIGGS/higgs.test.predictioncsv

val data = spark.read.option("header","true").option("inferSchema", "true").csv("/data/public/HIGGS/higgs.test.predictioncsv")

Schema

root
 |-- label: double (nullable = true)
 |-- prediction: double (nullable = true)

Code

import com.microsoft.ml.spark.train.ComputeModelStatistics
val metricsCompute = new ComputeModelStatistics().setLabelCol("label").setScoresCol("prediction").setEvaluationMetric("accuracy")

val result_metrics = metricsCompute.transform(data)

Exception

java.lang.NullPointerException
  at org.apache.spark.sql.Column.<init>(Column.scala:135)
  at org.apache.spark.sql.Column$.apply(Column.scala:38)
  at org.apache.spark.sql.functions$.col(functions.scala:90)
  at com.microsoft.ml.spark.train.ComputeModelStatistics.selectAndCastToDF(ComputeModelStatistics.scala:256)
  at com.microsoft.ml.spark.train.ComputeModelStatistics.selectAndCastToRDD(ComputeModelStatistics.scala:265)
  at com.microsoft.ml.spark.train.ComputeModelStatistics.predictionAndLabels$lzycompute$1(ComputeModelStatistics.scala:99)
  at com.microsoft.ml.spark.train.ComputeModelStatistics.predictionAndLabels$1(ComputeModelStatistics.scala:95)
  at com.microsoft.ml.spark.train.ComputeModelStatistics.transform(ComputeModelStatistics.scala:124)
  ... 47 elided
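
While the exception above is unresolved, the expected accuracy can be cross-checked without ComputeModelStatistics. The following is a minimal plain-Scala sketch, not part of mmlspark: it assumes the `prediction` column holds hard class labels (not probabilities) and that accuracy means the fraction of rows where `prediction` equals `label`. The object name `AccuracyCheck`, the function `accuracy`, and the toy rows are hypothetical and used only for illustration.

```scala
// Plain-Scala sanity check: accuracy is the fraction of rows whose
// prediction equals the label. No Spark or mmlspark required.
object AccuracyCheck {
  // `pairs` holds (label, prediction) tuples; returns NaN for empty input.
  def accuracy(pairs: Seq[(Double, Double)]): Double =
    if (pairs.isEmpty) Double.NaN
    else pairs.count { case (label, prediction) => label == prediction }.toDouble / pairs.size

  def main(args: Array[String]): Unit = {
    // Hypothetical toy rows, not taken from the HIGGS gist.
    val toy = Seq((1.0, 1.0), (0.0, 1.0), (1.0, 1.0), (0.0, 0.0))
    println(accuracy(toy)) // 3 of 4 rows match, so this prints 0.75
  }
}
```

In spark-shell, the pairs could be collected from the `data` DataFrame above with `data.select("label", "prediction").as[(Double, Double)].collect()` after `import spark.implicits._`. This assumes the data fits on the driver and contains no null rows, so it is only a cross-check, not a replacement for the failing transformer.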
