apache/beam

[Task]: Create a script to train sklearn model for IT test.

Open

#24903, created on January 5, 2023

(19 comments) (0 reactions) (4 assignees) Java (7,313 stars) (4,097 forks) batch import
Labels: P2, good first issue, python, run-inference, task

Description

What needs to happen?

Sklearn does not guarantee backward compatibility for saved models across versions. The Sklearn IT tests use models that were trained manually, so these models can become outdated whenever sklearn is updated.

To address this, we need a script that trains the sklearn models on the data and publishes them to a GCS bucket. The Sklearn IT test can then run against the freshly trained model.
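A minimal sketch of what such a script could look like, not the agreed implementation. The issue does not name the dataset, estimator, or bucket, so the digits dataset, `svm.SVC`, the helper names, and the `sklearn-<version>` path layout are all assumptions chosen for illustration. The upload step assumes Beam's `FileSystems.create` is acceptable for writing to GCS.

```python
import pickle
from pathlib import Path


def versioned_model_path(base: str, sklearn_version: str, name: str = "model.pkl") -> str:
    """Build a destination path that records the sklearn version used for training.

    The path layout is a hypothetical convention, not one defined in the issue.
    """
    return f"{base.rstrip('/')}/sklearn-{sklearn_version}/{name}"


def train_and_save(output_dir: str) -> str:
    """Train a small classifier and pickle it under a version-specific local path."""
    # Imported lazily so the path helper above works without scikit-learn installed.
    import sklearn
    from sklearn import datasets, svm

    # Placeholder dataset and estimator: the issue does not say which to use.
    features, labels = datasets.load_digits(return_X_y=True)
    model = svm.SVC()
    model.fit(features, labels)

    path = Path(versioned_model_path(output_dir, sklearn.__version__))
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as handle:
        pickle.dump(model, handle)
    return str(path)


def upload_to_gcs(local_path: str, gcs_path: str) -> None:
    """Copy the pickled model to GCS through Beam's filesystem abstraction."""
    # Requires apache-beam with the GCP extras installed and valid credentials.
    from apache_beam.io.filesystems import FileSystems

    with open(local_path, "rb") as src, FileSystems.create(gcs_path) as dst:
        dst.write(src.read())
```

Keeping the sklearn version in the object path means a test run could pick the model that matches its installed version. A run would chain the helpers, for example `upload_to_gcs(train_and_save("/tmp/models"), versioned_model_path("gs://<bucket>/<prefix>", sklearn.__version__))`, where the bucket and prefix are placeholders.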

Issue Priority

Priority: 2 (default / most normal work should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
