feathr-ai/feathr
[BUG] get_offline_feature ignores `parquet` output file option
Open
#716 opened on September 29, 2022
bug, good first issue
Description
Willingness to contribute
No. I cannot contribute a bug fix at this time.
Feathr version
0.8.0
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 20.0): both on Linux Ubuntu 20 and Databricks
- Python version: 3.10
- Spark version, if reporting runtime issue:
Describe the problem
get_offline_feature always writes its output as Avro, regardless of the execution configuration.
Tracking information
No response
Code to reproduce bug
Run:
get_offline_feature(
    execution_configurations=SparkExecutionConfiguration({
        "spark.feathr.inputFormat": "parquet",
        "spark.feathr.outputFormat": "parquet",
    }),
    ....
)
The output file is still written as Avro.
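To confirm which format was actually written, independent of the file extension, the output files can be checked by their leading magic bytes: Avro object container files begin with `Obj\x01`, and Parquet files begin with `PAR1`. The following is a minimal sketch, assuming the job output has been downloaded to a local directory. `detect_format` and `summarize_output` are hypothetical helper names for illustration and are not part of Feathr.

```python
from collections import Counter
from pathlib import Path

AVRO_MAGIC = b"Obj\x01"  # header of an Avro object container file
PARQUET_MAGIC = b"PAR1"  # header (and footer) of a Parquet file


def detect_format(path):
    """Return 'avro', 'parquet', or 'unknown' based on the first four bytes."""
    with open(path, "rb") as f:
        head = f.read(4)
    if head == AVRO_MAGIC:
        return "avro"
    if head == PARQUET_MAGIC:
        return "parquet"
    return "unknown"


def summarize_output(directory):
    """Count detected formats for the data files under `directory`."""
    counts = Counter()
    for p in Path(directory).rglob("*"):
        # Skip directories and marker/checksum files such as _SUCCESS or .part-0.crc
        if p.is_file() and not p.name.startswith(("_", ".")):
            counts[detect_format(p)] += 1
    return dict(counts)
```

Calling `summarize_output` on the downloaded output directory (the path is whatever was passed as the job's output location, copied locally) returns a dictionary of format counts; with this bug, the files would be expected to show up under `"avro"` even though `spark.feathr.outputFormat` was set to `"parquet"`.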
What component(s) does this bug affect?
- Python Client: This is the client users use to interact with most of our API. Mostly written in Python.
- Computation Engine: The computation engine that executes the actual feature join and generation work. Mostly in Scala and Spark.
- Feature Registry API: The frontend API layer supports SQL and Purview (Atlas) as storage. The API layer is in Python (FastAPI).
- Feature Registry Web UI: The Web UI for the feature registry. Written in React.