feat: add to_sedonadb() method · apache/sedona#2511

help wanted

It would be nice to have an interface that converts a SedonaSpark DataFrame to a SedonaDB DataFrame easily. Here is a current solution that works:

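import sedona.db
# Assumed import path for dataframe_to_arrow; it may differ by Sedona version.
from sedona.spark import dataframe_to_arrow

# Open a SedonaDB context.
sd = sedona.db.connect()

# Convert the Spark DataFrame to Arrow, then load it into SedonaDB.
df = sd.create_data_frame(dataframe_to_arrow(spark_df))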

This could be nice:

spark_df.to_sedonadb()

But maybe we'd have to pass the SedonaDB connection explicitly, like this:

spark_df.to_sedonadb(sd)
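
Both call forms could be supported with a small monkey-patch. The following is only a sketch under stated assumptions: `install_to_sedonadb`, `to_arrow`, and `connect` are hypothetical names used for illustration and are not part of Sedona's API. The converter and the connection factory are passed in as arguments so the helper does not hard-code any import path.

```python
from typing import Any, Callable, Optional


def install_to_sedonadb(
    df_class: type,
    to_arrow: Callable[[Any], Any],
    connect: Optional[Callable[[], Any]] = None,
) -> None:
    """Attach a to_sedonadb() method to df_class (hypothetical helper).

    df_class: the DataFrame class to patch, e.g. pyspark.sql.DataFrame.
    to_arrow: converts a DataFrame to Arrow, e.g. dataframe_to_arrow.
    connect:  optional factory for a default SedonaDB context,
              e.g. sedona.db.connect.
    """

    def to_sedonadb(self, sd=None):
        # Use the explicit context if given, as in spark_df.to_sedonadb(sd).
        if sd is None:
            if connect is None:
                raise ValueError(
                    "pass a SedonaDB context or install with a connect factory"
                )
            # Otherwise open a default context, as in spark_df.to_sedonadb().
            sd = connect()
        # Same two steps as the current workaround: Arrow, then SedonaDB.
        return sd.create_data_frame(to_arrow(self))

    df_class.to_sedonadb = to_sedonadb


# Possible wiring (not executed here; the import paths are assumptions):
# import sedona.db
# from pyspark.sql import DataFrame
# from sedona.spark import dataframe_to_arrow
# install_to_sedonadb(DataFrame, dataframe_to_arrow, connect=sedona.db.connect)
# df = spark_df.to_sedonadb()
```

A built-in version would presumably live in the Sedona Python package rather than be patched in at runtime, but the explicit-versus-default context question would be the same.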

This would allow for cool spatial workflows, like this:

1. Read an Iceberg table with SedonaSpark, run the big data operations, and finish with a filter that makes the data small enough to fit on a single machine
2. Convert the SedonaSpark DataFrame to SedonaDB
3. Use a library that's compatible with SedonaDB, like lonboard, to create a graph

Let me know what you think!

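Investigation approach: Examine the SedonaDB Python client to understand how to create a DataFrame from an Arrow table, then investigate how to add a method to the SedonaSpark DataFrame class (monkey-patching or contributing to the code).
Tech stack: python, scala
Area: backend, data
Issue type: feature
Difficulty: 2
Estimated time: half a day
Activity status: active
Clarity: clear
Prerequisites: Python, Apache Spark, Sedona
Beginner-friendliness: 70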