lancedb/lancedb

Feature: to_lance method in the asynchronous table Python API

Open

#1387 opened on June 19, 2024

1 comment, 1 reaction, 0 assignees (repository: 10,303 stars, 876 forks)
Labels: Rust, enhancement, good first issue

SDK

Python

Description

Currently, in the Python API, the "to_lance" method, which returns a pyarrow dataset, is only supported on synchronous tables. Could "to_lance" be exposed for asynchronous tables as well?

When using third-party tools such as DuckDB, a pyarrow dataset is preferred over a pyarrow table because "Arrow Datasets allow for full column selection and filter pushdowns, unlike Arrow tables which are eagerly loaded into memory."

Since the newly introduced Lance v2 format is supported only in asynchronous tables, and asynchronous tables do not have a "to_lance" method, we are unable to evaluate features such as filter pushdown when querying Lance v2 tables from DuckDB.
