lancedb/lancedb
Feature: to_lance method in asynchronous table python api
Open
#1387 opened on June 19, 2024
Rust, enhancement, good first issue
SDK
Python
Description
Currently, in the Python API, the "to_lance" method, which returns a pyarrow dataset, is only supported on synchronous tables. Could "to_lance" be exposed for asynchronous tables as well?
When using third-party tools such as DuckDB, a pyarrow dataset is preferred over a pyarrow table, because "Arrow Datasets allow for full column selection and filter pushdowns, unlike Arrow tables which are eagerly loaded into memory."
Since the newly introduced Lance v2 is supported only in asynchronous tables, and asynchronous tables do not have a "to_lance" method, we are not able to evaluate features such as filter pushdown when querying Lance v2 tables from DuckDB.