pola-rs/polars
scan_delta file skip predicate not working for bool dtype
Open
#26290 opened on Jan 26, 2026
Labels: A-io-delta, A-io-parquet, P-low, enhancement, good first issue, performance, python, upstream issue
Description
Checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of Polars.
Reproducible example
import polars as pl
tmp_path = "./tmp"
df = pl.DataFrame(
    {
        "p": [10, 10, 20, 20],
        "a": [1, 2, 3, None],
        "b": [False, False, True, None],
    }
)
df.write_delta(
    tmp_path,
    delta_write_options={"partition_by": "p"},
)
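
# is_null on an integer column: file skipping works (1 / 2 files skipped)
expr = pl.col.a.is_null()
out = pl.scan_delta(tmp_path).filter(expr).collect()

# is_null on a boolean column: file skipping works (1 / 2 files skipped)
expr = pl.col.b.is_null()
out = pl.scan_delta(tmp_path).filter(expr).collect()

# equality on a boolean column: file skipping does not work (0 / 2 files skipped)
expr = pl.col.b == pl.lit(False)
out = pl.scan_delta(tmp_path).filter(expr).collect()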
Log output
$ POLARS_VERBOSE=1 pp issue_bool_mre.py 2>&1 | grep skipping
initialize_scan_predicate: Predicate pushdown allows skipping 1 / 2 files
initialize_scan_predicate: Predicate pushdown allows skipping 1 / 2 files
initialize_scan_predicate: Predicate pushdown allows skipping 0 / 2 files
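To turn the log output above into a check, the skip counts can be pulled out of the captured stderr. This is only a sketch: the helper name `parse_skip_counts` is hypothetical, and the pattern assumes the verbose line keeps the exact wording shown above, which may change between Polars versions.

```python
import re

# Pattern taken from the verbose lines shown above; the wording is an
# assumption and may differ in other Polars versions.
SKIP_RE = re.compile(r"Predicate pushdown allows skipping (\d+) / (\d+) files")


def parse_skip_counts(log_text: str) -> list[tuple[int, int]]:
    """Return one (skipped, total) pair per matching log line (hypothetical helper)."""
    return [(int(skipped), int(total)) for skipped, total in SKIP_RE.findall(log_text)]


# The three lines from the log output above, one per filter in the example.
LOG = """\
initialize_scan_predicate: Predicate pushdown allows skipping 1 / 2 files
initialize_scan_predicate: Predicate pushdown allows skipping 1 / 2 files
initialize_scan_predicate: Predicate pushdown allows skipping 0 / 2 files
"""

counts = parse_skip_counts(LOG)
for label, (skipped, total) in zip(["a.is_null()", "b.is_null()", "b == False"], counts):
    print(f"{label}: skipped {skipped} of {total} files")
```

The last pair is the one this issue is about: no file is skipped for the boolean equality filter.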
Issue description
Equality comparisons on boolean columns do not trigger file skipping in predicate pushdown for scan_delta, whereas is_null on the same boolean column does.
Expected behavior
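Boolean equality should allow file skipping; there is no clear reason why it would not be supported. In the statistics below, the p=20 file has minValues and maxValues for b both equal to true, so it cannot contain a row with b == False and could be skipped. Snapshot of the Delta log JSON file with statistics: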
{"add":{"path":"p=10/part-00000-059a7e15-37b4-450c-acf9-c52cb10d4c59-c000.snappy.parquet","partitionValues":{"p":"10"},"size":688,"modificationTime":1769434619583,"dataChange":true,"stats":"{\"numRecords\":2,\"minValues\":{\"a\":1,\"b\":false},\"maxValues\":{\"b\":false,\"a\":2},\"nullCount\":{\"a\":0,\"b\":0}}","tags":null,"baseRowId":null,"defaultRowCommitVersion":null,"clusteringProvider":null}}
{"add":{"path":"p=20/part-00000-d0d5a29b-6f86-44d5-ae3a-794d33c73da2-c000.snappy.parquet","partitionValues":{"p":"20"},"size":677,"modificationTime":1769434619583,"dataChange":true,"stats":"{\"numRecords\":2,\"minValues\":{\"b\":true,\"a\":3},\"maxValues\":{\"b\":true,\"a\":3},\"nullCount\":{\"b\":1,\"a\":1}}","tags":null,"baseRowId":null,"defaultRowCommitVersion":null,"clusteringProvider":null}}
To be confirmed.
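As a rough illustration of why the statistics above appear sufficient, the following standalone sketch applies a min/max check for equality to the two `stats` payloads from the snapshot. It is not how Polars implements skipping; the function name `can_skip_eq` and the shortened file keys `p=10` and `p=20` are hypothetical and exist only for this example.

```python
import json

# "stats" strings copied from the two add actions above, keyed by partition
# directory (shortened keys are for illustration only).
STATS_BY_FILE = {
    "p=10": '{"numRecords":2,"minValues":{"a":1,"b":false},"maxValues":{"b":false,"a":2},"nullCount":{"a":0,"b":0}}',
    "p=20": '{"numRecords":2,"minValues":{"b":true,"a":3},"maxValues":{"b":true,"a":3},"nullCount":{"b":1,"a":1}}',
}


def can_skip_eq(stats: dict, column: str, value) -> bool:
    """Return True if the stats prove no row satisfies `column == value`.

    Hypothetical helper: a conservative min/max check, not Polars internals.
    """
    # If every row is null, an equality comparison can never be true.
    if stats.get("nullCount", {}).get(column) == stats["numRecords"]:
        return True
    lo = stats.get("minValues", {}).get(column)
    hi = stats.get("maxValues", {}).get(column)
    if lo is None or hi is None:
        return False  # no statistics for this column: the file must be read
    # Python orders booleans as False < True, so the same range check applies.
    return value < lo or value > hi


skippable = [
    path
    for path, raw in STATS_BY_FILE.items()
    if can_skip_eq(json.loads(raw), "b", False)
]
print(skippable)
```

Under these assumptions only the `p=20` file is skippable for `b == False`, which would correspond to skipping 1 / 2 files instead of the 0 / 2 reported in the log.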
Installed versions
Latest main (803b8e4cb1), post 1.37.1