pola-rs/polars
scan_delta file skip predicate not working for bool dtype
Open
#26,290 opened on January 26, 2026
Labels: A-io-delta, A-io-parquet, P-low, enhancement, good first issue, performance, python, upstream issue
Description
Checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of Polars.
Reproducible example
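import polars as pl

tmp_path = "./tmp"

df = pl.DataFrame(
    {
        "p": [10, 10, 20, 20],
        "a": [1, 2, 3, None],
        "b": [False, False, True, None],
    }
)
# Partition by "p" so the table is written as two files, each with its own statistics.
df.write_delta(
    tmp_path,
    delta_write_options={"partition_by": "p"},
)

# is_null on an integer column: file skipping works (1 / 2 files skipped)
expr = pl.col.a.is_null()
out = pl.scan_delta(tmp_path).filter(expr).collect()

# is_null on a boolean column: file skipping works (1 / 2 files skipped)
expr = pl.col.b.is_null()
out = pl.scan_delta(tmp_path).filter(expr).collect()

# equality on a boolean column: file skipping does not work (0 / 2 files skipped)
expr = pl.col.b == pl.lit(False)
out = pl.scan_delta(tmp_path).filter(expr).collect()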
Log output
$ POLARS_VERBOSE=1 pp issue_bool_mre.py 2>&1 | grep skipping
initialize_scan_predicate: Predicate pushdown allows skipping 1 / 2 files
initialize_scan_predicate: Predicate pushdown allows skipping 1 / 2 files
initialize_scan_predicate: Predicate pushdown allows skipping 0 / 2 files
Issue description
Equality comparisons on boolean columns do not lead to any file skipping in predicate pushdown for Delta tables, even though is_null predicates on the same column do.
Expected behavior
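Files should be skipped for boolean equality predicates as well. There is no clear reason why this would not be supported, since the Delta log already records min/max values and null counts for the boolean column. Snapshot of the Delta JSON log file with statistics: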
{"add":{"path":"p=10/part-00000-059a7e15-37b4-450c-acf9-c52cb10d4c59-c000.snappy.parquet","partitionValues":{"p":"10"},"size":688,"modificationTime":1769434619583,"dataChange":true,"stats":"{\"numRecords\":2,\"minValues\":{\"a\":1,\"b\":false},\"maxValues\":{\"b\":false,\"a\":2},\"nullCount\":{\"a\":0,\"b\":0}}","tags":null,"baseRowId":null,"defaultRowCommitVersion":null,"clusteringProvider":null}}
{"add":{"path":"p=20/part-00000-d0d5a29b-6f86-44d5-ae3a-794d33c73da2-c000.snappy.parquet","partitionValues":{"p":"20"},"size":677,"modificationTime":1769434619583,"dataChange":true,"stats":"{\"numRecords\":2,\"minValues\":{\"b\":true,\"a\":3},\"maxValues\":{\"b\":true,\"a\":3},\"nullCount\":{\"b\":1,\"a\":1}}","tags":null,"baseRowId":null,"defaultRowCommitVersion":null,"clusteringProvider":null}}
To be confirmed.
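To illustrate why the statistics above look sufficient, here is a minimal sketch of a min/max skip check for `b == False`. It is not the Polars implementation: the helper `can_skip_for_equality` and the `file_stats` dict are hypothetical names used only for this example, and the stats are copied from the two "add" actions above, reduced to column "b".

```python
import json

# Per-file stats copied from the Delta log entries above (only column "b" kept).
file_stats = {
    "p=10": json.loads(
        '{"numRecords":2,"minValues":{"b":false},'
        '"maxValues":{"b":false},"nullCount":{"b":0}}'
    ),
    "p=20": json.loads(
        '{"numRecords":2,"minValues":{"b":true},'
        '"maxValues":{"b":true},"nullCount":{"b":1}}'
    ),
}


def can_skip_for_equality(stats, column, value):
    """Hypothetical helper: True if no row in the file can satisfy column == value."""
    # A file whose column is entirely null can never match an equality predicate.
    if stats["nullCount"].get(column) == stats["numRecords"]:
        return True
    lo = stats["minValues"].get(column)
    hi = stats["maxValues"].get(column)
    if lo is None or hi is None:
        # Missing statistics: the file has to be read.
        return False
    # Python orders booleans as False < True, so the usual range check applies.
    return not (lo <= value <= hi)


skippable = {
    name: can_skip_for_equality(stats, "b", False)
    for name, stats in file_stats.items()
}
print(skippable)
```

Under these assumptions the p=20 file (min and max both true) cannot contain `b == False` and could be skipped, while the p=10 file has to be read, which would correspond to skipping 1 / 2 files instead of the 0 / 2 reported in the log.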
Installed versions
Latest main (803b8e4cb1), post 1.37.1