pola-rs/polars

scan_delta file skip predicate not working for bool dtype

Open

#26290 opened on Jan 26, 2026

Labels: A-io-delta, A-io-parquet, P-low, enhancement, good first issue, performance, python, upstream issue

Description

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl

tmp_path = "./tmp"
df = pl.DataFrame(
    {
        "p": [10, 10, 20, 20],
        "a": [1, 2, 3, None],
        "b": [False, False, True, None]
    }
)

df.write_delta(
    tmp_path,
    delta_write_options={"partition_by": "p"},
)

# file skipping works (1 / 2 files skipped)
expr = pl.col.a.is_null()
out = pl.scan_delta(tmp_path).filter(expr).collect()

# file skipping works (1 / 2 files skipped)
expr = pl.col.b.is_null()
out = pl.scan_delta(tmp_path).filter(expr).collect()

# file skipping does not work (0 / 2 files skipped)
expr = pl.col.b == pl.lit(False)
out = pl.scan_delta(tmp_path).filter(expr).collect()

Log output

$ POLARS_VERBOSE=1 pp issue_bool_mre.py 2>&1  | grep skipping
initialize_scan_predicate: Predicate pushdown allows skipping 1 / 2 files
initialize_scan_predicate: Predicate pushdown allows skipping 1 / 2 files
initialize_scan_predicate: Predicate pushdown allows skipping 0 / 2 files

Issue description

An equality comparison on a boolean column does not trigger file skipping in the predicate pushdown of scan_delta, whereas is_null() on the same column does.

Expected behavior

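There is no clear reason why this would not be supported: the Delta log stores minValues and maxValues for b, and for the p=20 file both are true, so that file could be skipped for b == False. Snapshot of the Delta JSON log file with statistics: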

{"add":{"path":"p=10/part-00000-059a7e15-37b4-450c-acf9-c52cb10d4c59-c000.snappy.parquet","partitionValues":{"p":"10"},"size":688,"modificationTime":1769434619583,"dataChange":true,"stats":"{\"numRecords\":2,\"minValues\":{\"a\":1,\"b\":false},\"maxValues\":{\"b\":false,\"a\":2},\"nullCount\":{\"a\":0,\"b\":0}}","tags":null,"baseRowId":null,"defaultRowCommitVersion":null,"clusteringProvider":null}}

{"add":{"path":"p=20/part-00000-d0d5a29b-6f86-44d5-ae3a-794d33c73da2-c000.snappy.parquet","partitionValues":{"p":"20"},"size":677,"modificationTime":1769434619583,"dataChange":true,"stats":"{\"numRecords\":2,\"minValues\":{\"b\":true,\"a\":3},\"maxValues\":{\"b\":true,\"a\":3},\"nullCount\":{\"b\":1,\"a\":1}}","tags":null,"baseRowId":null,"defaultRowCommitVersion":null,"clusteringProvider":null}}

To be confirmed.
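
To illustrate why skipping looks feasible, here is a minimal sketch of a min/max check for an equality predicate, run against the two stats strings shown above. It is not Polars' actual implementation. The helper name `can_skip_for_equality` is hypothetical, and the sketch assumes that a file with missing min/max statistics must always be read.

```python
import json

# Per-file statistics copied from the two "add" actions above.
# In the Delta log the "stats" field is itself a JSON-encoded string.
STATS_P10 = '{"numRecords":2,"minValues":{"a":1,"b":false},"maxValues":{"b":false,"a":2},"nullCount":{"a":0,"b":0}}'
STATS_P20 = '{"numRecords":2,"minValues":{"b":true,"a":3},"maxValues":{"b":true,"a":3},"nullCount":{"b":1,"a":1}}'


def can_skip_for_equality(stats_json: str, column: str, value) -> bool:
    """Hypothetical helper: True if no row in the file can satisfy `column == value`."""
    stats = json.loads(stats_json)
    lo = stats.get("minValues", {}).get(column)
    hi = stats.get("maxValues", {}).get(column)
    if lo is None or hi is None:
        # Assumption: without min/max statistics, be conservative and read the file.
        return False
    # Python orders False < True, so the same range check covers booleans.
    # Null rows never satisfy an equality predicate, so they do not block skipping.
    return value < lo or value > hi


skip_p10 = can_skip_for_equality(STATS_P10, "b", False)
skip_p20 = can_skip_for_equality(STATS_P20, "b", False)
print(f"skip p=10: {skip_p10}, skip p=20: {skip_p20}")
```

Under these assumptions the p=20 file is skippable for `b == False`, which would match the "1 / 2 files" result already seen for the `is_null()` predicates.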

Installed versions

Latest main (803b8e4cb1), post 1.37.1
