fix: Java Kernel data skipping uses case-sensitive column matching · delta-io/delta#6247

Repository metrics

Stars: 8,807
PR merge metrics: average merge time 7d 1h; 142 PRs merged in the last 30 days

Description

Delta column names are case-insensitive per the protocol spec ("All column names must be unique regardless of casing"). Delta Spark uses equalsIgnoreCase when resolving predicate column references against the table schema in the data skipping path (via findNestedFieldIgnoreCase).

However, Java Kernel's StatsSchemaHelper uses case-sensitive matching. The Column class uses Arrays.equals(names, other.getNames()) for equality, and the HashMap lookups in StatsSchemaHelper.getLogicalToPhysicalColumnAndDataType() are therefore case-sensitive. This means a predicate like col > 5 will fail to match a schema column named Col, and data skipping will not be applied.
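The failure mode can be reproduced outside Kernel with a small sketch. `ColumnKey` below is a hypothetical stand-in, not Kernel's actual `Column` class; it only models the `Arrays.equals`-based equality described above, with a matching `hashCode`, and the single-entry schema map is made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Kernel's Column: equality is an exact,
// case-sensitive comparison of the name parts.
final class ColumnKey {
    private final String[] names;

    ColumnKey(String... names) {
        this.names = names.clone();
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ColumnKey)) return false;
        return Arrays.equals(names, ((ColumnKey) o).names);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(names);
    }
}

class CaseSensitiveLookupDemo {
    // Illustrative schema map with one column named "Col".
    static Map<ColumnKey, String> buildSchemaMap() {
        Map<ColumnKey, String> schema = new HashMap<>();
        schema.put(new ColumnKey("Col"), "integer");
        return schema;
    }

    // Returns the data type for the referenced column, or null if no key matches.
    static String lookup(String... path) {
        return buildSchemaMap().get(new ColumnKey(path));
    }

    public static void main(String[] args) {
        System.out.println(lookup("Col")); // prints "integer"
        System.out.println(lookup("col")); // prints "null": the predicate column is not found
    }
}
```

When the lookup returns null, the predicate cannot be tied to a stats column, which is the point at which data skipping is silently dropped.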

Steps to reproduce

1. Create a Delta table with a column named Value (mixed case).
2. Query with a predicate that uses a differently cased column name, e.g., value > 100.
3. Data skipping is not applied because the column lookup fails.

Expected behavior

The data skipping path should match column names case-insensitively, consistent with Delta Spark, which uses equalsIgnoreCase in findNestedFieldIgnoreCase.
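One possible way to get this behavior from a HashMap is to normalize every name part before it is used as a key. The sketch below is only an illustration under that assumption and is not the actual fix: the class, the `normalize` helper, and the sample columns are all hypothetical. It lower-cases with `Locale.ROOT`, which can differ from `equalsIgnoreCase` for a few unusual Unicode characters.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class CaseInsensitiveLookupSketch {
    // Hypothetical helper: lower-cases each part of a (possibly nested) column path
    // so that "Value" and "value" produce the same map key.
    static List<String> normalize(String... path) {
        List<String> key = new ArrayList<>(path.length);
        for (String part : path) {
            key.add(part.toLowerCase(Locale.ROOT));
        }
        return key;
    }

    // Illustrative schema map keyed by normalized paths; values are data types.
    static Map<List<String>, String> buildSchemaMap() {
        Map<List<String>, String> schema = new HashMap<>();
        schema.put(normalize("Value"), "long");
        schema.put(normalize("Address", "ZipCode"), "string");
        return schema;
    }

    // Returns the data type for the referenced column, or null if it is unknown.
    static String lookup(String... path) {
        return buildSchemaMap().get(normalize(path));
    }

    public static void main(String[] args) {
        System.out.println(lookup("value"));              // prints "long"
        System.out.println(lookup("address", "zipcode")); // prints "string"
        System.out.println(lookup("missing"));            // prints "null"
    }
}
```

A real change would likely also need to keep the original-cased name as the map value, so that the physical column name written to the stats schema is unchanged.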

Relevant code

kernel-api/src/main/java/io/delta/kernel/internal/skipping/StatsSchemaHelper.java: builds column maps using exact field names, so its HashMap lookups are case-sensitive
kernel/expressions/Column.java: equals() uses Arrays.equals(names, other.getNames()), which is case-sensitive

References

Delta Rust Kernel fix: https://github.com/delta-io/delta-kernel-rs/pull/2055
Delta Spark implementation: findNestedFieldIgnoreCase in SchemaUtils.scala uses equalsIgnoreCase

Contributor guide

Research direction: Inspect the failing test and modify `Column.java` and `StatsSchemaHelper.java` to use case-insensitive column matching.
Tech stack: Java
Domain: backend, data
Issue type: Bug
Difficulty: 2
Estimated time: 1-3 hours
Activity status: Active
Clarity: Clear
Prerequisites: Git, Java
Newbie friendliness: 75
