fix: Java Kernel data skipping uses case-sensitive column matching
#6247 opened on Mar 11, 2026
Description
Delta column names are case-insensitive per the protocol spec ("All column names must be unique regardless of casing"). Delta Spark uses equalsIgnoreCase when resolving predicate column references against the table schema in the data skipping path (via findNestedFieldIgnoreCase).
However, Java Kernel's StatsSchemaHelper uses case-sensitive matching. The Column class uses Arrays.equals(names, other.getNames()) for equality, and the HashMap lookups in StatsSchemaHelper.getLogicalToPhysicalColumnAndDataType() are therefore case-sensitive. This means a predicate like col > 5 will fail to match a schema column named Col, and data skipping will not be applied.
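The mismatch can be illustrated outside of Kernel with a minimal sketch. `SimpleColumn` below is a hypothetical stand-in, not Kernel's actual `Column` class; it only mirrors the `Arrays.equals`-based equality described above, and `statsColumnFound` is an illustrative helper rather than a Kernel method.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class CaseSensitiveLookupDemo {

    // Hypothetical stand-in for a column reference; equality is an exact,
    // case-sensitive comparison of the name parts, as described in this issue.
    static final class SimpleColumn {
        private final String[] names;

        SimpleColumn(String... names) {
            this.names = names.clone();
        }

        String[] getNames() {
            return names;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof SimpleColumn)) {
                return false;
            }
            return Arrays.equals(names, ((SimpleColumn) o).getNames());
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(names);
        }
    }

    // Builds a one-entry map keyed by the schema column, then looks up the
    // column name as written in the predicate.
    public static boolean statsColumnFound(String schemaName, String predicateName) {
        Map<SimpleColumn, String> columnTypes = new HashMap<>();
        columnTypes.put(new SimpleColumn(schemaName), "long");
        return columnTypes.containsKey(new SimpleColumn(predicateName));
    }

    public static void main(String[] args) {
        System.out.println(statsColumnFound("Col", "Col")); // prints "true"
        System.out.println(statsColumnFound("Col", "col")); // prints "false"
    }
}
```

With exact-match keys, the second lookup misses even though `Col` and `col` refer to the same column under the protocol's casing rule.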
Steps to reproduce
- Create a Delta table with a column named `Value` (mixed case)
- Query with a predicate using a differently-cased column name, e.g., `value > 100`
- Data skipping will not be applied because the column lookup fails
Expected behavior
Case-insensitive column matching in the data skipping path, consistent with Delta Spark which uses equalsIgnoreCase in findNestedFieldIgnoreCase.
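One possible shape for such a lookup, as a sketch only: key the map by a lowercased copy of the column path, so that registration and resolution agree regardless of casing. The class name, `register`, `resolve`, and the physical names used here are all hypothetical and not part of Kernel. Lowercasing with `Locale.ROOT` is an assumption that may differ from `equalsIgnoreCase` for a few unusual characters. Because the protocol requires column names to be unique regardless of casing, normalized keys should not collide for a valid schema.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

public class CaseInsensitiveLookupSketch {

    // Maps a normalized (lowercased) logical column path to a physical name.
    private final Map<List<String>, String> physicalByLogical = new HashMap<>();

    // Lowercases every part of the path; List supplies element-wise
    // equals/hashCode, so the result can be used directly as a map key.
    static List<String> normalize(String... path) {
        List<String> key = new ArrayList<>(path.length);
        for (String part : path) {
            key.add(part.toLowerCase(Locale.ROOT));
        }
        return key;
    }

    // Hypothetical registration step, e.g. while walking the table schema.
    public void register(String physicalName, String... logicalPath) {
        physicalByLogical.put(normalize(logicalPath), physicalName);
    }

    // Resolves a predicate's column path regardless of its casing.
    public Optional<String> resolve(String... logicalPath) {
        return Optional.ofNullable(physicalByLogical.get(normalize(logicalPath)));
    }

    public static void main(String[] args) {
        CaseInsensitiveLookupSketch index = new CaseInsensitiveLookupSketch();
        index.register("col-1", "Value");
        index.register("col-2", "Address", "City");

        System.out.println(index.resolve("value").orElse("<not found>"));           // prints "col-1"
        System.out.println(index.resolve("address", "CITY").orElse("<not found>")); // prints "col-2"
        System.out.println(index.resolve("missing").orElse("<not found>"));         // prints "<not found>"
    }
}
```

Normalizing the key leaves `Column.equals()` itself untouched, which may matter if other code paths rely on exact equality; changing equality directly would be a broader change.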
Relevant code
- `kernel-api/src/main/java/io/delta/kernel/internal/skipping/StatsSchemaHelper.java`: builds column maps using exact field names, so its HashMap lookups are case-sensitive
- `kernel/expressions/Column.java`: `equals()` uses `Arrays.equals(names, other.getNames())` (case-sensitive)
References
- Delta Rust Kernel fix: https://github.com/delta-io/delta-kernel-rs/pull/2055
- Delta Spark implementation: `findNestedFieldIgnoreCase` in `SchemaUtils.scala` uses `equalsIgnoreCase`