delta-io/delta

fix: Java Kernel data skipping uses case-sensitive column matching

Open

#6247 opened on Mar 11, 2026

Labels: bug, good first issue

Description

Delta column names are case-insensitive per the protocol spec ("All column names must be unique regardless of casing"). Delta Spark uses equalsIgnoreCase when resolving predicate column references against the table schema in the data skipping path (via findNestedFieldIgnoreCase).

However, Java Kernel's StatsSchemaHelper uses case-sensitive matching. The Column class uses Arrays.equals(names, other.getNames()) for equality, and the HashMap lookups in StatsSchemaHelper.getLogicalToPhysicalColumnAndDataType() are therefore case-sensitive. This means a predicate like col > 5 will fail to match a schema column named Col, and data skipping will not be applied.
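The lookup miss described above can be illustrated in isolation. This is a minimal sketch, not Kernel code: `ColumnKey` is a hypothetical stand-in that models only the `Arrays.equals`-based equality attributed to `Column` above, and the map is a simplified substitute for the one built in `StatsSchemaHelper`.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class CaseSensitiveLookupDemo {

    // Hypothetical stand-in for Kernel's Column: equality is an exact,
    // case-sensitive comparison of the name parts.
    static final class ColumnKey {
        private final String[] names;

        ColumnKey(String... names) {
            this.names = names.clone();
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof ColumnKey && Arrays.equals(names, ((ColumnKey) o).names);
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(names);
        }
    }

    // Simplified schema map keyed by the exact field name from the table schema.
    static Map<ColumnKey, String> buildSchemaMap() {
        Map<ColumnKey, String> schema = new HashMap<>();
        schema.put(new ColumnKey("Col"), "integer");
        return schema;
    }

    // True only when the predicate's column reference matches a key exactly.
    static boolean canResolve(Map<ColumnKey, String> schema, String... path) {
        return schema.containsKey(new ColumnKey(path));
    }

    public static void main(String[] args) {
        Map<ColumnKey, String> schema = buildSchemaMap();
        System.out.println(canResolve(schema, "Col")); // exact casing: found
        System.out.println(canResolve(schema, "col")); // different casing: not found
    }
}
```

Because `"col"` and `"Col"` are different strings, the second lookup misses even though both names refer to the same column under the protocol's uniqueness rule.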

Steps to reproduce

  1. Create a Delta table with a column named Value (mixed case)
  2. Query with a predicate using a differently-cased column name, e.g., value > 100
  3. Data skipping will not be applied because the column lookup fails

Expected behavior

The data skipping path should match column names case-insensitively, consistent with Delta Spark, which uses equalsIgnoreCase in findNestedFieldIgnoreCase.
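One possible direction, sketched under assumptions rather than taken from the Kernel codebase, is to normalize each name part before it is used as a map key. The class `CaseInsensitiveLookupSketch` and its methods `register` and `resolve` are hypothetical names for illustration. Lowercasing with `Locale.ROOT` is swapped in here for `equalsIgnoreCase` and may behave differently for some non-ASCII characters.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

public class CaseInsensitiveLookupSketch {

    // Keys are lowercased name paths; values are physical column names.
    private final Map<List<String>, String> physicalByLogical = new HashMap<>();

    // Lowercase every part of a (possibly nested) column path with a fixed locale.
    static List<String> normalize(String... path) {
        return Arrays.stream(path)
                .map(part -> part.toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    // Record a schema column under its normalized logical path.
    void register(String physicalName, String... logicalPath) {
        physicalByLogical.put(normalize(logicalPath), physicalName);
    }

    // Resolve a predicate's column reference regardless of its casing.
    Optional<String> resolve(String... logicalPath) {
        return Optional.ofNullable(physicalByLogical.get(normalize(logicalPath)));
    }

    public static void main(String[] args) {
        CaseInsensitiveLookupSketch columns = new CaseInsensitiveLookupSketch();
        columns.register("physical-value", "Value");
        System.out.println(columns.resolve("value").isPresent());   // found despite casing
        System.out.println(columns.resolve("missing").isPresent()); // unknown column
    }
}
```

Normalizing at insertion and at lookup keeps the map lookups constant-time. A linear scan with `equalsIgnoreCase` would mirror the Delta Spark approach more directly, at the cost of scanning the fields on every lookup.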

Relevant code

  • kernel-api/src/main/java/io/delta/kernel/internal/skipping/StatsSchemaHelper.java: builds column maps using exact field names, so the HashMap lookups are case-sensitive
  • kernel/expressions/Column.java: equals() uses Arrays.equals(names, other.getNames()) (case-sensitive)
