trinodb/trino

ClassCastException when writing Delta Lake checkpoint for Decimal columns (JSON Numeric vs String)

Open

#28532 opened on Mar 4, 2026

Labels: delta-lake, good first issue

Description

When Trino attempts to write a Delta Lake checkpoint for a table containing DECIMAL columns, it may fail with a java.lang.ClassCastException: class java.lang.Double cannot be cast to class java.lang.String.

This occurs when the preceding transaction logs (.json files) were authored by Spark, which sometimes serializes DECIMAL statistics as JSON Numbers (e.g., 36.17) rather than JSON Strings (e.g., "36.17"). Trino's CheckpointWriter strictly expects these values to be Strings during the parsing phase.

Steps to Reproduce

Create a Delta table using Spark SQL with a DECIMAL column.

CREATE TABLE delta_part_for_insert (
    order_id BIGINT,
    customer_id BIGINT,
    order_amount DECIMAL(10, 2),
    order_ts TIMESTAMP
) USING delta
PARTITIONED BY (order_ts);

Insert data into the table using Spark. Spark may write the stats field in the JSON commit file as follows:

"stats": "{\"numRecords\":1,\"minValues\":{\"order_amount\":36.17},\"maxValues\":{\"order_amount\":36.17},...}"
Note: 36.17 is a JSON Number, not a String.

Perform an operation in Trino that triggers a checkpoint (e.g., multiple INSERT statements until the checkpoint interval is reached).

Trino fails during the finishInsert phase while writing the checkpoint Parquet file.

Error Stacktrace

java.lang.ClassCastException: class java.lang.Double cannot be cast to class java.lang.String (java.lang.Double and java.lang.String are in module java.base of loader 'bootstrap')
    at io.trino.plugin.deltalake.transactionlog.DeltaLakeParquetStatisticsUtils.jsonValueToTrinoValue(DeltaLakeParquetStatisticsUtils.java:134)
    at io.trino.plugin.deltalake.transactionlog.checkpoint.CheckpointWriter.lambda$preprocessMinMaxValues$21(CheckpointWriter.java:503)
    ...
    at io.trino.plugin.deltalake.transactionlog.checkpoint.CheckpointWriter.write(CheckpointWriter.java:169)
    at io.trino.plugin.deltalake.transactionlog.checkpoint.CheckpointWriterManager.writeCheckpoint(CheckpointWriterManager.java:170)

Root Cause Analysis

In DeltaLakeParquetStatisticsUtils.jsonValueToTrinoValue, the logic for DecimalType assumes the jsonValue retrieved from the Jackson-parsed map is always a String:

if (type instanceof DecimalType decimalType) {
    BigDecimal decimal = new BigDecimal((String) jsonValue); // <--- Fails here if jsonValue is Double/Integer
    ...
}

While the Delta Lake protocol suggests statistics should be strings, Spark sometimes writes DECIMAL statistics as JSON numbers in the transaction log, as in the stats example above. Jackson then deserializes such a value as a Double or Integer rather than a String, so the cast fails. Trino should be more resilient by handling both String and Number types.
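The failure mode can be reproduced outside Trino with a few lines of plain Java. The sketch below is not Trino code: the class `DecimalStatCastRepro` and the method `parseAssumingString` are hypothetical names, and the two maps stand in for what a JSON parser would typically produce for `"36.17"` and `36.17` respectively.

```java
import java.math.BigDecimal;
import java.util.Map;

// Hypothetical stand-alone reproduction; this is not Trino code.
class DecimalStatCastRepro {
    // Mirrors the cast described above: assumes the parsed value is always a String.
    static BigDecimal parseAssumingString(Object jsonValue) {
        return new BigDecimal((String) jsonValue);
    }

    public static void main(String[] args) {
        // Assumption: these maps imitate parsed minValues entries.
        // A quoted value arrives as a String, an unquoted one as a Double.
        Map<String, Object> quoted = Map.of("order_amount", "36.17");
        Map<String, Object> unquoted = Map.of("order_amount", 36.17);

        // Works: the value is a String.
        System.out.println(parseAssumingString(quoted.get("order_amount")));

        try {
            // Fails: the value is a Double, so the (String) cast throws.
            parseAssumingString(unquoted.get("order_amount"));
        }
        catch (ClassCastException e) {
            System.out.println("ClassCastException: " + e.getMessage());
        }
    }
}
```

The exception is raised by the cast itself, before `BigDecimal` ever sees the value, which matches the top frame of the stack trace.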

Proposed Fix

Modify jsonValueToTrinoValue to use String.valueOf(jsonValue) or perform an explicit type check before casting:

if (type instanceof DecimalType decimalType) {
    // Accept both JSON strings ("36.17") and JSON numbers (36.17)
    String stringValue = jsonValue instanceof String string ? string : String.valueOf(jsonValue);
    BigDecimal decimal = new BigDecimal(stringValue);
    ...
}
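If an explicit type check is preferred over `String.valueOf`, the conversion could be isolated in a small helper. The following is only a sketch under assumptions: `DecimalStatValues.toBigDecimal` is a hypothetical name, not an existing Trino method, and the set of accepted runtime types is a guess at what a JSON parser might hand over.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper for illustration; the name and placement are assumptions.
class DecimalStatValues {
    static BigDecimal toBigDecimal(Object jsonValue) {
        if (jsonValue instanceof String string) {
            // The form Trino already handles: "36.17"
            return new BigDecimal(string);
        }
        if (jsonValue instanceof BigDecimal decimal) {
            return decimal;
        }
        if (jsonValue instanceof BigInteger integer) {
            return new BigDecimal(integer);
        }
        if (jsonValue instanceof Integer || jsonValue instanceof Long) {
            // Whole numbers such as 36
            return BigDecimal.valueOf(((Number) jsonValue).longValue());
        }
        if (jsonValue instanceof Number number) {
            // Double/Float: parse the decimal string form instead of using
            // new BigDecimal(double), which would expose binary rounding noise.
            return new BigDecimal(number.toString());
        }
        // Anything else (including null) is rejected explicitly.
        throw new IllegalArgumentException("Unsupported decimal statistics value: " + jsonValue);
    }
}
```

One caveat applies to either variant: once a value has been parsed into a Double, digits beyond double precision are already gone, so statistics for high-precision DECIMAL columns may be inexact. Jackson's `DeserializationFeature.USE_BIG_DECIMAL_FOR_FLOATS` avoids that at parse time, though whether it suits Trino's parsing path is not established here.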

Affected Versions

Tested on Trino 475 (and current master).
