cube-js/cube

Cube crashes when storing pre-aggregation with duckdb (type `hugeInt`)

Open

#7127 opened on Sep 12, 2023

Labels: driver:duckdb, help wanted, pre-aggregations

Description

Describe the bug

Cube Store does not support the HUGEINT type used by DuckDB.

I first encountered the bug with on-prem Cube, but I have been able to reproduce it on Cube Cloud. Both the cube files and the screenshots come from the cloud version.

When using Cube with the DuckDB driver, Cube fails to create an (external) pre-aggregation.

To Reproduce

Steps to reproduce the behavior:

  1. Create a Parquet file on S3 containing at least one integer column.
  2. Connect Cube to the S3 account.
  3. Create a schema where one measure is a sum over the integer column.
  4. Add the measure to a pre-aggregation.
  5. Cube fails when it tries to create the pre-aggregation table, because the table needs a HUGEINT column to store the data.

I have also created a small cube definition showcasing the bug.

Expected behavior

The driver should cast HUGEINT to the corresponding MySQL type.
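To make the expected behavior concrete, here is a minimal sketch of the kind of type mapping I have in mind. It is not taken from the Cube codebase: `toGenericType`, `columnDefinitions`, and the mapping table are hypothetical names, and choosing `bigint` as the target for HUGEINT is an assumption (values outside the 64-bit range would not fit). The column names and `varchar(255)` come from the failing statement in the stack trace.

```typescript
// Hypothetical sketch: none of these names exist in the Cube codebase.
// Maps DuckDB column types to types assumed to be accepted by Cube Store.
const DUCKDB_TO_GENERIC = new Map<string, string>([
  // Assumption: HUGEINT (128-bit) is narrowed to bigint (64-bit).
  ["hugeint", "bigint"],
  ["bigint", "bigint"],
  ["integer", "int"],
  ["varchar", "varchar(255)"],
]);

function toGenericType(duckdbType: string): string {
  const key = duckdbType.trim().toLowerCase();
  // Unknown types pass through unchanged, so they still fail loudly downstream.
  return DUCKDB_TO_GENERIC.get(key) ?? duckdbType;
}

interface Column {
  name: string;
  type: string;
}

// Builds the column list of a CREATE TABLE statement using the mapped types.
function columnDefinitions(columns: Column[]): string {
  return columns
    .map((c) => `\`${c.name}\` ${toGenericType(c.type)}`)
    .join(", ");
}

const ddl = columnDefinitions([
  { name: "spam__org", type: "varchar" },
  { name: "spam__metric", type: "HUGEINT" },
]);
console.log(ddl);
```

With a mapping like this, the pre-aggregation table would be declared with a type other than HUGEINT for `spam__metric`, which is the column that currently triggers the error.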

Screenshots

Schema: image

Playground: image

Pre-aggregation error: image

Stack trace

Error: Error during create table: CREATE TABLE prod_pre_aggregations.spam_main_w1hqvvny_l2pn4q2e_1ig1fr6 (`spam__org` varchar(255), `spam__metric` HUGEINT) LOCATION ?: Custom type 'HUGEINT' is not supported
    at WebSocket.<anonymous> (/cube/node_modules/@cubejs-backend/cubestore-driver/src/WebSocketConnection.ts:121:30)
    at WebSocket.emit (node:events:513:28)
    at Receiver.receiverOnMessage (/cube/node_modules/ws/lib/websocket.js:1008:20)
    at Receiver.emit (node:events:513:28)
    at Receiver.dataMessage (/cube/node_modules/ws/lib/receiver.js:502:14)
    at Receiver.getData (/cube/node_modules/ws/lib/receiver.js:435:17)
    at Receiver.startLoop (/cube/node_modules/ws/lib/receiver.js:143:22)
    at Receiver._write (/cube/node_modules/ws/lib/receiver.js:78:10)
    at writeOrBuffer (node:internal/streams/writable:391:12)
    at _write (node:internal/streams/writable:332:10)
    at Receiver.Writable.write (node:internal/streams/writable:336:10)
    at TLSSocket.socketOnData (/cube/node_modules/ws/lib/websocket.js:1102:35)
    at TLSSocket.emit (node:events:513:28)
    at addChunk (node:internal/streams/readable:315:12)
    at readableAddChunk (node:internal/streams/readable:289:9)
    at TLSSocket.Readable.push (node:internal/streams/readable:228:10)
    at TLSWrap.onStreamRead (node:internal/stream_base_commons:190:23)

Minimally reproducible Cube Schema

cubes: 
  - name: spam
    pre_aggregations:
      - name: main
        dimensions:
          - CUBE.org
        measures:
          - CUBE.metric

    sql: SELECT 'foo' AS org, 10::hugeInt AS metric
    dimensions:
      - name: org
        sql: org
        type: string
        primary_key: true
    measures:
      - name: metric
        sql: metric
        type: sum

Version: v0.33.53

Additional context

Docker Compose file used to run Cube:

version: "3.9"

services:

  cube:
    network_mode: host
    image: cubejs/cube:v0.33.53
    environment:
      - CUBEJS_DB_TYPE=duckdb
      - CUBEJS_DB_DUCKDB_S3_ACCESS_KEY_ID=<REDACTED>
      - CUBEJS_DB_DUCKDB_S3_SECRET_ACCESS_KEY=<REDACTED>
      - CUBEJS_DB_DUCKDB_S3_ENDPOINT=s3.ca-central-1.amazonaws.com
      - CUBEJS_DB_DUCKDB_S3_REGION=ca-central-1
      - CUBEJS_API_SECRET=<REDACTED>
      - CUBEJS_CONCURRENCY=8
      - CUBEJS_DEV_MODE=true

    volumes:
      - .:/cube/conf
