pingcap/tidb

br: open format file generation by backup

Open

#58611 opened on Dec 30, 2024

Labels: help wanted, type/feature-request

Description

Feature Request

Is your feature request related to a problem? Please describe:

Currently, TiDB has no native tool for exporting TiKV data to open-format files such as Parquet. Instead, users must rely on a client such as TiSpark to extract the data and convert the format. This long tech stack performs poorly.

Describe the feature you'd like:

TiDB could provide a native way to dump snapshot data and incremental data to open-format files. A preferred approach is to let backup generate open-format files directly; in other words, backup would support generating either log/SST files or Parquet files. A simple prototype is available at https://github.com/BornChanger/sampleParquet.

Describe alternatives you've considered:

Teachability, Documentation, Adoption, Migration Strategy:
