rapidsai/cudf

[FEA] Expand ORC and Parquet benchmarks to cover different stripe/rowgroup sizes

Open

#10100 opened on Jan 21, 2022

(5 comments) (0 reactions) (0 assignees) C++ (6,000 stars) (735 forks)
Labels: Performance, cuIO, feature request, good first issue, libcudf, tests

Description

Add a set of benchmarks with varying stripe/rowgroup sizes to each affected component:

  • ORC reader
  • ORC writer
  • Parquet reader
  • Parquet writer

Use the new benchmarks to evaluate the effects of these options and potentially determine the optimal settings.
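
As a rough illustration of the sweep described above, here is a minimal Python sketch that enumerates every component and size combination, times a caller-supplied function for each, and picks the fastest size per component. It does not use the libcudf API or the existing benchmark suite: the component names, the row-count values, and the helpers `benchmark_cases`, `run_sweep`, and `fastest_size` are all hypothetical, and the issue does not specify which sizes to test.

```python
import itertools
import time

# Hypothetical labels for the four components listed in the issue.
COMPONENTS = ["orc_reader", "orc_writer", "parquet_reader", "parquet_writer"]

# Hypothetical stripe/rowgroup sizes in rows; the issue names no values.
SIZES_ROWS = [10_000, 100_000, 1_000_000]


def benchmark_cases(components=COMPONENTS, sizes=SIZES_ROWS):
    """Return every (component, size_rows) pair in the sweep."""
    return list(itertools.product(components, sizes))


def run_sweep(run_case, cases):
    """Call run_case(component, size_rows) once per case.

    Returns a dict mapping (component, size_rows) to elapsed seconds.
    """
    timings = {}
    for component, size_rows in cases:
        start = time.perf_counter()
        run_case(component, size_rows)
        timings[(component, size_rows)] = time.perf_counter() - start
    return timings


def fastest_size(timings, component):
    """Return the size with the lowest recorded time for one component."""
    candidates = {
        size: seconds
        for (comp, size), seconds in timings.items()
        if comp == component
    }
    return min(candidates, key=candidates.get)
```

In a real benchmark, `run_case` would wrap the actual read or write call with the stripe or rowgroup size applied, and each case would be repeated several times rather than timed once.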
