rapidsai/cudf

Use grid stride in CSV reader kernels

Open

#14066 opened on Sep 8, 2023

Labels: Performance, cuIO, good first issue

Description

Currently, the CSV reader parses data using one thread per row, so the number of threads launched grows with the number of rows in the file. Using a grid-stride loop would allow the kernels to launch with a preset number of blocks, even for large inputs.

This applies to both the parser kernels and the data inference kernels.
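For reference, here is a minimal sketch of the grid-stride pattern applied to per-row work. It is not the cuDF code: the kernel `count_fields`, the host helper `count_fields_per_row`, the `row_offsets` layout, and the `max_blocks` / `block_size` parameters are all hypothetical names chosen for illustration, and the "parsing" is reduced to counting comma-separated fields in each row.

```cuda
#include <cuda_runtime.h>

#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical kernel (not cuDF's parser): counts the fields in each row.
// row_offsets has num_rows + 1 entries; row r spans
// [row_offsets[r], row_offsets[r + 1]).
__global__ void count_fields(char const* data, int const* row_offsets,
                             int num_rows, char delimiter, int* field_counts)
{
  int const stride = gridDim.x * blockDim.x;
  // Grid-stride loop: a thread handles rows tid, tid + stride, tid + 2 * stride, ...
  // so a fixed-size grid covers any number of rows.
  for (int row = blockIdx.x * blockDim.x + threadIdx.x; row < num_rows; row += stride) {
    int count = 1;
    for (int i = row_offsets[row]; i < row_offsets[row + 1]; ++i) {
      if (data[i] == delimiter) { ++count; }
    }
    field_counts[row] = count;
  }
}

// Hypothetical host helper: splits the input on '\n' and launches the kernel
// with at most max_blocks blocks. CUDA error checking is omitted for brevity.
std::vector<int> count_fields_per_row(std::string const& csv,
                                      int max_blocks = 4, int block_size = 128)
{
  assert(max_blocks > 0 && block_size > 0);

  // Build row offsets on the host; a final row without '\n' still counts.
  std::vector<int> offsets{0};
  for (std::size_t i = 0; i < csv.size(); ++i) {
    if (csv[i] == '\n') { offsets.push_back(static_cast<int>(i + 1)); }
  }
  if (offsets.back() != static_cast<int>(csv.size())) {
    offsets.push_back(static_cast<int>(csv.size()));
  }

  int const num_rows = static_cast<int>(offsets.size()) - 1;
  std::vector<int> counts(num_rows);
  if (num_rows == 0) { return counts; }  // nothing to launch

  char* d_data   = nullptr;
  int* d_offsets = nullptr;
  int* d_counts  = nullptr;
  cudaMalloc(&d_data, csv.size());
  cudaMalloc(&d_offsets, offsets.size() * sizeof(int));
  cudaMalloc(&d_counts, num_rows * sizeof(int));
  cudaMemcpy(d_data, csv.data(), csv.size(), cudaMemcpyHostToDevice);
  cudaMemcpy(d_offsets, offsets.data(), offsets.size() * sizeof(int),
             cudaMemcpyHostToDevice);

  // The grid size is capped at max_blocks instead of growing with num_rows.
  int const blocks_needed = (num_rows + block_size - 1) / block_size;
  int const num_blocks    = std::min(blocks_needed, max_blocks);
  count_fields<<<num_blocks, block_size>>>(d_data, d_offsets, num_rows, ',', d_counts);

  // This blocking copy runs after the kernel on the default stream.
  cudaMemcpy(counts.data(), d_counts, num_rows * sizeof(int), cudaMemcpyDeviceToHost);
  cudaFree(d_data);
  cudaFree(d_offsets);
  cudaFree(d_counts);
  return counts;
}
```

Calling `count_fields_per_row("a,b\nc\nd,e,f\ng,h\ni", 1, 2)` launches a single block of two threads for five rows, so each thread has to loop over several rows; the result is `{2, 1, 3, 2, 1}`. In a real change, the block cap would more likely be derived from the device (for example, a multiple of the SM count) than hard-coded as it is here.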
