pingcap/tidb

Improve the performance of `Sort` by parallel

Open

#14417 opened on Jan 9, 2020

Labels: help wanted, sig/execution, type/enhancement

Description

Feature Request

Is your feature request related to a problem? Please describe:

Currently the Sort executor runs serially. We can run it in parallel to speed it up.

Describe the feature you'd like:

Consider using a parallel framework to sort each partition in parallel, and then use a merge sort to generate the final result. We can refer to this proposal https://github.com/pingcap/tidb/pull/14238#issuecomment-569880893 and reuse it (after that PR is merged).

Describe alternatives you've considered:

Task:

  1. Implement parallel sort.
    • Implement the merge sort algorithm above the Shuffle operator.
    • Implement the plan builder for parallel sort.
  2. Cost model and explain analyze information.
    • Change the cost model of sort when it runs in parallel.
    • Change the explain analyze result for parallel sort.
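For the cost model task, a rough back-of-the-envelope comparison may help frame the discussion. The sketch below is not TiDB's cost model. It only assumes that sorting n rows costs about n·log2(n) comparisons, that p workers each sort n/p rows concurrently, and that a serial k-way merge adds about n·log2(p) comparisons. `serialSortCost` and `parallelSortCost` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"math"
)

// serialSortCost estimates the comparisons needed to sort n rows
// on a single worker: n * log2(n). (Assumed model, not TiDB's.)
func serialSortCost(n float64) float64 {
	if n <= 1 {
		return 0
	}
	return n * math.Log2(n)
}

// parallelSortCost estimates comparisons on the critical path with p workers:
// each worker sorts n/p rows concurrently, then one serial k-way merge
// costs roughly n * log2(p).
func parallelSortCost(n, p float64) float64 {
	if p <= 1 {
		return serialSortCost(n)
	}
	return serialSortCost(n/p) + n*math.Log2(p)
}

func main() {
	n := float64(1 << 20)
	for _, p := range []float64{1, 2, 4, 8} {
		fmt.Printf("workers=%.0f estimated cost=%.0f\n", p, parallelSortCost(n, p))
	}
}
```

Under these assumptions the merge term grows with the number of workers, so the estimate stops improving once the serial merge dominates. A real cost model would also need to account for data transfer between workers and memory usage.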

Teachability, Documentation, Adoption, Migration Strategy:

Contributor guide
