pingcap/tidb

Improve the performance of `Sort` by running it in parallel

Open

#14,417 opened on January 9, 2020

Labels: help wanted, sig/execution, type/enhancement

Description

Feature Request

Is your feature request related to a problem? Please describe:

Currently, the `Sort` executor runs serially. Running it in parallel could speed it up.

Describe the feature you'd like:

Consider using a parallel framework to sort each partition in parallel, then using a merge sort to produce the final result. We can refer to the proposal at https://github.com/pingcap/tidb/pull/14238#issuecomment-569880893 and reuse it (after that PR is merged).
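The idea above can be sketched roughly as follows. This is only an illustration over plain `[]int` values, not TiDB code: `parallelSort`, `mergeSorted`, `mergeHeap`, and `cursor` are hypothetical names, and the sketch assumes the input is split into contiguous chunks, whereas the real executor would presumably receive its partitions from the Shuffle operator and sort rows by the `ORDER BY` keys.

```go
package main

import (
	"container/heap"
	"fmt"
	"sort"
	"sync"
)

// cursor points at the next unread element of one sorted partition.
type cursor struct{ part, idx int }

// mergeHeap is a min-heap of cursors ordered by the value they point at.
type mergeHeap struct {
	parts [][]int
	items []cursor
}

func (h *mergeHeap) Len() int { return len(h.items) }
func (h *mergeHeap) Less(i, j int) bool {
	a, b := h.items[i], h.items[j]
	return h.parts[a.part][a.idx] < h.parts[b.part][b.idx]
}
func (h *mergeHeap) Swap(i, j int)      { h.items[i], h.items[j] = h.items[j], h.items[i] }
func (h *mergeHeap) Push(x interface{}) { h.items = append(h.items, x.(cursor)) }
func (h *mergeHeap) Pop() interface{} {
	n := len(h.items)
	x := h.items[n-1]
	h.items = h.items[:n-1]
	return x
}

// mergeSorted performs a k-way merge of already-sorted partitions.
func mergeSorted(parts [][]int) []int {
	h := &mergeHeap{parts: parts}
	total := 0
	for i, p := range parts {
		total += len(p)
		if len(p) > 0 {
			h.items = append(h.items, cursor{part: i})
		}
	}
	heap.Init(h)
	out := make([]int, 0, total)
	for h.Len() > 0 {
		c := h.items[0] // smallest remaining element across all partitions
		out = append(out, parts[c.part][c.idx])
		if c.idx+1 < len(parts[c.part]) {
			h.items[0].idx++ // advance within the same partition
			heap.Fix(h, 0)
		} else {
			heap.Pop(h) // this partition is exhausted
		}
	}
	return out
}

// parallelSort (hypothetical helper) sorts each partition in its own
// goroutine, then merges the sorted partitions into one result.
func parallelSort(rows []int, partitions int) []int {
	if partitions < 1 {
		partitions = 1
	}
	chunkSize := (len(rows) + partitions - 1) / partitions
	var parts [][]int
	for start := 0; start < len(rows); start += chunkSize {
		end := start + chunkSize
		if end > len(rows) {
			end = len(rows)
		}
		// Copy the chunk so the caller's slice is left untouched.
		parts = append(parts, append([]int(nil), rows[start:end]...))
	}
	var wg sync.WaitGroup
	for _, p := range parts {
		wg.Add(1)
		go func(p []int) {
			defer wg.Done()
			sort.Ints(p)
		}(p)
	}
	wg.Wait()
	return mergeSorted(parts)
}

func main() {
	fmt.Println(parallelSort([]int{5, 2, 9, 1, 7, 3}, 3)) // prints [1 2 3 5 7 9]
}
```

The merge step here is single-threaded and uses a heap, so it costs O(n log k) for k partitions. Whether the final merge should also be parallelized is left open in this sketch.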

Describe alternatives you've considered:

Task:

  1. Implement parallel sort.
  • Implement the merge sort algorithm on top of the Shuffle operator.
  • Implement the plan builder for parallel sort.
  2. Cost model and `EXPLAIN ANALYZE` information.
  • Change the cost model of sort when it runs in parallel.
  • Change the `EXPLAIN ANALYZE` result for parallel sort.

Teachability, Documentation, Adoption, Migration Strategy:
