pingcap/tidb

Refine the implementation of `HashJoinExecutor`

Open

#14466, created on January 14, 2020

4 comments, 0 reactions, 0 assignees. Language: Go (40,090 stars, 6,186 forks)
Labels: help wanted, sig/execution, type/enhancement

Description

Feature Request

Is your feature request related to a problem? Please describe:

The current implementation of TiDB's hash-join executor is rough. After profiling, we found several issues that can be resolved to get better performance:

  • It is not necessary to initialize and reset baseJoiner.chk when there is no condition. #14902
  • The method joinMatchedProbeSideRow2Chunk calls NewIterator4Slice to create many iterators, although these iterators can be reused. #15423
  • Some slices created in GetMatchRowsAndPtrs can be reused. #15423
  • We can try changing the return type of entryStore.get from structs to pointers, which may reduce memory copying (but it may cause GC problems, so we need to run some tests after the change).
  • We can try to build the hash table in parallel.

If you have any other ideas, you are welcome to discuss them with us in our SIG-Exec channel or in the comments on this issue.
