pingcap/tidb
Column: make a big slice then divide into sub slices instead of making multi slices in `newColumn`
Open
#32,096 opened on Feb 3, 2022
Labels: help wanted, type/enhancement, type/performance
Description
Enhancement
In some of our benchmarks, we found that the `makeslice` calls in `newColumn` consume some CPU time.

In `newColumn`, we can see that we make a slice three times. I think we can make one big slice instead and divide it into sub-slices.
I wrote a simple benchmark:
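    package column_test // hypothetical package name, needed only to compile

    import (
        "testing"
        "unsafe" // unsafe.Slice requires Go 1.17 or later
    )

    // BenchmarkMakeslice mirrors the current approach: three separate allocations.
    func BenchmarkMakeslice(b *testing.B) {
        cap := 2
        elemLen := 8
        n := 0
        for i := 0; i < b.N; i++ {
            elemBuf := make([]byte, elemLen)
            data := make([]byte, 0, cap*elemLen)
            nullBitmap := make([]byte, 0, (cap+7)>>3)
            n = len(elemBuf) + len(data) + len(nullBitmap) + n
        }
    }

    // BenchmarkMakeslice2 makes one big allocation and divides it into three sub-slices.
    func BenchmarkMakeslice2(b *testing.B) {
        cap := 2
        elemLen := 8
        n := 0
        for i := 0; i < b.N; i++ {
            index := cap * elemLen  // bytes reserved for data
            cap2 := (cap + 7) >> 3  // bytes reserved for the null bitmap
            buf := make([]byte, elemLen+index+cap2)
            elemBuf := unsafe.Slice(&buf[0], elemLen)
            data := unsafe.Slice(&buf[elemLen], index)
            nullBitmap := unsafe.Slice(&buf[index+elemLen], cap2)
            // Reset lengths to zero; each sub-slice keeps its own capacity.
            elemBuf = elemBuf[0:0]
            data = data[0:0]
            nullBitmap = nullBitmap[0:0]
            n = len(elemBuf) + len(data) + len(nullBitmap) + n
        }
    }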
The result is promising:
    cpu: Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
    BenchmarkMakeslice-8     15793832    84.81 ns/op
    BenchmarkMakeslice2-8    31883019    37.09 ns/op
Because `newColumn` is called frequently, I think we can benefit from this change.
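To make the idea concrete, here is a minimal sketch of how a constructor could carve the three buffers out of one allocation. It uses full slice expressions (`buf[low:high:max]`) rather than `unsafe.Slice`, so that each sub-slice has its capacity capped and an `append` that outgrows one region reallocates instead of overwriting its neighbor. The `column` type and `newColumnOneAlloc` are hypothetical stand-ins, not the real `Column` or `newColumn` from TiDB, and the field set is assumed from the benchmark above.

```go
package main

import "fmt"

// column is a hypothetical, simplified stand-in for the real Column type.
type column struct {
	elemBuf    []byte
	data       []byte
	nullBitmap []byte
}

// newColumnOneAlloc (hypothetical name) makes one backing array and divides
// it into three sub-slices. The third index of each slice expression caps the
// capacity, so the regions cannot grow into each other.
func newColumnOneAlloc(elemLen, capacity int) *column {
	dataLen := capacity * elemLen
	bitmapLen := (capacity + 7) >> 3
	buf := make([]byte, elemLen+dataLen+bitmapLen)

	dataStart := elemLen
	bitmapStart := elemLen + dataLen
	return &column{
		elemBuf:    buf[0:elemLen:elemLen],
		data:       buf[dataStart:dataStart:bitmapStart],
		nullBitmap: buf[bitmapStart:bitmapStart : bitmapStart+bitmapLen],
	}
}

func main() {
	c := newColumnOneAlloc(8, 2)
	fmt.Println(len(c.elemBuf), cap(c.elemBuf))       // 8 8
	fmt.Println(len(c.data), cap(c.data))             // 0 16
	fmt.Println(len(c.nullBitmap), cap(c.nullBitmap)) // 0 1
}
```

One trade-off to keep in mind: because the three slices share a backing array, the whole array stays alive as long as any one of them is referenced, even after another has been reallocated by `append`.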
BTW, an even better approach would be to use a pool for `Column`. It seems we tried this before and it failed, so maybe we should revisit it.
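For reference, a pooled variant could look roughly like the sketch below, built on the standard library's `sync.Pool`. Everything here is an assumption for illustration: `pooledColumn`, `getColumn`, `putColumn`, and `resize` are hypothetical names, and the sketch does not address whatever caused the earlier pooling attempt to fail.

```go
package main

import (
	"fmt"
	"sync"
)

// pooledColumn is a hypothetical, simplified stand-in for the real Column type.
type pooledColumn struct {
	elemBuf    []byte
	data       []byte
	nullBitmap []byte
}

// columnPool hands out recycled columns; New runs only when the pool is empty.
var columnPool = sync.Pool{
	New: func() interface{} { return &pooledColumn{} },
}

// resize returns a slice of length n, reusing b's backing array when it is
// large enough. Reused bytes are NOT zeroed.
func resize(b []byte, n int) []byte {
	if cap(b) >= n {
		return b[:n]
	}
	return make([]byte, n)
}

// getColumn fetches a column from the pool and sizes its buffers, allocating
// only when the recycled buffers are too small.
func getColumn(elemLen, capacity int) *pooledColumn {
	c := columnPool.Get().(*pooledColumn)
	c.elemBuf = resize(c.elemBuf, elemLen)
	c.data = resize(c.data, capacity*elemLen)[:0]
	c.nullBitmap = resize(c.nullBitmap, (capacity+7)>>3)[:0]
	return c
}

// putColumn returns a column to the pool. The caller must not use c afterwards.
func putColumn(c *pooledColumn) {
	columnPool.Put(c)
}

func main() {
	c := getColumn(8, 2)
	fmt.Println(len(c.elemBuf), len(c.data), len(c.nullBitmap)) // 8 0 0
	putColumn(c)
}
```

The hard part of pooling is ownership rather than allocation: a column can only be returned once nothing else still holds a reference to it or to any of its slices.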