JuliaGPU/CUDA.jl
Summing a vector is faster than summing a multidimensional array
Open
#2764 opened on April 28, 2025
cuda array, good first issue, performance
Description
I am not sure whether this is expected behavior, but sum(vec(A)) can be much faster than sum(A) when A is a multidimensional array. Please see https://discourse.julialang.org/t/summing-a-vector-is-faster-than-summing-a-multi-dimensional-array-of-the-same-length-using-cuda/116711 for an MWE. Although that post is about a year old, I can reproduce matching results with CUDA v5.7.3.
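For readers who do not want to follow the link, a comparison along these lines might look like the sketch below. It is not the MWE from the linked post: the array shape and element type are assumptions chosen for illustration, and it assumes BenchmarkTools is installed alongside CUDA.jl. No timings are shown, since they depend on the GPU and package versions.

```julia
using CUDA, BenchmarkTools

# Hypothetical shape and element type; the linked MWE may use different ones.
A = CUDA.rand(Float32, 256, 256, 64)

# Both calls reduce over the same elements, so the results should agree
# up to floating-point rounding.
@assert isapprox(sum(A), sum(vec(A)); rtol = 1f-3)

# CUDA.@sync waits for the GPU to finish, so each timing covers the full reduction.
@btime CUDA.@sync sum($A)        # reduction over the 3-dimensional array
@btime CUDA.@sync sum(vec($A))   # same data viewed as a vector
```

vec(A) reshapes the array without copying its data, so any difference between the two timings should come from the reduction itself and not from an extra allocation.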