JuliaGPU/CUDA.jl
Summing a vector is faster than summing a multidimensional array
Open
#2764 opened on April 28, 2025
cuda array, good first issue, performance
Description
I am not sure whether this is expected behavior, but `sum(vec(A))` can be much faster than `sum(A)` when `A` is a multidimensional array. See https://discourse.julialang.org/t/summing-a-vector-is-faster-than-summing-a-multi-dimensional-array-of-the-same-length-using-cuda/116711 for an MWE. Although that post is about a year old, I can reproduce matching results with CUDA v5.7.3.
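
For readers who do not want to follow the link, the comparison can be sketched roughly as below. This is not the MWE from the Discourse post: the array shape and element type are arbitrary assumptions, and BenchmarkTools is assumed to be installed alongside CUDA.jl. No timings are quoted here because they depend on the GPU and package versions.

```julia
using CUDA            # provides CuArray and CUDA.@sync
using BenchmarkTools  # provides @btime (assumed to be installed)

# Assumed test array: 100 x 100 x 100 Float32 values on the GPU.
# The shape and element type are illustrative, not taken from the linked post.
A = CUDA.rand(Float32, 100, 100, 100)

# Full reduction over the 3-dimensional array.
# CUDA.@sync blocks until the GPU work has finished, so the timing covers the whole reduction.
@btime CUDA.@sync sum($A)

# Same data viewed as a vector; vec reshapes without copying.
@btime CUDA.@sync sum(vec($A))
```

If the second timing is consistently lower on a given setup, that matches the behavior reported above.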