JuliaGPU/CUDA.jl

Inconsistency between CUDA.jl and Base for sum!

Open

#2506 opened on Sep 25, 2024

Labels: bug, cuda array, good first issue

Description

Hi, thanks again for putting CUDA.jl together!

I found that the value returned by sum! can differ between Array and CuArray. For Array, the result has the same type and size as the destination (first) argument. For CuArray, the result retains the reduced dimension as a trailing singleton, so it has one more dimension than the destination.

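using CUDA

# CPU: reduce the third dimension of X into Y
X = rand(Float32, (50, 50, 5));
Y = similar(X, (50, 50));
res = sum!(Y, X);
size(res) # (50, 50), same as size(Y)

# GPU: the same reduction on CuArrays
X_d = CuArray(X);
Y_d = CuArray(Y);
res_d = sum!(Y_d, X_d);
size(res_d) # (50, 50, 1), singleton dimension retained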

I would expect res_d to have the same type and size as Y_d in this case.
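Until the two behaviors are aligned, one possible stopgap is to reshape the returned array to the destination's size. The sketch below is only an illustration: `sum_into!` is a hypothetical helper name, not part of Base or CUDA.jl, and it assumes the returned array has the same number of elements as the destination (true for the sizes reported above). The usage lines run on the CPU so that they work without a GPU.

```julia
# Hypothetical helper (not part of Base or CUDA.jl): call sum! and make sure
# the returned array has the same size as the destination.
function sum_into!(dest::AbstractArray, src::AbstractArray)
    res = sum!(dest, src)
    # If the backend kept a trailing singleton dimension, reshape it away.
    # Assumes length(res) == length(dest).
    return size(res) == size(dest) ? res : reshape(res, size(dest))
end

# CPU usage; the same call is intended to accept CuArray arguments as well.
X = rand(Float32, 4, 3, 2)
Y = zeros(Float32, 4, 3)
res = sum_into!(Y, X)
size(res) == size(Y)
```

Because `reshape` does not copy, the reshaped result should still refer to the same data as the array returned by `sum!`.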

Version info

Julia Version 1.10.5
Commit 6f3fdf7b36 (2024-08-27 14:19 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 16 × 11th Gen Intel(R) Core(TM) i9-11900K @ 3.50GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, rocketlake)
Threads: 1 default, 0 interactive, 1 GC (on 16 virtual cores)

CUDA runtime 12.6, artifact installation
CUDA driver 12.4
NVIDIA driver 551.61.0

CUDA libraries:
- CUBLAS: 12.6.1
- CURAND: 10.3.7
- CUFFT: 11.2.6
- CUSOLVER: 11.6.4
- CUSPARSE: 12.5.3
- CUPTI: 2024.3.1 (API 24.0.0)
- NVML: 12.0.0+551.61

Julia packages:
- CUDA: 5.5.1
- CUDA_Driver_jll: 0.10.2+0
- CUDA_Runtime_jll: 0.15.2+0

Toolchain:
- Julia: 1.10.5
- LLVM: 15.0.7

1 device:
  0: NVIDIA RTX A5000 (sm_86, 21.757 GiB / 23.988 GiB available)
