llvm/llvm-project

OCUDU benchmarks notably slower on clang22 than gcc16 on avx512 target

Open

#198106 opened on May 16, 2026

Labels: good first issue, performance

Description

Reported here: https://www.phoronix.com/review/gcc-16-vs-clang-22/2

On a Threadripper 9980X (-march=znver5), the OCUDU benchmarks run almost 50% faster when built with gcc16 than with clang22: https://openbenchmarking.org/innhold/92454b6c98c3ad8beb08a5084a278386c6b06a46

This might be a good first issue for somebody with access to both compilers and an interest in codegen profiling:

1 - build and run the benchmarks with gcc16 and clang22 - preferably on an avx512 machine, but I suspect older machines will show a performance difference as well.

2 - profile to identify any hot code sections on either build and compare the builds and code quality (e.g. missing compiler flags/attributes? vectorisation width? performant use of particular vector instructions? use of slow gather/scatter instructions?)

3 - identify any missing llvm optimisations and raise suitable issue(s)
