dotnet/runtime

Subtract-variant of `VectorXXX.MultiplyAddEstimate` does not light up for negated constant local

Open

#121,301 created on November 3, 2025

(4 comments) (0 reactions) (0 assignees) C# (17,886 stars) (5,445 forks) batch import
area-CodeGen-coreclr, help wanted, tenet-performance


Description

public static Vector128<double> BadCode(Vector128<double> a) 
{
    Vector128<double> half = Vector128.Create(0.5);
    return Vector128.MultiplyAddEstimate(a, Vector128.Create(2.0), -half) * half;
}

public static Vector128<double> GoodCode(Vector128<double> a, Vector128<double> half) 
{
    // Fma.MultiplyAdd lights up too!
    return Vector128.MultiplyAddEstimate(a, Vector128.Create(2.0), -half) * half;
}
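
The two methods are expected to return identical values; only the generated code differs. A minimal sketch of a harness that checks this, assuming .NET 9 or later (where `Vector128.MultiplyAddEstimate` is available). The class name `FmaRepro`, the `NoInlining` attributes, the `ResultsMatch` helper, and the input values are hypothetical additions for illustration and are not part of the original report:

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.Intrinsics;

public static class FmaRepro
{
    // NoInlining (an assumption of this sketch) keeps each method as a
    // separate compilation unit so its disassembly can be inspected alone.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static Vector128<double> BadCode(Vector128<double> a)
    {
        // 'half' is a constant local; its negation ends up as constant data.
        Vector128<double> half = Vector128.Create(0.5);
        return Vector128.MultiplyAddEstimate(a, Vector128.Create(2.0), -half) * half;
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static Vector128<double> GoodCode(Vector128<double> a, Vector128<double> half)
    {
        // 'half' is a parameter here, so the negation stays visible to the JIT.
        return Vector128.MultiplyAddEstimate(a, Vector128.Create(2.0), -half) * half;
    }

    // Hypothetical helper: checks that both variants compute (a * 2 - 0.5) * 0.5.
    // The inputs are chosen so every intermediate value is exactly representable,
    // which makes the comparison independent of whether the multiply-add is fused.
    public static bool ResultsMatch()
    {
        Vector128<double> a = Vector128.Create(1.0, 3.0);
        Vector128<double> bad = BadCode(a);
        Vector128<double> good = GoodCode(a, Vector128.Create(0.5));
        return bad == good && bad == Vector128.Create(0.75, 2.75);
    }
}
```

Calling `FmaRepro.ResultsMatch()` should return `true` on either code path. To compare the generated code itself, one option that should work on recent runtimes is setting the `DOTNET_JitDisasm` environment variable to the method names before running.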

Regression?

No

Data

// coreclr trunk-20251102+e8812e7419db9137f20b990786a53ed71e27e11e

C:BadCode(System.Runtime.Intrinsics.Vector128`1[double]):System.Runtime.Intrinsics.Vector128`1[double] (FullOpts):
       vmovddup xmm0, qword ptr [reloc @RWD00]
       vmovddup xmm1, qword ptr [reloc @RWD08]
       vmovaps  xmm2, xmmword ptr [rsp+0x08]
       vfmadd213pd xmm2, xmm1, xmmword ptr [reloc @RWD16]
       vmulpd   xmm0, xmm2, xmm0
       vmovups  xmmword ptr [rdi], xmm0
       mov      rax, rdi
       ret      
RWD00  	dq	3FE0000000000000h
RWD08  	dq	4000000000000000h
RWD16  	dq	BFE0000000000000h, BFE0000000000000h

C:GoodCode(System.Runtime.Intrinsics.Vector128`1[double],System.Runtime.Intrinsics.Vector128`1[double]):System.Runtime.Intrinsics.Vector128`1[double] (FullOpts):
       vmovups  xmm0, xmmword ptr [rsp+0x18]
       vmovaps  xmm1, xmmword ptr [rsp+0x08]
       vfmsub132pd xmm1, xmm0, xmmword ptr [reloc @RWD00]
       vmulpd   xmm0, xmm1, xmm0
       vmovups  xmmword ptr [rdi], xmm0
       mov      rax, rdi
       ret      
RWD00  	dq	4000000000000000h, 4000000000000000h
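
The `RWD` entries in the listings above are raw IEEE 754 bit patterns. A small sketch for decoding them, assuming .NET 6 or later for `BitConverter.UInt64BitsToDouble`; the `ConstantDecoder` class and `Decode` method are hypothetical names used only for this illustration:

```csharp
using System;

public static class ConstantDecoder
{
    // Reinterprets a 64-bit pattern from the disassembly as a double.
    public static double Decode(ulong bits) => BitConverter.UInt64BitsToDouble(bits);
}
```

With this helper, `Decode(0x3FE0000000000000UL)` gives `0.5`, `Decode(0x4000000000000000UL)` gives `2.0`, and `Decode(0xBFE0000000000000UL)` gives `-0.5`. So `RWD16` in `BadCode` is the already-negated `half`, which is then fed to `vfmadd213pd`, whereas `GoodCode` keeps `half` positive and uses `vfmsub132pd`.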

Analysis

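Folding the negated constant local into constant data seems to take priority over recognizing the negated FMA: in `BadCode`, `-half` is materialized as a `-0.5` constant (`RWD16`) and the JIT emits `vfmadd213pd`, whereas `GoodCode` emits `vfmsub132pd`.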
