[clang][X86] Wrong result for __builtin_elementwise_fma on _Float16

Godbolt: https://godbolt.org/z/Ydj17K17b

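Clang emulates half-precision FMA with a single-precision FMA, which is wrong because single precision does not carry enough bits: the result gets rounded twice, first to single precision and then to half precision.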

Example (round-to-nearest, ties-to-even): 0x1.400p+8 * 0x1.008p+7 + 0x1.000p-24
Precise result: 0x1.40a0000002p+15
Half-precision FMA: 0x1.40cp+15
Single-precision FMA: 0x1.40a000p+15
(clang) Single-precision FMA -> half-precision: 0x1.408p+15
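
The difference between the two results above is a double-rounding effect, and it can be reproduced without `_Float16` support. The following is a minimal sketch, not what clang does internally: `round_to_binary16`, `round_once` and `round_twice` are hypothetical helper names, and the helper only models the 11-bit significand of half precision (it ignores the half-precision exponent range, so subnormals, overflow and NaN are not handled).

```c
#include <stdint.h>
#include <string.h>

/* Round a double to an 11-bit significand (half precision) with
 * round-to-nearest, ties-to-even. Illustrative only: the half-precision
 * exponent range is ignored, so subnormals, overflow and NaN are not modelled. */
static double round_to_binary16(double x)
{
    const int drop = 52 - 10;   /* fraction bits that half precision cannot hold */
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    uint64_t lsb = (bits >> drop) & 1u;               /* last bit that is kept */
    bits += (UINT64_C(1) << (drop - 1)) - 1u + lsb;   /* ties go to the even neighbour */
    bits &= ~((UINT64_C(1) << drop) - 1u);            /* clear the dropped bits */
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* One rounding: the value a correctly rounded half-precision FMA returns. */
static double round_once(double exact)
{
    return round_to_binary16(exact);
}

/* Two roundings: to single precision first, then to half precision. */
static double round_twice(double exact)
{
    float single = (float)exact;
    return round_to_binary16((double)single);
}
```

For the precise result 0x1.40a0000002p+15, which fits in a double exactly, `round_once` returns 0x1.40cp+15 and `round_twice` returns 0x1.408p+15. The first rounding discards the low bit contributed by 0x1.000p-24, which turns the value into an exact tie that then rounds to even.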

Another example: 0x1.eb8p-12 * 0x1.9p-11 - 0x1p-11

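To produce the correct result, a single-precision multiplication followed by a double-precision addition seems to be enough. The multiplication is exact in single precision, since the product of two 11-bit significands needs at most 22 bits.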
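
A rough sketch of that idea, under stated assumptions: the operands are half-precision values held in `float`, the function name `fma_f16_emulated` and the helper `round_to_binary16` are hypothetical, and the final rounding is modelled in software (11-bit significand only, no subnormal, overflow or NaN handling) rather than with a real `_Float16` conversion. It is not a proposed patch, only an illustration of the sequence of operations.

```c
#include <stdint.h>
#include <string.h>

/* Round a double to an 11-bit significand (half precision) with
 * round-to-nearest, ties-to-even. Illustrative only: the half-precision
 * exponent range is ignored. */
static double round_to_binary16(double x)
{
    const int drop = 52 - 10;   /* fraction bits that half precision cannot hold */
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    uint64_t lsb = (bits >> drop) & 1u;               /* last bit that is kept */
    bits += (UINT64_C(1) << (drop - 1)) - 1u + lsb;   /* ties go to the even neighbour */
    bits &= ~((UINT64_C(1) << drop) - 1u);            /* clear the dropped bits */
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Emulated half-precision FMA: a, b and c are assumed to hold values that
 * are exactly representable in half precision. */
static float fma_f16_emulated(float a, float b, float c)
{
    float product = a * b;                      /* exact: at most 22 significand bits */
    double sum = (double)product + (double)c;   /* addition carried out in double */
    return (float)round_to_binary16(sum);       /* final rounding to half precision */
}
```

For the first example, `fma_f16_emulated(0x1.400p+8f, 0x1.008p+7f, 0x1.000p-24f)` returns 0x1.40cp+15, the same value as the half-precision FMA result listed above.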

[clang][X86] Wrong result for __builtin_elementwise_fma on _Float16 #128450

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[clang][X86] Wrong result for __builtin_elementwise_fma on _Float16 #128450

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions