[XRay] Weird sled behavior with -O3 and -fno-inline #144681

Open
@Thyre

Description

Godbolt link: https://godbolt.org/z/KvoMf5GjY


Given this very short example:

#include <math.h>

inline int SQRT(int arg) { return sqrtf(static_cast<float>(arg)); }

template<typename T>
T foo( T a )
{
    return SQRT( (T)a );
}

int main( int argc, char** argv )
{
    return foo( argc );
}

With the flags -O3 -fno-inline -fxray-instrument -fxray-instruction-threshold=1, clang generates the following assembly with XRay sleds:

main:
        nop     word ptr [rax + rax + 512]
        nop     word ptr [rax + rax + 512]
        jmp     int foo<int>(int)

int foo<int>(int):
        nop     word ptr [rax + rax + 512]
        nop     word ptr [rax + rax + 512]
        jmp     SQRT(int)

SQRT(int):
        nop     word ptr [rax + rax + 512]
        [...]
        ret
        nop     word ptr cs:[rax + rax + 512]

Both main and int foo<int>(int) have proper sleds for XRay instrumentation. However, both the entry and the exit sled are placed before the actual function content (i.e. the jmp instruction), so the exit sled is passed before the tail call into the callee.

This causes an issue for tools that want to represent a proper tree structure of the functions being called, e.g. performance tools. One would see something like this:

- ./a.out
  - main
  - int foo<int>(int)
  - SQRT(int)

Instead of:

- ./a.out
  - main
    - int foo<int>(int)
      - SQRT(int)
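
To illustrate why the sled placement flattens the tree, here is a minimal sketch of a tool that derives nesting depth from enter/exit events. The `Event` type and all function names are hypothetical and not part of the XRay API. The event order in `observed_order()` is an assumption read off the assembly above (both sleds are passed before each `jmp`); `expected_order()` is the order that would produce the nested tree.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical event model for illustration only; not the XRay API.
enum class Kind { Enter, Exit };

struct Event {
    Kind kind;
    std::string function;
};

// Renders a call tree by tracking nesting depth: every Enter prints the
// function at the current depth and descends, every Exit ascends again.
std::string render_call_tree(const std::vector<Event>& events) {
    std::string out;
    std::size_t depth = 0;
    for (const Event& e : events) {
        if (e.kind == Kind::Enter) {
            out += std::string(2 * depth, ' ') + "- " + e.function + "\n";
            ++depth;
        } else if (depth > 0) {
            --depth;
        }
    }
    return out;
}

// Assumed order for the assembly above: in main and foo<int>, the exit
// sled is passed before the jmp, so each function "exits" before its
// callee is entered.
std::vector<Event> observed_order() {
    return {
        {Kind::Enter, "main"},              {Kind::Exit, "main"},
        {Kind::Enter, "int foo<int>(int)"}, {Kind::Exit, "int foo<int>(int)"},
        {Kind::Enter, "SQRT(int)"},         {Kind::Exit, "SQRT(int)"},
    };
}

// Order that would yield the nested tree: exits arrive only after the
// callee has returned.
std::vector<Event> expected_order() {
    return {
        {Kind::Enter, "main"},
        {Kind::Enter, "int foo<int>(int)"},
        {Kind::Enter, "SQRT(int)"},
        {Kind::Exit, "SQRT(int)"},
        {Kind::Exit, "int foo<int>(int)"},
        {Kind::Exit, "main"},
    };
}
```

With this sketch, `render_call_tree(observed_order())` puts all three functions at the same level, while `render_call_tree(expected_order())` nests them. A real tool would probably also need to handle unmatched exits, which this sketch simply ignores.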

In the case of LULESH with our current (in-development) XRay instrumentation adapter in Score-P, this even caused an inconsistent profile, probably for similar reasons.

Given that this is a very constructed case, I don't see this as a huge issue. However, I think this may be a limitation that should be documented somewhere. I can't immediately think of a solution, and I think most people will not encounter this. Why would someone prevent inlining with -O3 in the first place? (Well, me, because I wanted to test the overhead when filtering functions.)
