Commit e2f97a7

[LIBCLC][BINDLESS][CUDA] always inline redirection functs (#18699)
These functions at most do some casting and have effectively zero register overhead at the default opt level, so there should be no usage circumstance in which always inlining them has a downside.

This brings the nvptx libclc image backend in line with the AMD one, which requires no such changes: the AMD libclc backend already does the same thing via consistent usage of the `_CLC_DECL` macro for all functions. Whilst not immediately obvious to the libclc programmer, the `_CLC_DECL` macro applies `__attribute__((always_inline))`.

There are a few cases with low register usage to which I've also added the `inline` hint, probably being overly cautious.

---------

Signed-off-by: JackAKirk <[email protected]>
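As a rough illustration of the pattern described above, the sketch below shows thin redirection functions that only cast or reinterpret a value and are marked always-inline through a macro. Everything here is hypothetical and not taken from libclc: `SKETCH_ALWAYS_INLINE` merely stands in for the idea behind `_CLC_DECL` (whose real definition may carry other attributes), and `backend_fetch_bits`, `fetch_as_float`, and `fetch_as_int` are invented names. It assumes a GCC- or Clang-compatible compiler, since `__attribute__((always_inline))` is a compiler extension.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the idea behind _CLC_DECL: force inlining of
 * every function declared with it. Not the real libclc definition. */
#define SKETCH_ALWAYS_INLINE static inline __attribute__((always_inline))

/* Hypothetical "backend" routine standing in for a real image fetch:
 * returns the raw 32-bit word stored at index x. */
static uint32_t backend_fetch_bits(const uint32_t *image, int x) {
    return image[x];
}

/* Redirection function: reinterprets the raw bits as a float.
 * It does no real work, so inlining it removes only call overhead. */
SKETCH_ALWAYS_INLINE float fetch_as_float(const uint32_t *image, int x) {
    uint32_t bits = backend_fetch_bits(image, x);
    float out;
    memcpy(&out, &bits, sizeof out); /* well-defined bit reinterpretation */
    return out;
}

/* Redirection function: plain integer cast of the raw word. */
SKETCH_ALWAYS_INLINE int32_t fetch_as_int(const uint32_t *image, int x) {
    return (int32_t)backend_fetch_bits(image, x);
}
```

Because such wrappers contain only a cast or a bit copy, forcing them inline should leave the caller with just the underlying fetch, which is the reasoning the commit message gives for applying the attribute uniformly.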
1 parent 2b9a534 commit e2f97a7

1 file changed: +1225 -661 lines changed
  • libclc/libspirv/lib/ptx-nvidiacl/images
