This PR creates the bindings for `FusionExecutorCache`, allowing fusions
to run CUDA kernels in `nvfuser_direct`.
Functions bound for `FusionExecutorCache`:
* `get_cuda_kernel`
* `get_most_recent_scheduled_ir`
* `get_scheduled_ir`
* `is_compiled`
* `execute`
Create `python/python_direct/direct_utils.h` for `python_direct`-only
helper functions
* Add `from_pyiterable` and `to_tensor_vector` to convert between
`at::Tensor` values and `KernelArgumentHolder`
New function for python `FusionDefinition`:
* `execute` - Creates a `FusionExecutorCache` if one does not already
exist, then runs the fusion with the given input arguments.
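
A minimal pure-Python sketch of the lazy-creation pattern behind `execute`, assuming the intended behavior is that the cache is built on the first call and reused afterwards. `ToyExecutorCache` and `ToyFusionDefinition` are hypothetical stand-ins written for illustration only; they are not the `nvfuser_direct` classes, and their signatures are simplified assumptions rather than the real bindings.

```python
from typing import Any, Callable, Iterable, List, Optional


class ToyExecutorCache:
    """Hypothetical stand-in for FusionExecutorCache (not the real class)."""

    def __init__(self, fn: Callable[..., Any]) -> None:
        self._fn = fn
        self._compiled = False

    def is_compiled(self) -> bool:
        # Simplified assumption: "compiled" flips to True after the first run.
        return self._compiled

    def execute(self, inputs: List[Any]) -> List[Any]:
        # A real cache would compile and launch a CUDA kernel here;
        # this toy version just calls a Python function.
        self._compiled = True
        return [self._fn(*inputs)]


class ToyFusionDefinition:
    """Hypothetical stand-in for the python FusionDefinition."""

    def __init__(self, fn: Callable[..., Any]) -> None:
        self._fn = fn
        self._cache: Optional[ToyExecutorCache] = None

    def execute(self, inputs: Iterable[Any]) -> List[Any]:
        # Create the cache only if it does not exist yet, then reuse it.
        if self._cache is None:
            self._cache = ToyExecutorCache(self._fn)
        return self._cache.execute(list(inputs))


fd = ToyFusionDefinition(lambda a, b: a + b)
first = fd.execute([1, 2])
cache_after_first = fd._cache
second = fd.execute((3, 4))
```

In this sketch the second `execute` call reuses the cache object created by the first, which is the property the binding is meant to provide.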
Testing
* Created `test_fusion_execution_cache` and `test_define_tensor`
PR Stack:
#4409 Create python FusionDefinition for nvfuser_next
#4513 Add bindings for FusionExecutorCache **<<< This PR.**
#4516 Add the remaining binary operations
#4517 Add the bindings for unary operations
#4518 Add the bindings for reduction operations
#4519 Move helper functions from python_frontend to python_common
#4520 Create python reproducer from Fusion IR for nvfuser_direct
#4521 Recreate python_frontend test_basic for nvfuser_direct