PyTorch3D fails to compile with Torch 2.7 and CUDA 12.8

I need to compile PyTorch3D for the RTX 5000 series.

I can compile Flash Attention, Sage Attention, xFormers, and many other packages in this setup, but pytorch3d fails as shown below.


Full logs as a text file:

[full logs.txt](https://github.com/user-attachments/files/19724798/full.logs.txt)


```
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.8/include\cuda/std/__tuple_dir/vector_types.h(88): error: expected a ">"
  template <> struct tuple_size<unsigned short1> : ::cuda::std::__4::integral_constant<size_t, 1> {}; template <size_t _Ip> struct tuple_element<_Ip, unsigned short1> { static_assert(_Ip < 1,"tuple_element index out of range"); using type = unsigned short; }; template <> struct tuple_size<unsigned short2> : ::cuda::std::__4::integral_constant<size_t, 2> {}; template <size_t _Ip> struct tuple_element<_Ip, unsigned short2> { static_assert(_Ip < 2, "tuple_element index out of range"); using type = unsigned short; }; template <> struct tuple_size<unsigned short3> : ::cuda::std::__4::integral_constant<size_t, 3> {}; template <size_t _Ip> struct tuple_element<_Ip, unsigned short3> { static_assert(_Ip < 3, "tuple_element index out of range"); using type = unsigned short; }; template <> struct tuple_size<unsigned short4> : ::cuda::std::__4::integral_constant<size_t, 4> {}; template <size_t _Ip> struct tuple_element<_Ip, unsigned short4> { static_assert(_Ip < 4, "tuple_element index out of range"); using type = unsigned short; };
                                         ^
```

```
16 errors detected in the compilation of "C:/f/pytorch3d/pytorch3d/csrc/pulsar/gpu/renderer.backward.gpu.cu".
renderer.backward.gpu.cu
```

```
FAILED: C:/f/pytorch3d/build/temp.win-amd64-cpython-310/Release/f/pytorch3d/pytorch3d/csrc/pulsar/gpu/renderer.calc_signature.gpu.obj

FAILED: C:/f/pytorch3d/build/temp.win-amd64-cpython-310/Release/f/pytorch3d/pytorch3d/csrc/pulsar/gpu/renderer.construct.gpu.obj

FAILED: C:/f/pytorch3d/build/temp.win-amd64-cpython-310/Release/f/pytorch3d/pytorch3d/csrc/pulsar/gpu/renderer.calc_signature.gpu.obj

```
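Because the full log is long, a short script can list which object files failed and how many errors each translation unit reported. This is only a sketch: it assumes the log is saved as plain text and contains `FAILED: <path>` and `N errors detected in the compilation of "<file>".` lines in the same shape as the excerpts above. The function name `summarize_log` is hypothetical.

```python
import re

# Patterns assumed from the excerpts above; other ninja/nvcc versions may differ.
FAILED_RE = re.compile(r"^FAILED: (.+)$", re.MULTILINE)
ERRORS_RE = re.compile(
    r'^(\d+) errors? detected in the compilation of "(.+)"\.', re.MULTILINE
)


def summarize_log(text):
    """Return (sorted unique failed targets, {source file: error count})."""
    # A set removes targets that appear more than once in the log.
    failed = sorted({target.strip() for target in FAILED_RE.findall(text)})
    errors = {path: int(count) for count, path in ERRORS_RE.findall(text)}
    return failed, errors
```

For example, `summarize_log(open("full.logs.txt", encoding="utf-8", errors="replace").read())` would return the de-duplicated `FAILED` targets together with the per-file error counts, assuming the attached log was saved under that name.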




```
Microsoft Windows [Version 10.0.26100.3775]
(c) Microsoft Corporation. All rights reserved.

C:\f\venv310\Scripts>activate

(venv310) C:\f\venv310\Scripts>cd..

(venv310) C:\f\venv310>cd..

(venv310) C:\f>git clone --recurse-submodules https://github.com/facebookresearch/pytorch3d
Cloning into 'pytorch3d'...
remote: Enumerating objects: 14967, done.
remote: Counting objects: 100% (2529/2529), done.
remote: Compressing objects: 100% (294/294), done.
remote: Total 14967 (delta 2312), reused 2241 (delta 2235), pack-reused 12438 (from 2)
Receiving objects: 100% (14967/14967), 51.43 MiB | 35.34 MiB/s, done.
Resolving deltas: 100% (10407/10407), done.

(venv310) C:\f>cd pytorch3d

(venv310) C:\f\pytorch3d>python setup.py build_ext bdist_wheel
C:\f\pytorch3d\setup.py:84: UserWarning: The environment variable `CUB_HOME` was not found. NVIDIA CUB is required for compilation and can be downloaded from `https://github.com/NVIDIA/cub/releases`. You can unpack it to a location of your choice and set the environment variable `CUB_HOME` to the folder containing the `CMakeListst.txt` file.
  warnings.warn(
running build_ext
building 'pytorch3d._C' extension
creating C:\f\pytorch3d\build
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\ball_query
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\blending
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\compositing
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\face_areas_normals
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\gather_scatter
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\interp_face_attrs
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\iou_box3d
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\knn
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\marching_cubes
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\mesh_normal_consistency
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\packed_to_padded_tensor
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\point_mesh
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\points_to_volumes
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\pulsar
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\pulsar\gpu
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\pulsar\host
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\pulsar\pytorch
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\rasterize_coarse
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\rasterize_meshes
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\rasterize_points
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\sample_farthest_points
creating C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\sample_pdf
C:\f\venv310\lib\site-packages\torch\utils\cpp_extension.py:2330: UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'].
  warnings.warn(
Emitting ninja build file C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/67] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\bin\nvcc --generate-dependencies-with-compile --dependency-output C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\pulsar\gpu\renderer.backward.gpu.obj.d -std=c++17 --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /wd4624 -Xcompiler /wd4067 -Xcompiler /wd4068 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -DWITH_CUDA -DTHRUST_IGNORE_CUB_VERSION_CHECK -IC:\f\pytorch3d\pytorch3d\csrc -IC:\f\venv310\lib\site-packages\torch\include -IC:\f\venv310\lib\site-packages\torch\include\torch\csrc\api\include "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\include" -IC:\f\venv310\include -IC:\Python310\include -IC:\Python310\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.26100.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -c C:\f\pytorch3d\pytorch3d\csrc\pulsar\gpu\renderer.backward.gpu.cu -o C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\pulsar\gpu\renderer.backward.gpu.obj -D__CUDA_NO_HALF_OPERATORS__ 
-D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_120,code=sm_120 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86
FAILED: C:/f/pytorch3d/build/temp.win-amd64-cpython-310/Release/f/pytorch3d/pytorch3d/csrc/pulsar/gpu/renderer.backward.gpu.obj
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\bin\nvcc --generate-dependencies-with-compile --dependency-output C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\pulsar\gpu\renderer.backward.gpu.obj.d -std=c++17 --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /wd4624 -Xcompiler /wd4067 -Xcompiler /wd4068 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -DWITH_CUDA -DTHRUST_IGNORE_CUB_VERSION_CHECK -IC:\f\pytorch3d\pytorch3d\csrc -IC:\f\venv310\lib\site-packages\torch\include -IC:\f\venv310\lib\site-packages\torch\include\torch\csrc\api\include "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\include" -IC:\f\venv310\include -IC:\Python310\include -IC:\Python310\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.26100.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -c C:\f\pytorch3d\pytorch3d\csrc\pulsar\gpu\renderer.backward.gpu.cu -o C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\pulsar\gpu\renderer.backward.gpu.obj -D__CUDA_NO_HALF_OPERATORS__ 
-D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_120,code=sm_120 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.8/include\cuda/std/__tuple_dir/vector_types.h(88): error: expected a ">"
  template <> struct tuple_size<unsigned short1> : ::cuda::std::__4::integral_constant<size_t, 1> {}; template <size_t _Ip> struct tuple_element<_Ip, unsigned short1> { static_assert(_Ip < 1, "tuple_element index out of range"); using type = unsigned short; }; template <> struct tuple_size<unsigned short2> : ::cuda::std::__4::integral_constant<size_t, 2> {}; template <size_t _Ip> struct tuple_element<_Ip, unsigned short2> { static_assert(_Ip < 2, "tuple_element index out of range"); using type = unsigned short; }; template <> struct tuple_size<unsigned short3> : ::cuda::std::__4::integral_constant<size_t, 3> {}; template <size_t _Ip> struct tuple_element<_Ip, unsigned short3> { static_assert(_Ip < 3, "tuple_element index out of range"); using type = unsigned short; }; template <> struct tuple_size<unsigned short4> : ::cuda::std::__4::integral_constant<size_t, 4> {}; template <size_t _Ip> struct tuple_element<_Ip, unsigned short4> { static_assert(_Ip < 4, "tuple_element index out of range"); using type = unsigned short; };
                                         ^

C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.8/include\cuda/std/__tuple_dir/vector_types.h(88): error: expected a ">"
  template <> struct tuple_size<unsigned short1> : ::cuda::std::__4::integral_constant<size_t, 1> {}; template <size_t _Ip> struct tuple_element<_Ip, unsigned short1> { static_assert(_Ip < 1, "tuple_element index out of range"); using type = unsigned short; }; template <> struct tuple_size<unsigned short2> : ::cuda::std::__4::integral_constant<size_t, 2> {}; template <size_t _Ip> struct tuple_element<_Ip, unsigned short2> { static_assert(_Ip < 2, "tuple_element index out of range"); using type = unsigned short; }; template <> struct tuple_size<unsigned short3> : ::cuda::std::__4::integral_constant<size_t, 3> {}; template <size_t _Ip> struct tuple_element<_Ip, unsigned short3> { static_assert(_Ip < 3, "tuple_element index out of range"); using type = unsigned short; }; template <> struct tuple_size<unsigned short4> : ::cuda::std::__4::integral_constant<size_t, 4> {}; template <size_t _Ip> struct tuple_element<_Ip, unsigned short4> { static_assert(_Ip < 4, "tuple_element index out of range"); using type = unsigned short; };
                                                                                                                                                               ^

C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.8/include\cuda/std/__tuple_dir/vector_types.h(88): error: expected a ">"
  template <> struct tuple_size<unsigned short1> : ::cuda::std::__4::integral_constant<size_t, 1> {}; template <size_t _Ip> struct tuple_element<_Ip, unsigned short1> { static_assert(_Ip < 1, "tuple_element index out of range"); using type = unsigned short; }; template <> struct tuple_size<unsigned short2> : ::cuda::std::__4::integral_constant<size_t, 2> {}; template <size_t _Ip> struct tuple_element<_Ip, unsigned short2> { static_assert(_Ip < 2, "tuple_element index out of range"); using type = unsigned short; }; template <> struct tuple_size<unsigned short3> : ::cuda::std::__4::integral_constant<size_t, 3> {}; template <size_t _Ip> struct tuple_element<_Ip, unsigned short3> { static_assert(_Ip < 3, "tuple_element index out of range"); using type = unsigned short; }; template <> struct tuple_size<unsigned short4> : ::cuda::std::__4::integral_constant<size_t, 4> {}; template <size_t _Ip> struct tuple_element<_Ip, unsigned short4> { static_assert(_Ip < 4, "tuple_element index out of range"); using type = unsigned short; };
                                                                                                                                                                                                                                                                                                            ^

C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.8/include\cuda/std/__tuple_dir/vector_types.h(88): error: expected a ">"
  template <> struct tuple_size<unsigned short1> : ::cuda::std::__4::integral_constant<size_t, 1> {}; template <size_t _Ip> struct tuple_element<_Ip, unsigned short1> { static_assert(_Ip < 1, "tuple_element index out of range"); using type = unsigned short; }; template <> struct tuple_size<unsigned short2> : ::cuda::std::__4::integral_constant<size_t, 2> {}; template <size_t _Ip> struct tuple_element<_Ip, unsigned short2> { static_assert(_Ip < 2, "tuple_element index out of range"); using type = unsigned short; }; template <> struct tuple_size<unsigned short3> : ::cuda::std::__4::integral_constant<size_t, 3> {}; template <size_t _Ip> struct tuple_element<_Ip, unsigned short3> { static_assert(_Ip < 3, "tuple_element index out of range"); using type = unsigned short; }; template <> struct tuple_size<unsigned short4> : ::cuda::std::__4::integral_constant<size_t, 4> {}; template <size_t _Ip> struct tuple_element<_Ip, unsigned short4> { static_assert(_Ip < 4, "tuple_element index out of range"); using type = unsigned short; };
       

...

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

point_mesh_cuda.cu
tmpxft_0000d230_00000000-7_point_mesh_cuda.compute_86.cudafe1.cpp
[32/67] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\bin\nvcc --generate-dependencies-with-compile --dependency-output C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\points_to_volumes\points_to_volumes.obj.d -std=c++17 --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /wd4624 -Xcompiler /wd4067 -Xcompiler /wd4068 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -DWITH_CUDA -DTHRUST_IGNORE_CUB_VERSION_CHECK -IC:\f\pytorch3d\pytorch3d\csrc -IC:\f\venv310\lib\site-packages\torch\include -IC:\f\venv310\lib\site-packages\torch\include\torch\csrc\api\include "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\include" -IC:\f\venv310\include -IC:\Python310\include -IC:\Python310\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.26100.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -c C:\f\pytorch3d\pytorch3d\csrc\points_to_volumes\points_to_volumes.cu -o C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\points_to_volumes\points_to_volumes.obj 
-D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_120,code=sm_120 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.8/include\cuda/std/detail/libcxx/include/cmath(1032): warning #221-D: floating-point value does not fit in required floating-point type
    if (__r >= ::nextafter(static_cast<_RealT>(_MaxVal), ((float)(1e+300))))
                                                          ^

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

points_to_volumes.cu
tmpxft_0000d4bc_00000000-7_points_to_volumes.compute_86.cudafe1.cpp
[33/67] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\bin\nvcc --generate-dependencies-with-compile --dependency-output C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\iou_box3d\iou_box3d.obj.d -std=c++17 --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /wd4624 -Xcompiler /wd4067 -Xcompiler /wd4068 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -DWITH_CUDA -DTHRUST_IGNORE_CUB_VERSION_CHECK -IC:\f\pytorch3d\pytorch3d\csrc -IC:\f\venv310\lib\site-packages\torch\include -IC:\f\venv310\lib\site-packages\torch\include\torch\csrc\api\include "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\include" -IC:\f\venv310\include -IC:\Python310\include -IC:\Python310\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.26100.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -c C:\f\pytorch3d\pytorch3d\csrc\iou_box3d\iou_box3d.cu -o C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\iou_box3d\iou_box3d.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ 
-D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_120,code=sm_120 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.8/include\cuda/std/detail/libcxx/include/cmath(1032): warning #221-D: floating-point value does not fit in required floating-point type
    if (__r >= ::nextafter(static_cast<_RealT>(_MaxVal), ((float)(1e+300))))
                                                          ^

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

iou_box3d.cu
tmpxft_0000ac74_00000000-7_iou_box3d.compute_86.cudafe1.cpp
[34/67] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\bin\nvcc --generate-dependencies-with-compile --dependency-output C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\knn\knn.obj.d -std=c++17 --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /wd4624 -Xcompiler /wd4067 -Xcompiler /wd4068 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -DWITH_CUDA -DTHRUST_IGNORE_CUB_VERSION_CHECK -IC:\f\pytorch3d\pytorch3d\csrc -IC:\f\venv310\lib\site-packages\torch\include -IC:\f\venv310\lib\site-packages\torch\include\torch\csrc\api\include "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\include" -IC:\f\venv310\include -IC:\Python310\include -IC:\Python310\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.26100.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -c C:\f\pytorch3d\pytorch3d\csrc\knn\knn.cu -o C:\f\pytorch3d\build\temp.win-amd64-cpython-310\Release\f\pytorch3d\pytorch3d\csrc\knn\knn.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ 
-D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_120,code=sm_120 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.8/include\cuda/std/detail/libcxx/include/cmath(1032): warning #221-D: floating-point value does not fit in required floating-point type
    if (__r >= ::nextafter(static_cast<_RealT>(_MaxVal), ((float)(1e+300))))
                                                          ^

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

knn.cu
tmpxft_0000cd18_00000000-7_knn.compute_86.cudafe1.cpp
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "C:\f\venv310\lib\site-packages\torch\utils\cpp_extension.py", line 2480, in _run_ninja_build
    subprocess.run(
  File "C:\Python310\lib\subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\f\pytorch3d\setup.py", line 144, in <module>
    setup(
  File "C:\f\venv310\lib\site-packages\setuptools\__init__.py", line 87, in setup
    return distutils.core.setup(**attrs)
  File "C:\f\venv310\lib\site-packages\setuptools\_distutils\core.py", line 185, in setup
    return run_commands(dist)
  File "C:\f\venv310\lib\site-packages\setuptools\_distutils\core.py", line 201, in run_commands
    dist.run_commands()
  File "C:\f\venv310\lib\site-packages\setuptools\_distutils\dist.py", line 968, in run_commands
    self.run_command(cmd)
  File "C:\f\venv310\lib\site-packages\setuptools\dist.py", line 1217, in run_command
    super().run_command(command)
  File "C:\f\venv310\lib\site-packages\setuptools\_distutils\dist.py", line 987, in run_command
    cmd_obj.run()
  File "C:\f\venv310\lib\site-packages\setuptools\command\build_ext.py", line 84, in run
    _build_ext.run(self)
  File "C:\f\venv310\lib\site-packages\Cython\Distutils\old_build_ext.py", line 186, in run
    _build_ext.build_ext.run(self)
  File "C:\f\venv310\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 346, in run
    self.build_extensions()
  File "C:\f\venv310\lib\site-packages\torch\utils\cpp_extension.py", line 1007, in build_extensions
    build_ext.build_extensions(self)
  File "C:\f\venv310\lib\site-packages\Cython\Distutils\old_build_ext.py", line 195, in build_extensions
    _build_ext.build_ext.build_extensions(self)
  File "C:\f\venv310\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 466, in build_extensions
    self._build_extensions_serial()
  File "C:\f\venv310\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 492, in _build_extensions_serial
    self.build_extension(ext)
  File "C:\f\venv310\lib\site-packages\setuptools\command\build_ext.py", line 246, in build_extension
    _build_ext.build_extension(self, ext)
  File "C:\f\venv310\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 547, in build_extension
    objects = self.compiler.compile(
  File "C:\f\venv310\lib\site-packages\torch\utils\cpp_extension.py", line 975, in win_wrap_ninja_compile
    _write_ninja_file_and_compile_objects(
  File "C:\f\venv310\lib\site-packages\torch\utils\cpp_extension.py", line 2133, in _write_ninja_file_and_compile_objects
    _run_ninja_build(
  File "C:\f\venv310\lib\site-packages\torch\utils\cpp_extension.py", line 2496, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

(venv310) C:\f\pytorch3d>
```
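The log above warns that `TORCH_CUDA_ARCH_LIST` is not set, so the build targets every visible card (here `sm_120` and `sm_86`), and it notes that `MAX_JOBS` can override the ninja worker count. Below is a minimal sketch of pinning both before re-running the same build command. It narrows the build to one architecture and makes the output easier to read, but it is not claimed to fix the `vector_types.h` error. The helper names `build_env` and `run_build`, and the default values `12.0` and `4`, are assumptions chosen for illustration.

```python
import os
import subprocess
import sys


def build_env(arch_list="12.0", max_jobs=4, base=None):
    """Return a copy of the environment with the build settings pinned."""
    env = dict(os.environ if base is None else base)
    # Read by torch.utils.cpp_extension when choosing -gencode flags.
    env["TORCH_CUDA_ARCH_LIST"] = arch_list
    # Caps the number of parallel ninja workers.
    env["MAX_JOBS"] = str(max_jobs)
    return env


def run_build(source_dir, **kwargs):
    """Run the same command as in the log, inside source_dir."""
    return subprocess.run(
        [sys.executable, "setup.py", "build_ext", "bdist_wheel"],
        cwd=source_dir,
        env=build_env(**kwargs),
        check=True,
    )
```

Calling `run_build(r"C:\f\pytorch3d")` from the activated virtual environment would repeat the build with only the `12.0` architecture requested and four workers.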


Torch 2.7 and CUDA 12.8 compile failing #1970
