Respect min and max of inputs to create more precise repro scripts #4535


Open · wants to merge 8 commits into main
Conversation

crcrpar (Collaborator) commented May 28, 2025

Related: #4529

When an nvFuser fusion definition takes signed int tensors as inputs, those tensors may have a meaningful value range to sample from, e.g. [0, vocab_size) for embedding indices. FusionDefinition.last_repro_script loses this range information because it uses FusionDefinition.fake_inputs, which is generated by fakifying the input tensors and therefore discards the actual values.

Inspired by lightning-thunder's ExampleInputMetaData, this PR instead stores the range information in a dataclass that can emit a make_tensor call (or equivalent) to produce tensors with the same value range.
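To make the idea concrete, here is a minimal sketch of such a dataclass. The names (TensorRangeInfo, from_tensor, repro_str) are illustrative, not the PR's actual API; it only shows the mechanism of capturing aminmax plus metadata and emitting a make_tensor string.

```python
# Hypothetical sketch, not the PR's actual implementation.
from dataclasses import dataclass
import torch


@dataclass
class TensorRangeInfo:
    shape: tuple
    dtype: torch.dtype
    device: str
    low: "int | float | None"
    high: "int | float | None"

    @classmethod
    def from_tensor(cls, t: torch.Tensor) -> "TensorRangeInfo":
        lo, hi = None, None
        if t.numel() > 0 and t.dtype != torch.bool:
            # Capture the observed value range of the real input tensor.
            amin, amax = torch.aminmax(t)
            lo, hi = amin.item(), amax.item()
        return cls(tuple(t.shape), t.dtype, str(t.device), lo, hi)

    def repro_str(self) -> str:
        # Emit a make_tensor call reproducing a tensor with the same range.
        # Note: for integer dtypes, make_tensor's high bound is exclusive,
        # so a real implementation may need high + 1 to include the max.
        return (
            f"torch.testing.make_tensor({self.shape}, dtype={self.dtype}, "
            f'device="{self.device}", low={self.low}, high={self.high},)'
        )


t = torch.tensor([3, 7, 5], dtype=torch.int32)
info = TensorRangeInfo.from_tensor(t)
print(info.repro_str())
```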

For the same fusion definition as in #4529, running it as follows

inputs = [
    torch.randint(0, 10, (16,), dtype=torch.int32, device='cuda:0'),
    torch.testing.make_tensor((151936, 2048), dtype=torch.bfloat16, device='cuda:0'),
    torch.testing.make_tensor((2048,), dtype=torch.bfloat16, device='cuda:0'),
    torch.testing.make_tensor((2560, 2048), dtype=torch.bfloat16, device='cuda:0'),
    torch.testing.make_tensor((2560,), dtype=torch.bfloat16, device='cuda:0'),
]
fd.execute(inputs, save_repro_inputs=True, print_repro=True)

both fd.last_repro_script() and the fd.execute call (with print_repro=True) produce

inputs = [
    torch.testing.make_tensor((16,), dtype=torch.int32, device="cuda:0", low=0, high=7,),
    torch.testing.make_tensor((151936, 2048), dtype=torch.bfloat16, device="cuda:0", low=-9.0, high=8.9375,),
    torch.testing.make_tensor((2048,), dtype=torch.bfloat16, device="cuda:0", low=-9.0, high=8.9375,),
    torch.testing.make_tensor((2560, 2048), dtype=torch.bfloat16, device="cuda:0", low=-9.0, high=8.9375,),
    torch.testing.make_tensor((2560,), dtype=torch.bfloat16, device="cuda:0", low=-9.0, high=8.9375,),
]

@crcrpar crcrpar requested a review from Copilot May 28, 2025 19:45
Copilot AI left a comment

Pull Request Overview

This PR improves the reproducibility of fusion definitions by preserving the original input range information instead of using generic fake inputs. The key changes include:

  • Addition of the new dataclass InputReproducer to capture tensor metadata (min/max, shape, strides, etc.).
  • Replacement of fake_inputs with last_input_reproducers in the execute and last_repro_script methods.
  • Updated type annotations and the repro_script_for logic to use the new dataclass.

is_contiguous: bool

def __init__(self, tensor: torch.Tensor) -> None:
    if type(tensor) is not torch.Tensor:
Copilot AI May 28, 2025


Consider using isinstance(tensor, torch.Tensor) instead of comparing types directly to better support tensor subclasses.

Suggested change
if type(tensor) is not torch.Tensor:
if not isinstance(tensor, torch.Tensor):


crcrpar (Collaborator, Author) replied May 28, 2025


Here I want to make sure tensor is compatible with torch.aminmax. What I really want to filter out are tensor subclasses that we cannot reconstruct with torch.testing.make_tensor, torch.rand, or torch.randint, so isinstance would not suffice.
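The distinction being discussed can be illustrated with torch.nn.Parameter, a built-in Tensor subclass: isinstance accepts it, while the exact type() check filters it out.

```python
# isinstance vs. exact type() check on a Tensor subclass.
import torch

t = torch.ones(3)
p = torch.nn.Parameter(torch.ones(3))

# Both pass an isinstance check, since Parameter subclasses Tensor.
assert isinstance(t, torch.Tensor)
assert isinstance(p, torch.Tensor)

# Only the plain tensor passes the exact-type check;
# the subclass is filtered out.
assert type(t) is torch.Tensor
assert type(p) is not torch.Tensor
```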

@crcrpar crcrpar force-pushed the store-value-ranges-for-last-repro-script branch from 0a8a727 to 4873dd2 Compare May 29, 2025 09:00

fake_mode = FakeTensorMode()
self.fake_inputs = [fake_mode.from_tensor(inp) for inp in inputs]
self.last_input_reproducers = [InputReproducer(tensor) for tensor in inputs]
A reviewer commented:

Do we also need additional information, such as requires_grad or storage_offset? In Thunder, when this API is used for reporting, it also generates a corresponding Torch operator function for the nvFuser region. To ensure correctness, it constructs input tensors for this function that include properties like requires_grad and storage_offset (which were previously retrieved from FakeTensor).

crcrpar (Collaborator, Author) replied:

I chose to keep fake_inputs and to add last_input_reproducers with storage_offset and requires_grad.
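A sketch of how storage_offset and requires_grad could be reapplied when rebuilding an input from its recorded metadata; the rebuild helper and its signature are assumptions for illustration, not the PR's actual code.

```python
# Hypothetical helper: reconstruct a strided tensor view from metadata.
import torch


def rebuild(shape, strides, storage_offset, requires_grad, dtype, device, low, high):
    # Allocate a flat buffer just large enough to cover the strided view,
    # then carve out the view with as_strided so offset/strides round-trip.
    needed = storage_offset + sum((s - 1) * st for s, st in zip(shape, strides)) + 1
    base = torch.testing.make_tensor(
        (needed,), dtype=dtype, device=device, low=low, high=high
    )
    t = base.as_strided(shape, strides, storage_offset)
    # Reapply the autograd flag last, since make_tensor produced a plain leaf.
    return t.requires_grad_(requires_grad)


t = rebuild((2, 3), (3, 1), 1, True, torch.float32, "cpu", -1.0, 1.0)
print(t.shape, t.storage_offset(), t.requires_grad)
```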

@crcrpar crcrpar force-pushed the store-value-ranges-for-last-repro-script branch from bb17536 to 6472dca Compare June 2, 2025 13:24