Add work-in-progress for visualizing gradients tutorial (issue #3186) #3389

j-silv · 2025-06-09T21:43:44Z

Description

Add initial draft for visualizing gradients tutorial. Link is here

This write-up starts by discussing the difference between leaf and non-leaf tensors, and the associated requires_grad and retains_grad class attributes.

It then walks through a real-world example of visualizing gradients by using retain_grad() in a more complicated neural network such as ResNet (this part is a work in progress).

I put the tutorial in the advanced_source directory but perhaps it would be better sorted as an intermediate tutorial or a recipe. Open to suggestions.

What's written so far is how I imagined structuring the tutorial. If you have any comments about the overall flow or material, let me know. Feel free to comment on the wording and tone as well; just know that I plan on revising the tutorial, as this is only the first pass.

Checklist

The issue that is being fixed is referred in the description (see above "Fixes #ISSUE_NUMBER")
Only one issue is addressed in this pull request
Labels from the issue that this PR is fixing are added to this pull request
No unnecessary issues are included into this pull request.


pytorch-bot · 2025-06-09T21:43:48Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/3389

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 0b9f56a with merge base 06f9c4b:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

advanced_source/visualizing_gradients_tutorial.py

sekyondaMeta · 2025-06-10T14:04:40Z

Generally seems to be headed in the right direction in terms of tone and organization from my perspective.
Can you add the prerequisite knowledge for this?

advanced_source/visualizing_gradients_tutorial.py

soulitzer

Thanks for working on this tutorial. Overall, though, I'd say that this section (prior to the actual visualizing gradients part) can be much shorter.

By the end of this tutorial, you will be able to:

Differentiate between leaf and non-leaf tensors

Have a diagram from https://github.com/szagoruyko/pytorchviz, and point to the leaves

Know when to use ``retain_grad`` vs. ``require_grad``

"use requires_grad for leaf, use retain_grad for non-leaf"

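To make the rule of thumb above concrete, here is a minimal sketch. The tensor names (`w`, `h`, `z`) and values are made up for illustration and are not taken from the tutorial draft.

```python
import warnings

import torch

# Leaf tensor: created directly by the user, so requires_grad=True is enough.
w = torch.tensor([3.0], requires_grad=True)

# Non-leaf tensors: results of operations on w.
h = w * 2
h.retain_grad()  # ask autograd to keep h.grad after backward()
z = w * 5        # no retain_grad(), so z.grad stays unpopulated

loss = (h * h).sum() + z.sum()
loss.backward()

with warnings.catch_warnings():
    # Reading .grad on a non-leaf tensor without retain_grad() emits a UserWarning.
    warnings.simplefilter("ignore")
    z_grad = z.grad

print(f"w: is_leaf={w.is_leaf}, retains_grad={w.retains_grad}, grad={w.grad}")
print(f"h: is_leaf={h.is_leaf}, retains_grad={h.retains_grad}, grad={h.grad}")
print(f"z: is_leaf={z.is_leaf}, retains_grad={z.retains_grad}, grad={z_grad}")
```

Only `w` (leaf with `requires_grad=True`) and `h` (non-leaf with `retain_grad()`) end up with a populated `.grad`.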
Still a work in progress, but I significantly reduced the first section and added some helpful images for the computational graph. I also added links for most terms. The WIP section with ResNet I still have to debug. I'm not sure my method for retaining the intermediate gradients is valid. See discussion on pull request.

j-silv · 2025-06-13T00:00:17Z

Thank you for the comments, they were really helpful. Let me know if you think the first section is still too long.

Concerning the "visualizing gradients" section with an actual example, I'm not sure if I'm going about retaining the gradients for intermediate tensors correctly. My thought process was to use a forward hook, call retain_grad() on the output tensor of that module, and then store that output tensor in a list. Later, after calling loss.backward(), I could then pluck out the grad attribute of that tensor and plot it.
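For reference, the approach described above could be sketched roughly as follows. This uses a small stand-in `nn.Sequential` model rather than ResNet and deliberately avoids in-place operations, so it does not settle whether the method behaves as intended on ResNet. The names `activations` and `make_hook` are hypothetical.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Stand-in model (an assumption for illustration); ReLU is out-of-place here.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

activations = {}  # module name -> output tensor with a retained gradient


def make_hook(name):
    def hook(module, inputs, output):
        # The output is a non-leaf tensor, so retain_grad() is needed
        # for output.grad to be populated by backward().
        output.retain_grad()
        activations[name] = output
    return hook


handles = [
    module.register_forward_hook(make_hook(name))
    for name, module in model.named_children()
]

x = torch.randn(3, 4)
loss = model(x).sum()
loss.backward()

for name, out in activations.items():
    print(name, tuple(out.grad.shape), out.grad.abs().mean().item())

# Remove the hooks once the gradients have been collected.
for handle in handles:
    handle.remove()
```

The stored tensors keep the graph alive, so for a larger model it may be preferable to copy out `out.grad` and clear `activations` after each step.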

Initially I tried using a backward pass hook like register_full_backward_hook() but this didn't work because the ResNet model performs some inplace operations (i.e. ReLU and one += addition) and PyTorch complains about it:

RuntimeError: Output 0 of BackwardHookFunctionBackward is a view and is being modified inplace. This view was created inside a custom Function (or because an input was returned as-is) and the autograd logic to handle view+inplace would override the custom backward associated with the custom Function, leading to incorrect gradients. This behavior is forbidden. You can fix this by cloning the output of the custom Function.
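A small sketch of the situation described above, again on a toy model rather than ResNet: a full backward hook on a layer whose output feeds an in-place ReLU, compared with the same model using an out-of-place ReLU. The helper `run` is hypothetical, and the in-place failure is only expected based on the error reported in this thread, so the code catches it rather than assuming it.

```python
import torch
from torch import nn


def run(inplace):
    """Run forward/backward with a full backward hook on the Linear layer."""
    torch.manual_seed(0)
    model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(inplace=inplace))
    captured = []

    def hook(module, grad_input, grad_output):
        # grad_output[0] is the gradient w.r.t. the Linear layer's output.
        captured.append(grad_output[0].detach().clone())

    model[0].register_full_backward_hook(hook)
    x = torch.randn(3, 4, requires_grad=True)
    try:
        model(x).sum().backward()
    except RuntimeError as err:
        # Expected for inplace=True, per the error quoted in this thread.
        return captured, str(err)
    return captured, None


out_of_place_grads, out_of_place_error = run(inplace=False)
inplace_grads, inplace_error = run(inplace=True)

print("out-of-place:", len(out_of_place_grads), "gradient(s), error:", out_of_place_error)
print("in-place error:", inplace_error)
```

Swapping in out-of-place ops is one possible workaround, but it means modifying the model, which may not be desirable for a tutorial built on a stock ResNet.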

I know that I can plot the gradients for the parameters by just looping through the named_parameters() but I would like to also plot the gradients for the intermediate tensors.
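The parameter-gradient loop mentioned above might look like the sketch below. It uses a toy model and collects one summary number per parameter (mean absolute gradient) instead of plotting; `grad_stats` is a hypothetical name, and the values could be passed to any plotting library.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy model standing in for the real network (an assumption for illustration).
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

loss = model(torch.randn(3, 4)).sum()
loss.backward()

# Parameters are leaf tensors, so .grad is populated without retain_grad().
grad_stats = {
    name: param.grad.abs().mean().item()
    for name, param in model.named_parameters()
    if param.grad is not None
}

for name, mean_abs_grad in grad_stats.items():
    print(f"{name}: mean |grad| = {mean_abs_grad:.4f}")
```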

If anyone sees a problem with my method let me know. The current state of the code isn't doing what I expected so I still have to debug it.

Add work-in-progress for visualizing gradients tutorial (issue pytorc…

031ba22

…h#3186)

facebook-github-bot added the cla signed label Jun 9, 2025

github-actions bot added the advanced, docathon-h1-2025, hard, and tutorial-proposal labels Jun 9, 2025

sekyondaMeta reviewed Jun 10, 2025
