debug on KJT issue #2058

TroyGarden · 2024-05-29T22:34:49Z

Summary:

context

In the IR export workflow, the module takes KJTs (KeyedJaggedTensor) as input and produces an ExportedProgram of the module
The values and weights of a KJT have variable length
This dynamic shape must be passed explicitly to torch.export
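
To make the variable length concrete, here is a minimal pure-Python sketch of a jagged layout. `JaggedFeature` is a hypothetical stand-in, not the torchrec `KeyedJaggedTensor` API, and the `lengths` below are assumed for illustration; only the `values` are taken from the results in this PR. The point is that `lengths` has one entry per (key, sample) slot, so its size is fixed by the keys and batch size, while `values` grows with the data.

```python
from dataclasses import dataclass
from itertools import accumulate
from typing import List


@dataclass
class JaggedFeature:
    """Hypothetical stand-in for a KJT: flat values plus per-slot lengths."""

    keys: List[str]
    values: List[int]
    lengths: List[int]  # one entry per (key, sample) slot

    def offsets(self) -> List[int]:
        # Prefix sums of lengths: slot i covers values[offsets[i]:offsets[i + 1]].
        return [0, *accumulate(self.lengths)]

    def slot(self, i: int) -> List[int]:
        o = self.offsets()
        return self.values[o[i]:o[i + 1]]


# Same keys and batch size (2), different number of values.
# The lengths are assumed; they only need to sum to len(values).
feature1 = JaggedFeature(["f1", "f2"], [0, 1, 2, 3, 2, 3], [2, 0, 1, 3])
feature2 = JaggedFeature(["f1", "f2"], [0, 1, 2, 3, 2, 3, 4], [2, 0, 1, 4])
```

Because only the length of `values` (and of `weights`, when present) changes between inputs, that is the dimension an exporter has to be told is dynamic.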

changes

add a util function to mark the dynamic shapes of the input KJT
add a test showing how to correctly specify the dynamic shapes for the input KJT

results

input KJTs with different value lengths

(Pdb) feature1.values()
tensor([0, 1, 2, 3, 2, 3])
(Pdb) feature2.values()
tensor([0, 1, 2, 3, 2, 3, 4])

exported_program can take those input KJTs

(Pdb) ep.module()(feature1)
[tensor([[-2.8735e-16, -2.8735e-16, -2.8735e-16, -2.8735e-16],
        [-2.8735e-16, -2.8735e-16, -2.8735e-16, -2.8735e-16]]), tensor([[-2.8735e-16, -2.8735e-16, -2.8735e-16, -2.8735e-16, -1.4368e-15,
         -1.4368e-15, -1.4368e-15, -1.4368e-15],
        [-2.8735e-16, -2.8735e-16, -2.8735e-16, -2.8735e-16, -1.4368e-15,
         -1.4368e-15, -1.4368e-15, -1.4368e-15]])]
(Pdb) ep.module()(feature2)
[tensor([[-2.8735e-16, -2.8735e-16, -2.8735e-16, -2.8735e-16],
        [-2.8735e-16, -2.8735e-16, -2.8735e-16, -2.8735e-16]]), tensor([[-2.8735e-16, -2.8735e-16, -2.8735e-16, -2.8735e-16, -1.4368e-15,
         -1.4368e-15, -1.4368e-15, -1.4368e-15],
        [-2.8735e-16, -2.8735e-16, -2.8735e-16, -2.8735e-16, -1.4368e-15,
         -1.4368e-15, -1.4368e-15, -1.4368e-15]])]

deserialized module can take those input KJTs

(Pdb) deserialized_model(feature1)
[tensor([[ 0.2630,  0.1473, -0.3691,  0.2261],
        [ 0.0000,  0.0000,  0.0000,  0.0000]],
       grad_fn=<SplitWithSizesBackward0>), tensor([[ 0.2198, -0.1648, -0.0121,  0.1998, -0.0384, -0.2458, -0.6844,  0.8741],
        [ 0.1313,  0.2968, -0.2979, -0.2150, -0.2593,  0.6758,  1.0010,  0.9052]],
       grad_fn=<SplitWithSizesBackward0>)]
(Pdb) deserialized_model(feature2)
[tensor([[ 0.2630,  0.1473, -0.3691,  0.2261],
        [ 0.0000,  0.0000,  0.0000,  0.0000]],
       grad_fn=<SplitWithSizesBackward0>), tensor([[ 0.2198, -0.1648, -0.0121,  0.1998, -0.0384, -0.2458, -0.6844,  0.8741],
        [ 0.1313,  0.2968, -0.2979, -0.2150, -0.9359,  0.1123,  0.5834, -0.1357]],
       grad_fn=<SplitWithSizesBackward0>)]
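
The outputs above keep the same shape for both inputs even though the inputs hold a different number of values. That is the expected behavior of pooling over a jagged feature: one output row per sample, regardless of how many ids the sample holds. A minimal pure-Python sketch of sum pooling follows; the embedding table, the lengths, and the `sum_pool` helper are made up for illustration and are not the module used in this PR.

```python
from typing import List


def sum_pool(
    table: List[List[float]], values: List[int], lengths: List[int]
) -> List[List[float]]:
    """Sum the embedding rows of each sample's ids; returns one row per sample."""
    dim = len(table[0])
    out, start = [], 0
    for n in lengths:
        row = [0.0] * dim
        for idx in values[start:start + n]:
            row = [a + b for a, b in zip(row, table[idx])]
        out.append(row)
        start += n
    return out


# Hypothetical table: 5 ids, embedding dimension 2.
table = [[float(i), float(i) * 10] for i in range(5)]

# One key, batch size 2; the second sample gains one extra id in the second input.
out1 = sum_pool(table, [0, 1, 2, 3, 2, 3], [2, 4])
out2 = sum_pool(table, [0, 1, 2, 3, 2, 3, 4], [2, 5])
```

Both `out1` and `out2` have two rows of width 2; only the row for the sample that received the extra id differs.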

Differential Revision: D57824907

facebook-github-bot · 2024-05-29T22:34:56Z

This pull request was exported from Phabricator. Differential Revision: D57824907


facebook-github-bot · 2024-05-29T22:35:57Z

This pull request was exported from Phabricator. Differential Revision: D57824907


facebook-github-bot · 2024-05-29T23:16:41Z

This pull request was exported from Phabricator. Differential Revision: D57824907


facebook-github-bot · 2024-05-30T02:05:03Z

This pull request was exported from Phabricator. Differential Revision: D57824907

Reviewed By: PaulZhang12

facebook-github-bot · 2024-06-03T16:52:53Z

This pull request was exported from Phabricator. Differential Revision: D57824907

Pull Request resolved: #2058
Reviewed By: PaulZhang12
fbshipit-source-id: 615f602314e6517dba37e83eea5066de5950dc42

facebook-github-bot added the CLA Signed label May 29, 2024

facebook-github-bot added the fb-exported label May 29, 2024

TroyGarden force-pushed the export-D57824907 branch from 360f8f3 to f5bbdb1 May 29, 2024 22:35

TroyGarden force-pushed the export-D57824907 branch from f5bbdb1 to 6acc122 May 29, 2024 23:16

TroyGarden force-pushed the export-D57824907 branch from 6acc122 to cb5d996 May 30, 2024 02:04

TroyGarden force-pushed the export-D57824907 branch from cb5d996 to 2535136 June 3, 2024 16:52

facebook-github-bot closed this in cd470f8 Jun 3, 2024
