Assert that inputs are contiguous #418


Merged: 4 commits from assert_contiguous into NVIDIA-AI-IOT:master on Sep 18, 2020

Conversation

alsrgv (Contributor) commented Sep 18, 2020

@jaybdub, I've spent the last 24 hours figuring out why PyTorch and TensorRT predictions differed, and it turned out to be caused by a strided tensor that was not accounted for. With this change I'd like to ensure this doesn't happen again.
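
For context, a minimal illustration of the pitfall (the shapes here are illustrative, not from the PR): a transposed view looks unchanged at the Python level, but its raw memory no longer matches its shape, so any consumer that reads the buffer as row-major sees the wrong element order.

import torch

x = torch.randn(1, 3, 4, 4)
y = x.transpose(1, 2)        # a strided view: no data is copied
print(y.is_contiguous())     # False: memory layout no longer matches shape

# Anything that reads y's underlying buffer assuming row-major layout
# sees elements in the pre-transpose order, silently producing wrong results.
z = y.contiguous()           # materializes memory that matches the shape
print(z.is_contiguous())     # True
print(torch.equal(y, z))     # True: same values, different memory layout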

Can you suggest where I should add tests for it?

jaybdub (Contributor) commented Sep 18, 2020

Hi @alsrgv,

Thanks for making this PR. And apologies for the frustration this likely caused.

Since this doesn't fit the module test format, I'd lean towards pytest for this kind of change.

It would probably work to put a pytest test case in torch2trt/tests/test_contiguous.py.

Best,
John
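
A minimal sketch of what such a test could look like, assuming torch2trt's usual entry point torch2trt(module, [inputs]) and a CUDA device; the test actually merged in this PR may differ:

# torch2trt/tests/test_contiguous.py -- a sketch, not the merged file
import torch
from torch2trt import torch2trt


def test_contiguous():
    model = torch.nn.Conv2d(3, 3, kernel_size=1).cuda().eval()

    # Transposing a contiguous tensor yields a strided (non-contiguous) view.
    x = torch.randn(1, 4, 3, 4).cuda().transpose(1, 2)
    assert not x.is_contiguous()

    model_trt = torch2trt(model, [x])

    # With the fix, PyTorch and TensorRT agree even for strided inputs.
    assert torch.allclose(model(x), model_trt(x), atol=1e-2)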

alsrgv (Contributor, Author) commented Sep 18, 2020

Test added.

alsrgv (Contributor, Author) commented Sep 18, 2020

@jaybdub, any other tests / checks you want me to throw in?

jaybdub (Contributor) commented Sep 18, 2020

Just cloned / validated.

Looks good to me!

@jaybdub jaybdub merged commit b0cc8e7 into NVIDIA-AI-IOT:master Sep 18, 2020
jaybdub (Contributor) commented Sep 18, 2020

Thanks @alsrgv, this should save some others some frustration!

Feel free to reach out again if you run into any issues or have feature requests in the future.

Best,
John

@alsrgv alsrgv deleted the assert_contiguous branch September 18, 2020 21:57
alsrgv (Contributor, Author) commented Sep 18, 2020

Thanks! Oh, I do. I'd like to add a couple of things:

  1. Each TensorRT layer will get a name based on the op and its arguments (I'll turn tensors into shapes), e.g. batchnorm(Tensor[1, 3, 200, 200], epsilon=1e-6).

  2. As we convert layers, I'd like to expose a debug mode where each layer gets flagged, so it's easy to see what was missed. A little nicer version of https://github.com/NVIDIA-AI-IOT/torch2trt/blob/master/torch2trt/torch2trt.py#L282. (A sketch follows below.)

We'll send PRs for these as we get to them - do you have any feedback / direction before we start coding?
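
As a rough illustration of item 2, a hypothetical collector that records each intercepted call and whether a converter was registered; DebugLog and record are invented names, not existing torch2trt API:

# Hypothetical debug collector -- not existing torch2trt API. The idea is
# that the conversion context calls log.record(ctx.method_str, converter
# is not None) at the same point where the current warning is printed.
class DebugLog:
    def __init__(self):
        self.entries = []

    def record(self, method_str, has_converter):
        self.entries.append((method_str, has_converter))

    def missed(self):
        # Every intercepted torch call that had no registered converter.
        return [m for m, ok in self.entries if not ok]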

jaybdub (Contributor) commented Sep 19, 2020

These both sound like nice features!

For the layer naming implementation, I think it may work to wrap the TensorRT network, though I haven't considered it fully:

class ContextNetwork:

    ...

    def add_convolution(self, *args, **kwargs):
        layer = self.network.add_convolution(*args, **kwargs)
        # or something similar, with input/output shapes, layer count, etc.
        layer.name = self.ctx.method_str + ...
        return layer

In this case, we could replace the plain ctx.network and avoid modifying converters manually. Let me know if you've considered this more fully.
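
One way to wrap the network without enumerating every add_* method by hand is attribute delegation; a sketch under the same assumption that ctx.method_str holds the name of the torch call currently being converted:

import functools


class ContextNetwork:
    """Wraps a tensorrt.INetworkDefinition, naming layers as they are added."""

    def __init__(self, network, ctx):
        self._network = network
        self._ctx = ctx
        self._layer_count = 0

    def __getattr__(self, name):
        # Only fires for attributes this wrapper doesn't define itself,
        # so everything besides add_* calls passes straight through.
        attr = getattr(self._network, name)
        if not (name.startswith('add_') and callable(attr)):
            return attr

        @functools.wraps(attr)
        def wrapper(*args, **kwargs):
            layer = attr(*args, **kwargs)
            layer.name = '%s_%d: %s' % (name, self._layer_count,
                                        self._ctx.method_str)
            self._layer_count += 1
            return layer

        return wrapper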

As for the layer/debug naming convention:

I'm also curious whether it's possible to get the method's module scope, like "model.backbone.conv". I think this may be possible by registering forward hooks on each module (via model.named_modules()) to track entry into and exit from each scope. Or maybe there is a simpler way.
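
A sketch of that hook-based scope tracking, using the standard register_forward_pre_hook / register_forward_hook APIs; the global scope_stack bookkeeping is illustrative:

import functools

scope_stack = []  # e.g. ['model', 'model.head'] while Head.forward runs


def _enter(name, module, inputs):
    scope_stack.append(name)


def _exit(module, inputs, output):
    # Must return None: a forward hook returning a value would replace
    # the module's output, so don't return the popped element.
    scope_stack.pop()


def attach_scope_hooks(model, root_name='model'):
    # named_modules() yields ('', model), ('head', head), ('head.conv', ...)
    for name, module in model.named_modules():
        full = root_name if name == '' else root_name + '.' + name
        module.register_forward_pre_hook(functools.partial(_enter, full))
        module.register_forward_hook(_exit)


def current_scope():
    return scope_stack[-1] if scope_stack else ''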

If so, then maybe something like the following is possible for debugging:

<module scope>[<method count in scope>]: <method name>(<method args>)[<trt layer count in method>]: <trt layer>(<trt args>)

For example, the following model, which calls normalize (which expands into many TRT layers):

import torch.nn as nn
import torch.nn.functional as F

class Head(nn.Module):
    def forward(self, x):
        x = F.conv2d(x, ...)
        return F.normalize(x, p=1)

class Model(nn.Module):
    ...
    def forward(self, x):
        ...
        return self.head(x)

Would log

...
model.head[1]: torch.nn.functional.normalize(Tensor[1,2,3], p=1)[0]: add_unary(Tensor[1,2,3], op=tensorrt.UnaryOperation.ABS)
model.head[1]: torch.nn.functional.normalize(Tensor[1,2,3], p=1)[1]: add_elementwise(Tensor[1,2,3], op=tensorrt.ElementWiseOperation.POW)
...

This reads kind of long, but I think it uniquely identifies the layer in its context.

For naming, perhaps the TensorRT name / args could be omitted:

<module scope>[<method_count>]: <method name>(<method args>)
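
For instance, a trivial formatter for that shorter template (the function and argument names here are hypothetical):

def format_layer_name(scope, method_count, method_name, method_args):
    # e.g. 'model.head[1]: torch.nn.functional.normalize(Tensor[1,2,3], p=1)'
    return '%s[%d]: %s(%s)' % (scope, method_count, method_name, method_args)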

Let me know if this makes sense to you. Getting the scope seems a bit more challenging, so if it's not that important for your use cases, I think it could be omitted and added later.

jaybdub pushed a commit that referenced this pull request Jun 28, 2021
* Assert that inputs are contiguous

* Turn non-contiguous tensors into contiguous

* Add unit test

* Fix tabs