Closed
Description
Hi Luke,
Thank you for the awesome work. I tried running EfficientNet-B0 on my GTX 1070 (8 GB of VRAM) with an input batch of shape [44x1x256x256] (single-channel images), and I am running into 'CUDA out of memory' with the model in training mode.
I tried another implementation and did not hit this issue. After digging into the code, it seems that the MBConv implementation (or the way the MBConv blocks are repeated) is too memory hungry.
I really like your implementation of EfficientNet, and if I had more time I would definitely take a deeper dive into your code. In the meantime, if possible, could you help me look into this issue? Fixing it might also speed up training in the future. Thank you!
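For a rough sense of scale, here is a minimal sketch that estimates how much memory a single float32 tensor of a given shape takes. The helper `tensor_bytes` is hypothetical (it is not part of this repository), and the stem output shape is an assumption based on the standard EfficientNet-B0 layout (a stride-2 stem convolution with 32 output channels), not something I measured.

```python
from math import prod


def tensor_bytes(shape, bytes_per_element=4):
    """Bytes needed to store one dense tensor of this shape (float32 by default)."""
    return prod(shape) * bytes_per_element


# Input batch from this report: 44 single-channel 256x256 images.
input_shape = (44, 1, 256, 256)
input_mib = tensor_bytes(input_shape) / 2**20

# ASSUMPTION: the stem halves the resolution and outputs 32 channels,
# as in the standard EfficientNet-B0 architecture.
stem_shape = (44, 32, 128, 128)
stem_mib = tensor_bytes(stem_shape) / 2**20

print(f"input batch:      {input_mib:.1f} MiB")
print(f"stem activation:  {stem_mib:.1f} MiB")
```

The input batch itself is only about 11 MiB, so the memory pressure most likely comes from the intermediate activations that are kept for the backward pass in training mode, not from the input.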