Memory-efficient Diffusion Transformers with Quanto and Diffusers #9011

sayakpaul · 2024-07-30T04:05:25Z

As diffusion transformers keep growing in size, good quantization tools for them are becoming increasingly important.

We present our findings from a series of experiments on quantizing different diffusion pipelines based on diffusion transformers.

We demonstrate excellent memory savings at the cost of slightly higher inference latency, which is expected to improve in the coming days.

Diffusers 🤝 Quanto ❤️