
Draft Attention

This repository provides an overview of all resources for the paper "DraftAttention: Fast Video Diffusion via Low-Resolution Attention Guidance".

Draft Attention is a plug-and-play acceleration method for video diffusion transformers.

Draft Attention reshapes the long queries and keys into frame-wise feature maps and applies 2D average pooling to downsample them.

Draft Attention uses the resulting low-resolution attention map as the reference that guides sparse attention at full length.

Draft Attention introduces minimal overhead by compressing the number of tokens by 128x or more.
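
Below is a minimal sketch of this pooling step, assuming PyTorch tensors of shape [batch, heads, tokens, head_dim] and illustrative function names; it is not the repository's implementation.

import torch
import torch.nn.functional as F

def pool_tokens(x, num_frames, latent_h, latent_w, pool_h, pool_w):
    # x: [batch, heads, num_frames * latent_h * latent_w, head_dim]
    b, h, n, d = x.shape
    assert n == num_frames * latent_h * latent_w
    # Reshape the flat token sequence into frame-wise 2D feature maps.
    x = x.reshape(b * h * num_frames, latent_h, latent_w, d).permute(0, 3, 1, 2)
    # Downsample each frame with 2D average pooling (e.g. an 8x16 patch -> one draft token).
    x = F.avg_pool2d(x, kernel_size=(pool_h, pool_w))
    # Flatten back into a much shorter token sequence.
    return x.permute(0, 2, 3, 1).reshape(b, h, -1, d)

def draft_attention_map(q, k, num_frames, latent_h, latent_w, pool_h, pool_w):
    # Low-resolution attention scores that guide the full-length sparse attention.
    q_low = pool_tokens(q, num_frames, latent_h, latent_w, pool_h, pool_w)
    k_low = pool_tokens(k, num_frames, latent_h, latent_w, pool_h, pool_w)
    scale = q_low.shape[-1] ** 0.5
    return torch.softmax(q_low @ k_low.transpose(-2, -1) / scale, dim=-1)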

🔥 News

  • [2025/05] We support HunyuanCustom with classifier-free guidance.

🎥 Demo

Hunyuan


Each demo compares Dense Attention, Sparse Video Generation (SVG), and Draft Attention (Ours) side by side.

  • Prompt: "The banks of the Thames, as the camera moves vertically from low to high."
  • Prompt: "On the green grass, the white-walled Leaning Tower of Pisa stands tall. The camera moves vertically from top to bottom during filming."
  • Prompt: "A blue long dress fell from the balcony clothes rack and dropped into the water on the ground."

Prompts are all from the Penguin Video Benchmark.

Videos are generated with 90% sparsity, seed 42, using the HunyuanVideo model at 768p on an A100 GPU.

HunyuanCustom


The demo shows the input image alongside videos generated with Dense Attention and Draft Attention (Ours).

Prompt: "Realistic, High-quality. A woman is drinking coffee at a café."

Videos are generated with seed 42 at 768p resolution on 8x A100 GPUs, with either dense attention or 90% sparse attention.

🚀 Quick Start

Model Preparation

Please follow the environment setup instructions and download the checkpoints from HunyuanVideo, Wan2.1, and HunyuanCustom.

Sparse Attention

We mainly adopt block sparse attention as the backend for draft attention.
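
As a rough illustration of how the pooled draft scores could drive block selection, the sketch below keeps only the highest-scoring key blocks for each query block at the target sparsity; shapes and names are assumptions, not the repository's kernel.

import torch

def select_key_blocks(draft_scores, sparsity_ratio):
    # draft_scores: [batch, heads, query_blocks, key_blocks] pooled attention scores.
    num_key_blocks = draft_scores.shape[-1]
    keep = max(1, int(round(num_key_blocks * (1.0 - sparsity_ratio))))
    # Mark the top-scoring key blocks for each query block; a block sparse
    # attention kernel then skips everything left unmarked.
    idx = draft_scores.topk(keep, dim=-1).indices
    mask = torch.zeros_like(draft_scores, dtype=torch.bool)
    return mask.scatter_(-1, idx, True)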

Video Generation

Simply run video generation with the scripts in hunyuan/, wan/, or hunyuan_custom/.

Evaluation results in the paper are mainly obtained with VBench on the Penguin Video Benchmark using HunyuanVideo and Wan2.1.

Use in Your Own Project

You can use draft attention much like flash attention through the Draft_Attention class defined in draft_attention.py or draft_attention_classifier_free_guidance.py.

Here is an example for the Hunyuan model:

from draft_attention import Draft_Attention

draft_attention = Draft_Attention(
    pool_h=8,            # height of the 2D average pooling window
    pool_w=16,           # width of the 2D average pooling window
    latent_h=48,         # height of each latent frame (in tokens)
    latent_w=80,         # width of each latent frame (in tokens)
    visual_len=126_720,  # total number of visual (video) tokens
    text_len=256,        # number of text tokens in the sequence
    sparsity_ratio=0.9,  # target attention sparsity (90%, as in the demos)
)

# The call mirrors the flash attention (varlen) interface.
x = draft_attention(
    q,
    k,
    v,
    attn_mask=attn_mask,
    causal=causal,
    drop_rate=drop_rate,
    cu_seqlens_q=cu_seqlens_q,
    cu_seqlens_kv=cu_seqlens_kv,
    max_seqlen_q=max_seqlen_q,
    max_seqlen_kv=max_seqlen_kv,
    batch_size=batch_size,
)
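
With these settings, each 8x16 patch of the 48x80 latent frame is presumably pooled into a single draft token, which matches the 128x token compression mentioned above, and sparsity_ratio=0.9 corresponds to the 90% sparse attention used in the demos. The remaining arguments follow the flash attention (varlen) calling convention, so the call can replace an existing flash attention call with minimal changes.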

✏️ TODO

  • Support any-resolution video generation with padding.
  • Support reordering of tokens for further block sparse grouping and faster hardware execution.

📑 Acknowledgement

This work is mainly contributed by Xuan and Chenxia.

🔗 BibTeX

If you find Draft Attention interesting, please cite it with the following BibTeX entry:

@article{shen2025draft,
  title={DraftAttention: Fast Video Diffusion via Low-Resolution Attention Guidance},
  author={Shen, Xuan and Han, Chenxia and Zhou, Yufa and Xie, Yanyue and Gong, Yifan and Wang, Quanyi and Wang, Yiwei and Wang, Yanzhi and Zhao, Pu and Gu, Jiuxiang},
  journal={arXiv preprint arXiv:2505.14708},
  year={2025}
}
