[fix] Add 1 and draft_token_num to seq_len when overlap scheduling is enabled during memory estimation #5343

Merged
HuiGao-NV merged 6 commits into NVIDIA:main from HuiGao-NV:extra_token_for_overlap
Jun 24, 2025

Commits

Commits on Jun 20, 2025

Commits on Jun 23, 2025

Commits on Jun 24, 2025