Pull requests: NVIDIA/TensorRT-LLM
Detokenize option in /v1/completions request
Labels: Community Engagement (help/insights needed from community), Community want to contribute (PRs initiated from Community)
#5382 opened Jun 20, 2025 by Wokzy
Fix: missing clientId when serialize and deserialize response (cherry-pick #5231)
#5378 opened Jun 20, 2025 by kaiyux
[TRTLLM-5831][feat] Add LoRA support for pytorch backend in trtllm-serve
#5376 opened Jun 19, 2025 by talorabr
Fix permission for local user issues in NGC docker container.
#5373 opened Jun 19, 2025 by MartinMarciniszyn
[TRTLLM-5838][fix] fix max batch size and max tokens in kv cache estimations for Nemotron-H
#5371 opened Jun 19, 2025 by tomeras91
feat: add LLmArgs option to force using dynamic quantization
#5346 opened Jun 19, 2025 by achartier
[fix] Add 1 and draft_token_num to seq_len when overlap scheduling is enabled during memory estimation
#5343 opened Jun 19, 2025 by HuiGao-NV
ProTip! Show only pull requests updated in the last three days with the search qualifier updated:>2025-06-17.
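For reference, a minimal search query that reproduces this filtered view of the list (a sketch using GitHub's issue/PR search syntax; the is:pr and is:open qualifiers are added here for illustration and are not part of the original hint):

```
is:pr is:open updated:>2025-06-17
```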