Labels

  • Something isn't working
  • Automated tests, build checks, GitHub Actions, system stability & efficiency.
  • Help/insights needed from the community
  • PRs initiated from the community
  • Specialized/modified CUDA kernels in TRTLLM for LLM ops, beyond standard TRT. Dev & perf.
  • Pull requests that update a dependency file
  • Deploying TRTLLM with separated, distributed components (params, kv-cache, compute). Arch & perf.
  • TRTLLM's textual/illustrative materials: API refs, guides, tutorials. Improvement & clarity.
  • This issue or pull request already exists
  • Improvements to, or complaints about, TRTLLM ease of use
  • New feature or request. This includes new model, dtype, or functionality support
  • General operational aspects of TRTLLM execution not in other categories.
  • Extra attention is needed
  • Setting up and building TRTLLM: compilation, pip install, dependencies, env config, CMake.
  • kv-cache management for efficient LLM inference
  • High-level LLM Python API & tools (e.g., trtllm-llmapi-launch) for TRTLLM inference/workflows.
  • Parameter-Efficient Fine-Tuning (PEFT) like LoRA/P-tuning in TRTLLM: adapter use & perf.
  • Lower-precision formats (INT8/INT4/FP8) for TRTLLM quantization (AWQ, GPTQ).
  • Memory utilization in TRTLLM: leak/OOM handling, footprint optimization, memory profiling.
  • Further info is required from the requester before devs can help
  • Request to add a new model
  • A known limitation, but not a bug.
  • trtllm-serve's OpenAI-compatible API: endpoint behavior, req/resp formats, feature parity.
  • TRTLLM model inference speed, throughput, efficiency. Latency, benchmarks, regressions, opts.