ModelCloud.ai
Pinned repositories
- GPTQModel Public
  Production-ready LLM model compression/quantization toolkit with hardware-accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang. A minimal quantization sketch follows after this list.
- lm-evaluation-harness Public, forked from EleutherAI/lm-evaluation-harness
  A framework for few-shot evaluation of language models.
- vllm Public, forked from vllm-project/vllm
  A high-throughput and memory-efficient inference and serving engine for LLMs.
- rockthem Public
- Tokenicer Public
  A (nicer) tokenizer you want to use for model inference and training, with all known preventable gotchas normalized or auto-fixed.
- sglang Public, forked from sgl-project/sglang
  SGLang is a fast serving framework for large language models and vision language models.
- Device-SMI Public
  Self-contained, zero-dependency Python library that gives you unified device properties for GPU, CPU, and NPU. No more calling separate tools such as nvidia-smi or /proc/cpuinfo and parsing the output yourself. A usage sketch follows after this list.
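GPTQModel's basic workflow is to load a model, run GPTQ over a small calibration set, then save the quantized checkpoint. Below is a minimal sketch of that flow, assuming the package's `GPTQModel.load` / `quantize` / `save` entry points; the model ID, calibration dataset, and 4-bit/group-size settings are illustrative choices, not requirements.

```python
# Minimal GPTQ quantization sketch; model ID, dataset, and settings are illustrative.
from datasets import load_dataset
from gptqmodel import GPTQModel, QuantizeConfig

model_id = "meta-llama/Llama-3.2-1B-Instruct"
quant_path = "Llama-3.2-1B-Instruct-gptq-4bit"

# Small calibration corpus of raw text; calibration quality drives quantization error.
calibration = load_dataset(
    "allenai/c4",
    data_files="en/c4-train.00001-of-01024.json.gz",
    split="train",
).select(range(1024))["text"]

# 4-bit weights with group size 128 is a common GPTQ configuration.
quant_config = QuantizeConfig(bits=4, group_size=128)

model = GPTQModel.load(model_id, quant_config)  # load fp16 weights with the quant config attached
model.quantize(calibration, batch_size=2)       # run GPTQ layer by layer over the calibration data
model.save(quant_path)                          # write quantized weights plus quantization metadata
```

The saved checkpoint can then be reloaded with `GPTQModel.load(quant_path)` or served through the HF, vLLM, or SGLang backends mentioned in the description above.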
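For Device-SMI, a short usage sketch. The `Device(...)` constructor keyed by a device string and the `model`/`memory_total` attributes shown here are assumptions drawn from the project description, not a verified API reference.

```python
# Sketch of reading unified device properties with Device-SMI.
# Device(...) and the attribute names below are assumptions, not a verified API.
from device_smi import Device

for target in ("cpu", "cuda:0"):
    try:
        dev = Device(target)
    except Exception:
        continue  # device not present on this machine
    # One object per device, instead of parsing nvidia-smi or /proc/cpuinfo output.
    print(target, dev.model, dev.memory_total)
```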