ModelCloud.ai
Repositories
- GPTQModel Public
LLM quantization (compression) toolkit with hardware acceleration support for Nvidia CUDA, AMD ROCm, Intel XPU, and Intel/AMD/Apple CPU via HF, vLLM, and SGLang.
- Device-SMI Public
Self-contained Python library with zero dependencies that gives you unified device properties for GPU, CPU, and NPU. No more calling separate tools such as nvidia-smi or reading /proc/cpuinfo and parsing the output yourself.
- Tokenicer Public
A (nicer) tokenizer you want to use for model inference and training, with all known preventable gotchas normalized or auto-fixed.
- lm-evaluation-harness Public (forked from EleutherAI/lm-evaluation-harness)
A framework for few-shot evaluation of language models.
- vllm Public (forked from vllm-project/vllm)
A high-throughput and memory-efficient inference and serving engine for LLMs.
- sglang Public (forked from sgl-project/sglang)
SGLang is a fast serving framework for large language models and vision language models.