Python + Inference - Model Deployment library in Python. Simplest model inference server ever.
Updated Feb 14, 2023 - Python
FaceTron is a high-performance face embedding server using ONNX Runtime, supporting dynamic multi-model loading, offline deployment, and scalable environments. It exposes an OpenAPI endpoint with MCP-compatible metadata and integrates with OpenTelemetry for observability.
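To make "dynamic multi-model loading" concrete, here is a minimal sketch of one way a server might load models on first request and cache them. It is not FaceTron's actual code: the `ModelRegistry` class, its methods, and the one-file-per-model `.onnx` directory layout are all hypothetical, chosen only for illustration. The loader is injected so the sketch runs without ONNX Runtime installed.

```python
from pathlib import Path
from typing import Any, Callable, Dict, List


class ModelRegistry:
    """Hypothetical registry: loads models by name on first use, then caches them."""

    def __init__(self, model_dir: str, loader: Callable[[str], Any]) -> None:
        # Assumption: each model lives at <model_dir>/<name>.onnx.
        self.model_dir = Path(model_dir)
        # `loader` turns a file path into a ready-to-use model object.
        self.loader = loader
        self._cache: Dict[str, Any] = {}

    def available(self) -> List[str]:
        """Return the names of all model files currently on disk, sorted."""
        return sorted(p.stem for p in self.model_dir.glob("*.onnx"))

    def get(self, name: str) -> Any:
        """Return the cached model, loading it from disk on first request."""
        if name not in self._cache:
            path = self.model_dir / f"{name}.onnx"
            if not path.is_file():
                raise KeyError(f"unknown model: {name}")
            self._cache[name] = self.loader(str(path))
        return self._cache[name]
```

With ONNX Runtime installed, passing `onnxruntime.InferenceSession` as the `loader` would be one plausible choice, since it accepts a model path. Because `available()` rescans the directory on every call, a model file dropped in while the server is running becomes loadable without a restart.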