This guide introduces the benefits of Docker Model Runner and demonstrates its use in both development and production environments. The included example shows how to summarize top Hacker News stories using a large language model (LLM) running in Docker.
Docker Model Runner offers several advantages for AI-powered application development:
- Local, Secure Development: Run models locally in a secure, compliant environment.
- OpenAI-Compatible Endpoints: Easily switch between local and cloud AI services without code changes.
- Enterprise Integration: Works with Docker Desktop, reducing the need for additional software approvals.
- Hardware Acceleration: Leverage Apple silicon or NVIDIA GPU acceleration without complex setup.
This repository provides a practical example that:
- Fetches top stories from Hacker News.
- Stores story metadata and content in PostgreSQL.
- Uses a quantized Gemma 3 LLM (via Docker Model Runner) to generate concise summaries.
- Avoids redundant work by checking if a story and its summary already exist in the database.
Workflow:
- Fetch Best Stories: Retrieve top story IDs from the Hacker News API.
- Get Story Metadata: Fetch details (title, URL, etc.) for each story.
- Check Database: Skip stories already summarized.
- Scrape Article Content: Download article HTML.
- Summarize with LLM: Use Docker Model Runner to generate a summary.
- Store Results: Save metadata, content, and summary in PostgreSQL.
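A minimal sketch of this loop, assuming the `requests` library and a DB-API connection (e.g. `psycopg`); the `stories` table, its column names, and the `summarize` callback are illustrative stand-ins for whatever the repository actually defines:

```python
import requests

HN_API = "https://hacker-news.firebaseio.com/v0"  # public Hacker News API

def best_story_ids(limit: int = 10) -> list[int]:
    # Step 1: retrieve top story IDs.
    return requests.get(f"{HN_API}/beststories.json", timeout=10).json()[:limit]

def story_metadata(story_id: int) -> dict:
    # Step 2: fetch title, URL, etc. for one story.
    return requests.get(f"{HN_API}/item/{story_id}.json", timeout=10).json()

def process(conn, summarize) -> None:
    """conn: a DB-API connection; summarize: str -> str (see client sketch below)."""
    for story_id in best_story_ids():
        meta = story_metadata(story_id)
        url = meta.get("url")
        if not url:  # e.g. Ask HN posts have no external URL
            continue
        with conn.cursor() as cur:
            # Step 3: skip stories that already have a summary.
            cur.execute("SELECT 1 FROM stories WHERE id = %s", (story_id,))
            if cur.fetchone():
                continue
            # Step 4: download the article HTML.
            html = requests.get(url, timeout=10).text
            # Step 5: generate a summary via Docker Model Runner.
            summary = summarize(html)
            # Step 6: persist metadata, content, and summary.
            cur.execute(
                "INSERT INTO stories (id, title, url, content, summary)"
                " VALUES (%s, %s, %s, %s, %s)",
                (story_id, meta.get("title"), url, html, summary),
            )
        conn.commit()
```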
Notes:
- Uses the official Hacker News API (served through Firebase).
- Handles errors for unreachable or non-HTML URLs.
- All data is persisted in PostgreSQL (managed by Docker Compose).
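As a hedged illustration of the second note, one way to treat unreachable or non-HTML URLs before scraping (the repository's actual error handling may differ):

```python
import requests

def fetch_article_html(url: str, timeout: int = 10) -> str | None:
    """Return page HTML, or None for unreachable or non-HTML URLs."""
    try:
        resp = requests.get(url, timeout=timeout)
        resp.raise_for_status()
    except requests.RequestException:
        return None  # DNS failure, timeout, or HTTP error
    if "text/html" not in resp.headers.get("Content-Type", ""):
        return None  # skip PDFs, images, and other non-HTML content
    return resp.text
```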
Prerequisites:
- Docker Desktop with Docker Model Runner enabled.
- VS Code with the Dev Containers extension.
Check that Docker Model Runner is running:
docker model status
You should see output indicating it is running.
Download the quantized Gemma 3 model:
docker model pull ai/gemma3:4B-Q4_K_M
This project uses a VS Code dev container for a consistent Python development setup with Docker and PostgreSQL. The dev container:
- Installs Python, Docker CLI, Git, and GitHub CLI.
- Uses Docker Compose to run the app and PostgreSQL.
- Sets `MODEL_HOST` to `http://model-runner.docker.internal` for model access.
Open the VS Code Command Palette (⇧⌘P on macOS) and select `Dev Containers: Rebuild and Reopen in Container`.
Inside the dev container terminal, run:
curl $MODEL_HOST/models
You should see a JSON list of available models.
Docker Model Runner provides endpoints compliant with the OpenAI API spec, so you can point `MODEL_HOST` at a production inference server if needed. For this demo, we use the local Docker Model Runner endpoint.
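For example, a minimal sketch using the official `openai` Python client; the `/engines/v1` path is the OpenAI-compatible route Docker Model Runner exposes to containers (verify against your Docker Desktop version), and the dummy API key is only there because the client requires one:

```python
import os
from openai import OpenAI  # pip install openai

# MODEL_HOST comes from the dev container (http://model-runner.docker.internal).
# Point it at a production inference server and this code is unchanged.
client = OpenAI(
    base_url=os.environ["MODEL_HOST"] + "/engines/v1",
    api_key=os.environ.get("OPENAI_API_KEY", "unused-locally"),
)

def summarize(text: str) -> str:
    """Ask the local Gemma 3 model for a concise summary."""
    response = client.chat.completions.create(
        model="ai/gemma3:4B-Q4_K_M",
        messages=[
            {"role": "system", "content": "Summarize the article in 2-3 sentences."},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content
```

This matches the `summarize` callback shape assumed in the workflow sketch earlier in this guide.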
Clone the repository and start the containers:
cd hackernews-summary
docker compose up --build --exit-code-from app
The app will process stories and exit when done.
When finished, remove the containers, images, and volumes, then delete the model:
docker compose down --rmi all --volumes
docker model rm ai/gemma3:4B-Q4_K_M
- All required tools (`docker`, `git`, `gh`, `python3`, etc.) are pre-installed in the dev container.
Summary:
This project provides a clear, hands-on example of using Docker Model Runner to run LLMs in a reproducible, containerized environment. Follow the steps above to fetch, summarize, and store Hacker News stories using modern, enterprise-ready tooling.