Docker Model Runner Demo

This guide introduces the benefits of Docker Model Runner and demonstrates its use in both development and production environments. The included example shows how to summarize top Hacker News stories using a large language model (LLM) running in Docker.

Why Docker Model Runner?

Docker Model Runner offers several advantages for AI-powered application development:

  • Local, Secure Development: Run models locally in a secure, compliant environment.
  • OpenAI-Compatible Endpoints: Easily switch between local and cloud AI services without code changes.
  • Enterprise Integration: Works with Docker Desktop, reducing the need for additional software approvals.
  • Hardware Acceleration: Leverage Apple silicon or NVIDIA GPU acceleration without complex setup.

What’s in This Demo?

This repository provides a practical example that:

  • Fetches top stories from Hacker News.
  • Stores story metadata and content in PostgreSQL.
  • Uses a quantized Gemma 3 LLM (via Docker Model Runner) to generate concise summaries.
  • Avoids redundant work by checking if a story and its summary already exist in the database.

How It Works

Workflow (see the code sketch after the notes below):

  1. Fetch Best Stories: Retrieve top story IDs from the Hacker News API.
  2. Get Story Metadata: Fetch details (title, URL, etc.) for each story.
  3. Check Database: Skip stories already summarized.
  4. Scrape Article Content: Download article HTML.
  5. Summarize with LLM: Use Docker Model Runner to generate a summary.
  6. Store Results: Save metadata, content, and summary in PostgreSQL.

Notes:

  • Uses the official Hacker News API, which is served from Firebase.
  • Handles errors for unreachable or non-HTML URLs.
  • All data is persisted in PostgreSQL (managed by Docker Compose).
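
The loop below is a minimal sketch of this workflow, not the repository's actual code: the in-memory seen dict stands in for the PostgreSQL table, and the /engines/v1 suffix on MODEL_HOST is an assumption about Docker Model Runner's OpenAI-compatible path.

import os
import requests
from openai import OpenAI

HN_API = "https://hacker-news.firebaseio.com/v0"

client = OpenAI(
    base_url=os.environ["MODEL_HOST"] + "/engines/v1",  # assumed OpenAI-compatible path
    api_key="unused",  # the local endpoint requires no API key
)

seen = {}  # stand-in for the PostgreSQL table the real app uses

def summarize(text):
    resp = client.chat.completions.create(
        model="ai/gemma3:4B-Q4_K_M",
        messages=[{"role": "user",
                   "content": "Summarize this article in three sentences:\n\n" + text[:8000]}],
    )
    return resp.choices[0].message.content

for story_id in requests.get(f"{HN_API}/beststories.json", timeout=10).json()[:5]:
    if story_id in seen:                          # step 3: skip already-summarized stories
        continue
    story = requests.get(f"{HN_API}/item/{story_id}.json", timeout=10).json()
    url = story.get("url")
    if not url:                                   # e.g. Ask HN posts with no external link
        continue
    try:
        html = requests.get(url, timeout=10).text     # step 4: scrape the article
    except requests.RequestException:
        continue                                  # unreachable URL: skip and move on
    seen[story_id] = {"meta": story, "summary": summarize(html)}   # steps 5 and 6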

Getting Started

Prerequisites

  • Docker Desktop with Docker Model Runner enabled.
  • VS Code with the Dev Containers extension (used for the development environment below).

Verify Docker Model Runner

Check that Docker Model Runner is running:

docker model status

You should see output indicating that it is running. If it is not, enable Docker Model Runner in Docker Desktop's settings.

Pull the Gemma 3 Model

Download the quantized Gemma 3 model (the 4-billion-parameter variant with 4-bit Q4_K_M quantization):

docker model pull ai/gemma3:4B-Q4_K_M

Development Environment

This project uses a VS Code dev container for a consistent Python development setup with Docker and PostgreSQL. The dev container:

  • Installs Python, Docker CLI, Git, and GitHub CLI.
  • Uses Docker Compose to run the app and PostgreSQL.
  • Sets MODEL_HOST to http://model-runner.docker.internal for model access.

Build and Run Dev Container

Open the VS Code Command Palette (⇧⌘P on macOS) and select Dev Containers: Rebuild and Reopen in Container.

Test Docker Model Runner

Inside the dev container terminal, run:

curl $MODEL_HOST/models

You should see a JSON list of available models.
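
The same check works from Python, which is how the app itself reaches the endpoint. A minimal equivalent using the requests package:

import os
import requests

# Query Docker Model Runner for the models it is currently serving.
resp = requests.get(os.environ["MODEL_HOST"] + "/models", timeout=10)
print(resp.json())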

Production Environment

Docker Model Runner exposes OpenAI-compatible endpoints, so you can point MODEL_HOST at a production inference service without changing application code. For this demo, we use the local Docker Model Runner endpoint.
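
Because the API shape is the same, the switch can be pure configuration. A sketch of the idea (MODEL_BASE_URL and MODEL_API_KEY are hypothetical variable names, and the /engines/v1 suffix on MODEL_HOST is again an assumption):

import os
from openai import OpenAI

# Dev: only MODEL_HOST is set, pointing at the local Model Runner.
# Prod: set MODEL_BASE_URL (and a real key) to target a hosted
# OpenAI-compatible service; the application code stays the same.
client = OpenAI(
    base_url=os.environ.get("MODEL_BASE_URL",
                            os.environ["MODEL_HOST"] + "/engines/v1"),
    api_key=os.environ.get("MODEL_API_KEY", "unused"),
)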

Build and Run the Demo

Clone the repository, change into the hackernews-summary directory, and start the containers:

git clone https://github.com/codygreen/docker-model-runner-demo.git
cd docker-model-runner-demo/hackernews-summary
docker compose up --build --exit-code-from app

The app will process the stories and exit when done; --exit-code-from app makes docker compose return the app container's exit code, which is useful in CI.

Clean Up

Remove the containers, images, and volumes, then delete the downloaded model:

docker compose down --rmi all --volumes
docker model rm ai/gemma3:4B-Q4_K_M

Additional Tips

  • All required tools (docker, git, gh, python3, etc.) are pre-installed in the dev container.

Summary

This project provides a clear, hands-on example of using Docker Model Runner to run LLMs in a reproducible, containerized environment. Follow the steps above to fetch, summarize, and store Hacker News stories using modern, enterprise-ready tooling.
