A flexible, extensible AI agent backend built with NestJS—designed for running local, open-source LLMs (Llama, Gemma, Qwen, DeepSeek, etc.) via Docker Model Runner. Real-time streaming, Redis messaging, web search, and Postgres memory out of the box. No cloud APIs required!
- Clone the repository:

  ```bash
  git clone <your-repo-url>
  cd <your-repo-folder>
  ```

- Copy and edit environment variables:

  ```bash
  cp .env.example .env
  # Edit .env and fill in your model and service config
  ```

- Start the required services (Redis, PostgreSQL, local LLM) with Docker Compose:

  ```bash
  docker compose up -d
  ```

  Default endpoints:

  - PostgreSQL: `localhost:5433`
  - Redis: `localhost:6379`
  - Local LLM runner: `localhost:12434` (see the Docker Model Runner guide)

- Install dependencies:

  ```bash
  pnpm install
  ```

- Start the development server:

  ```bash
  pnpm run start:dev
  ```
See `.env.example` for all options. Key variables:

- `MODEL_BASE_URL`: e.g. `http://localhost:12434/engines/llama.cpp/v1`
- `MODEL_NAME`: e.g. `ai/gemma3:latest`, `llama-3`, `qwen`, `deepseek`
- `TAVILY_API_KEY`: for web search (get a key from Tavily)
- `REDIS_HOST`, `REDIS_PORT`, etc.: for Redis messaging
- `POSTGRES_*`: for Postgres memory
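
A minimal `.env` might look like the sketch below; the values are illustrative, and the exact `POSTGRES_*` and Redis variable names should be taken from `.env.example`:

```env
MODEL_BASE_URL=http://localhost:12434/engines/llama.cpp/v1
MODEL_NAME=ai/gemma3:latest
TAVILY_API_KEY=tvly-xxxxxxxx
REDIS_HOST=localhost
REDIS_PORT=6379
POSTGRES_HOST=localhost
POSTGRES_PORT=5433
POSTGRES_USER=postgres
POSTGRES_PASSWORD=postgres
POSTGRES_DB=agent
```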
- 🤖 Local, open-source LLMs (Llama, Gemma, Qwen, DeepSeek, etc.)
- 🌊 Real-time streaming responses
- 💾 Conversation history with Postgres memory
- 🌐 Web search integration (Tavily)
- 🧵 Custom ThreadService for conversations
- 📡 Redis pub/sub for real-time messaging
- 🎯 Clean, maintainable architecture
- This project is designed for local LLMs only, using Docker Model Runner.
- Supported models: Llama, Gemma, Qwen, DeepSeek, and other open-source models.
- Set `MODEL_BASE_URL` and `MODEL_NAME` in your `.env`.
- Start the `ai_runner` service with Docker Compose.
- For other providers, see Agent Initializr.
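
Docker Model Runner exposes an OpenAI-compatible API, so any OpenAI-style client can reach the local model. As a hedged sketch (not necessarily what `AgentFactory` does internally), LangChain's `ChatOpenAI` can be pointed at the runner like this:

```ts
import { ChatOpenAI } from '@langchain/openai';

// Illustrative wiring only: reuses MODEL_NAME and MODEL_BASE_URL from .env.
const model = new ChatOpenAI({
  model: process.env.MODEL_NAME, // e.g. 'ai/gemma3:latest'
  apiKey: 'not-needed-for-local', // the local runner ignores the API key
  configuration: {
    baseURL: process.env.MODEL_BASE_URL, // e.g. 'http://localhost:12434/engines/llama.cpp/v1'
  },
});
```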
- Set `TAVILY_API_KEY` in `.env`.
- Example usage in code:

```ts
AgentFactory.createAgent(
  ModelProvider.LOCAL,
  [new TavilySearch({ maxResults: 5, topic: 'general' })],
  postgresCheckpointer,
);
```
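
Once created, the agent is invoked per conversation thread. A minimal sketch, assuming a LangGraph-style agent is returned (the exact return type of `AgentFactory.createAgent` is not shown here):

```ts
// Hypothetical usage: 'agent' is the value returned by AgentFactory.createAgent.
// The thread_id ties the call to a conversation stored by the Postgres checkpointer.
const result = await agent.invoke(
  { messages: [{ role: 'user', content: 'Search the web for NestJS news' }] },
  { configurable: { thread_id: 'demo-thread' } },
);
```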
```
src/
├── agent/       # AI agent implementation
├── api/         # HTTP endpoints and DTOs
└── messaging/   # Redis messaging service
```
- `POST /api/agent/chat`: send a message to the agent
- `GET /api/agent/stream`: stream agent responses (SSE)
- `GET /api/agent/history/:threadId`: get conversation history
- `GET /api/agent/threads`: list all threads
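
For a quick smoke test, the endpoints can be called from any HTTP client. A hedged TypeScript sketch, assuming the server listens on NestJS's default port 3000 and the chat payload carries `threadId` and `message` fields (verify both against the DTOs in `src/api/`):

```ts
// Assumed request shape; check the DTOs in src/api/ for the real one.
const res = await fetch('http://localhost:3000/api/agent/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ threadId: 'demo-thread', message: 'Hello, agent!' }),
});
console.log(res.status);

// Stream the reply over SSE (browser EventSource shown; Node needs a polyfill).
const events = new EventSource('http://localhost:3000/api/agent/stream?threadId=demo-thread');
events.onmessage = (e) => console.log(e.data);
```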
For a ready-to-use frontend, use agentailor-chat-ui, which is fully compatible with this backend.
This project uses Postgres for memory. You must initialize the checkpointer before chatting:
```ts
// In AgentService
private checkpointerInitialized = false;

async stream(message: SseMessageDto): Promise<Observable<SseMessage>> {
  const channel = `agent-stream:${message.threadId}`;

  // Initialize the Postgres checkpointer only once, before the first chat.
  if (!this.checkpointerInitialized) {
    await this.agent.initCheckpointer();
    this.checkpointerInitialized = true;
  }

  // ...rest of code
}
```
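
If the checkpointer comes from LangGraph's Postgres integration (an assumption; the repo may wire it differently), `initCheckpointer()` would typically wrap the one-time table setup:

```ts
import { PostgresSaver } from '@langchain/langgraph-checkpoint-postgres';

// Hypothetical sketch of what initCheckpointer() might do; POSTGRES_URL is illustrative.
const checkpointer = PostgresSaver.fromConnString(process.env.POSTGRES_URL!);
await checkpointer.setup(); // creates the checkpoint tables; run once at startup
```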
- This project is opinionated for local, open-source LLMs only.
For more details and project resources, visit Agent Initializr.