# Multi-Intent AI Chatbot Assistant

The Multi-Intent AI Chatbot Assistant helps service and analytics teams answer both product-related and account-specific questions quickly, accurately, and securely.
The project evolves through three practical stages:

- **Phase 1 - Pre-LLM (Deterministic Pilot):** an offline, rule-based chatbot that uses FAISS for document search and keyword-to-SQL mapping.
- **Phase 2 - Full LLM (Production):** a retrieval-augmented generation (RAG) platform with microservices, continuous feedback, and observability.
- **Phase 3 - Scaling and Orchestration (Kubernetes):** expands Phase 2 into a self-healing, auto-scaling, cloud-native platform.
## Phase 1 - Pre-LLM (Deterministic Pilot)

### Goal
Prove the concept with an explainable system that runs entirely offline.
### Core Stack
- FastAPI backend
- FAISS vector search with SentenceTransformers embeddings (see the retrieval sketch after this list)
- Keyword-based SQL generation with validation guardrails
- SQLite mock contract database
- Docker for deployment and CI/CD
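
For concreteness, the retrieval layer can be sketched in a dozen lines. This is a minimal illustration rather than the project's actual code; the toy corpus, the `all-MiniLM-L6-v2` model choice, and `k` are assumptions.

```python
# pip install faiss-cpu sentence-transformers numpy
import numpy as np
import faiss
from sentence_transformers import SentenceTransformer

# Toy corpus; the real pilot indexes data/user_guide_sample.txt.
docs = [
    "To reset a password, open Settings and choose Security.",
    "Contracts renew automatically unless cancelled 30 days in advance.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
vecs = model.encode(docs, normalize_embeddings=True)  # unit-length vectors

index = faiss.IndexFlatIP(vecs.shape[1])  # inner-product index
index.add(np.asarray(vecs, dtype="float32"))

query = model.encode(["how do I change my password?"], normalize_embeddings=True)
scores, ids = index.search(np.asarray(query, dtype="float32"), k=1)
print(docs[ids[0][0]], float(scores[0][0]))
```

Inner product on unit-normalized vectors is equivalent to cosine similarity, which is why the sketch normalizes at encode time instead of using a plain L2 index.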
### What It Does
- Classifies intent (knowledge, contract, or unknown).
- Retrieves answers from local docs or SQL queries.
- Applies guardrails for SQL safety, PII protection, and prompt injection defense (sketched below).
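
A deterministic version of the router and guardrails fits in a few small functions, mirroring `intent_classifier.py`, `sql_validator.py`, and `pii_filter.py` in the repository layout below. The keyword sets and regexes here are placeholders, not the project's actual rules:

```python
import re

# Placeholder keyword sets; the real lists live in phase1_pilot/app/intent_classifier.py.
CONTRACT_KEYWORDS = {"contract", "renewal", "expiry", "account", "invoice"}
KNOWLEDGE_KEYWORDS = {"how", "what", "guide", "setup", "feature"}

def classify_intent(question: str) -> str:
    """Rule-based routing: contract, knowledge, or unknown."""
    tokens = set(re.findall(r"[a-z]+", question.lower()))
    if tokens & CONTRACT_KEYWORDS:
        return "contract"
    if tokens & KNOWLEDGE_KEYWORDS:
        return "knowledge"
    return "unknown"

# SQL guardrail: allow a single read-only SELECT, nothing else.
FORBIDDEN_SQL = re.compile(r";|--|\b(drop|delete|update|insert|alter|attach)\b", re.IGNORECASE)

def is_safe_sql(sql: str) -> bool:
    return sql.lstrip().lower().startswith("select") and not FORBIDDEN_SQL.search(sql)

# Naive PII guardrail: redact email addresses before logging or display.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact_pii(text: str) -> str:
    return EMAIL.sub("[REDACTED_EMAIL]", text)

assert classify_intent("When does my contract expire?") == "contract"
assert is_safe_sql("SELECT end_date FROM contracts WHERE id = 42")
assert not is_safe_sql("DROP TABLE contracts")
```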
### Key Metrics
| Objective | Metric | Target | Owner |
|---|---|---|---|
| Accuracy | Intent Classification Accuracy | ≥ 80% | Data Science |
| Speed | Response Latency | < 3 s | Engineering |
| Security | SQL Validation Pass Rate | 100% | Security |
| Experience | Positive Feedback Rate | ≥ 70% | CX Team |
### Outcome
A reliable, low-cost prototype that proves feasibility and governance readiness before introducing LLMs.
## Phase 2 - Full LLM (Production)

### Goal
Scale the pilot into a production-grade platform that combines LLMs with retrieval and structured data.
### Core Stack
- FastAPI microservices on Docker or managed containers (ECS)
- GPT-4 Turbo integrated with FAISS (RAG pattern)
- Natural-language-to-SQL via LLM
- RLHF feedback and retraining loop
- Prometheus, Grafana, and OpenTelemetry for monitoring (see the instrumentation sketch after this list)
- Helm for deployment templating (preparing for Kubernetes)
- Role-based access control and guardrails
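
As a taste of the monitoring stack, a FastAPI service can expose Prometheus metrics with the `prometheus_client` library. A minimal sketch, with hypothetical metric names and a stubbed routing decision:

```python
# pip install fastapi prometheus-client uvicorn
from fastapi import FastAPI
from prometheus_client import Counter, Histogram, make_asgi_app

app = FastAPI()
app.mount("/metrics", make_asgi_app())  # Prometheus scrapes this endpoint

# Hypothetical metric names; the real dashboards would define their own.
REQUESTS = Counter("chat_requests_total", "Chat requests by routed intent", ["intent"])
LATENCY = Histogram("chat_latency_seconds", "End-to-end chat latency in seconds")

@app.post("/chat")
async def chat(payload: dict) -> dict:
    with LATENCY.time():  # records request duration into the histogram
        intent = "knowledge"  # stub: the Router Service decides this in production
        REQUESTS.labels(intent=intent).inc()
        return {"intent": intent, "answer": "..."}
```

Grafana dashboards can then chart request rates and latency percentiles directly from the scraped series.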
### Helm Clarification
Helm is introduced in Phase 2 as a templating and deployment abstraction to prepare for Kubernetes.
It is fully adopted in Phase 3 as part of the orchestration stack.
### What It Adds
- LLM-assisted intent classification in the Router Service
- Contextual answers through RAG in Knowledge Service (see the sketch after this list)
- LLM-generated SQL queries in Contract Service
- Continuous learning via feedback loops
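
To make the Knowledge Service path concrete, here is a minimal RAG sketch using the OpenAI Python SDK on top of the FAISS retrieval layer (`model`, `index`, and `docs` are the objects from the Phase 1 sketch). The prompt wording and `k` are illustrative assumptions:

```python
# pip install openai numpy  (pairs with the FAISS index from the Phase 1 sketch)
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def answer_with_rag(question: str, model, index, docs: list[str], k: int = 3) -> str:
    """Retrieve top-k chunks from FAISS, then generate a grounded answer."""
    q_vec = model.encode([question], normalize_embeddings=True)
    _, ids = index.search(np.asarray(q_vec, dtype="float32"), k)
    context = "\n\n".join(docs[i] for i in ids[0] if i != -1)

    resp = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {"role": "system",
             "content": "Answer only from the provided context. If it is insufficient, say so."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return resp.choices[0].message.content
```

The Contract Service would follow the same shape, except the LLM emits a read-only SELECT that must still pass the Phase 1 `is_safe_sql` guardrail before touching the database.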
### Key Metrics
| Objective | Metric | Target | Owner |
|---|---|---|---|
| Reliability | Uptime | ≥ 99.9% | DevOps |
| Performance | Latency (P95 including LLM) | < 2 s | Engineering |
| Governance | Drift Detection | Automated | Data Ops |
| Cost Efficiency | Average Cost per Query | < $0.05 | Finance |
| Learning Cycle | Model Update Cadence | Weekly Retraining | Data Science |
### Outcome
An enterprise-ready AI assistant that combines structured data, documentation, and natural conversation with transparency and traceability.
## Phase 3 - Scaling and Orchestration (Kubernetes)

### Goal
Turn Phase 2 into a cloud-native, self-healing platform that scales automatically with demand.
### Core Stack Enhancements
- Kubernetes (GKE, EKS, AKS) for orchestration
- Helm for automated deployments
- Horizontal Pod Autoscaler (HPA) for load scaling (see the sketch after this list)
- Ingress and Load Balancer for global routing
- GitOps (Argo CD or Flux) for continuous rollout
- Unified observability with Prometheus and Grafana
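
For illustration, the HPA object can be created with the official Kubernetes Python client; the deployment name, namespace, replica bounds, and CPU threshold below are assumptions:

```python
# pip install kubernetes
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside the cluster

hpa = client.V1HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="router-service-hpa"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="router-service"
        ),
        min_replicas=2,
        max_replicas=10,
        target_cpu_utilization_percentage=80,  # scale out above 80% average CPU
    ),
)

client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="default", body=hpa
)
```

In practice Phase 3 would ship this as a Helm-templated manifest rather than imperative client calls; the snippet only shows what the autoscaler object contains.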
### What It Delivers
- Multi-node Kubernetes cluster with containerized services
- Rolling updates and zero-downtime deployments
- Centralized logs, metrics, and health monitoring
- Elastic scaling for varying workloads
### Key Metrics
| Objective | Metric | Target | Owner |
|---|---|---|---|
| Scalability | Scale-Up Reaction Time Under Load | < 1 min | DevOps |
| Reliability | SLA Uptime | ≥ 99.95% | DevOps |
| Efficiency | Node Utilization | ≥ 80% | Finance |
| Deployment | Rollout Downtime | 0% (zero-downtime rollouts) | Platform Team |
### Outcome
A global, cloud-native chatbot platform that scales intelligently and recovers automatically, ready for enterprise traffic and future model integrations.
## Phase 1 Architecture

```mermaid
flowchart TD
A[User Interface] --> B[Intent Classifier]
B --> C{Router}
C -->|Knowledge Query| D[Knowledge Agent - FAISS Vector DB]
C -->|Contract Query| E[Contract Agent - Keyword to SQL Mapping]
D --> F[Response Composer]
E --> F
F --> G[Chat Response]
subgraph Guardrails
H[PII Filter]
I[Prompt Injection Detector]
J[SQL Validator]
end
F --> H
C --> I
E --> J
```
## Phase 2 Architecture

```mermaid
flowchart TD
A[User or Agent UI] --> B[API Gateway]
B --> C[Router Service - LLM Assisted Intent Classification]
C -->|Knowledge Request| D[Knowledge Service - RAG with FAISS and LLM]
C -->|Contract Request| E[Contract Service - LLM for SQL Generation]
C -->|Feedback| F[Feedback Service - RLHF Loop]
D --> G[Response Composer]
E --> G
F --> H[Feedback Store]
G --> I[Analytics Dashboard]
subgraph Observability
K[Prometheus, Grafana, OpenTelemetry]
end
C --> K
D --> K
E --> K
F --> K
```
## Repository Structure

```text
multi-intent-ai-chatbot-assistant/
├── phase1_pilot/
│ ├── app/
│ │ ├── main.py
│ │ ├── router.py
│ │ ├── intent_classifier.py
│ │ ├── chains.py
│ │ ├── contract_agent.py
│ │ └── utils.py
│ ├── guardrails/
│ │ ├── pii_filter.py
│ │ ├── sql_validator.py
│ │ └── prompt_injection_guard.py
│ ├── data/
│ │ ├── user_guide_sample.txt
│ │ └── mock_contracts.sql
│ ├── evals/
│ │ └── eval_results_phase1.md
│ ├── Dockerfile
│ └── ci_cd.yaml
│
├── phase2_production/
│ ├── services/
│ │ ├── router_service.py
│ │ ├── knowledge_service.py
│ │ ├── contract_service.py
│ │ ├── feedback_service.py
│ │ └── utils.py
│ ├── helm/
│ │ ├── deployment.yaml
│ │ └── secrets.yaml
│ ├── observability/
│ │ ├── prometheus_config.yml
│ │ └── grafana_dashboard.json
│ ├── evals/
│ │ └── eval_results_phase2.md
│ ├── .env.example
│ ├── Dockerfile
│ └── ci_cd_pipeline.yaml
│
└── phase3_scaling/
├── helm/
│ ├── deployment.yaml
│ └── values.yaml
├── observability/
│ ├── prometheus_config.yml
│ ├── grafana_dashboard.json
│ └── alerts.yaml
├── gitops/
│ └── argo_cd_pipeline.yaml
├── docs/
│ └── phase3_scaling_overview.md
    └── README_phase3.md
```
Developed by James W. Niu
Questions: jameswnarch@gmail.com
MIT License