# Amazon Chat Completions Server

A unified, provider-agnostic chat completions API server that seamlessly integrates OpenAI and AWS Bedrock behind a single endpoint, with intelligent format detection and conversion.

## Features

- **Unified Endpoint**: A single `/v1/chat/completions` endpoint handles all providers and formats
- **Smart Format Detection**: Automatically detects OpenAI, Bedrock Claude, and Bedrock Titan formats
- **Seamless Conversion**: Converts between any format combination (OpenAI ↔ Bedrock Claude ↔ Bedrock Titan)
- **Model-Based Routing**: Automatic provider selection based on model ID patterns
- **Streaming Support**: Real-time streaming with format preservation
- **Enterprise Ready**: Authentication, logging, error handling, and monitoring
- **Full CLI**: Interactive chat, server management, and configuration tools
- **OpenAI Compatible**: Drop-in replacement for the OpenAI Chat Completions API
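
Because the server is a drop-in replacement, the official OpenAI Python SDK can talk to it unchanged; a minimal sketch, assuming the server runs locally on port 8000 and `your-api-key` matches the server's configured `API_KEY`:

```python
from openai import OpenAI

# Point the standard OpenAI client at the server instead of api.openai.com.
client = OpenAI(
    base_url="http://localhost:8000/v1",  # the server's unified API root
    api_key="your-api-key",               # the server's API_KEY, not an OpenAI key
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```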

## Table of Contents

- [Quick Start](#quick-start)
- [Unified Endpoint](#unified-endpoint)
- [Format Support](#format-support)
- [Installation](#installation)
- [Configuration](#configuration)
- [Usage Examples](#usage-examples)
- [CLI Reference](#cli-reference)
- [API Documentation](#api-documentation)
- [Architecture](#architecture)
- [Contributing](#contributing)

## Quick Start

```bash
# Clone the repository
git clone https://github.com/teabranch/amazon-chat-completions-server.git
cd amazon-chat-completions-server

# Install with uv (recommended)
uv pip install -e .

# Or with pip
pip install -e .
```

```bash
# Interactive configuration setup
amazon-chat config set

# Or manually create a .env file
cp .env.example .env
# Edit .env with your API keys
```

```bash
# Start the server
amazon-chat serve --host 0.0.0.0 --port 8000

# Server will be available at:
# API:  http://localhost:8000
# Docs: http://localhost:8000/docs
```

```bash
# OpenAI format in → OpenAI response out
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
```

```bash
# OpenAI format in → Bedrock Claude response out
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key" \
  -d '{
    "model": "anthropic.claude-3-haiku-20240307-v1:0",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
```

```bash
# Bedrock Claude format in → OpenAI response out
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key" \
  -d '{
    "anthropic_version": "bedrock-2023-05-31",
    "model": "anthropic.claude-3-haiku-20240307-v1:0",
    "max_tokens": 1000,
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
```

## Unified Endpoint

The `/v1/chat/completions` endpoint is the only endpoint you need. It:

- Auto-detects your input format (OpenAI, Bedrock Claude, Bedrock Titan)
- Routes to the appropriate provider based on model ID
- Converts between formats as needed
- Streams responses in real-time when requested
- Returns responses in your preferred format

```mermaid
graph LR
    A[Any Format Request] --> B[Format Detection]
    B --> C[Model-Based Routing]
    C --> D[Provider API Call]
    D --> E[Format Conversion]
    E --> F[Unified Response]

    style A fill:#e1f5fe
    style F fill:#e8f5e8
```

The server automatically routes requests based on model ID patterns:

| Model Pattern | Provider | Examples |
|---|---|---|
| `gpt-*` | OpenAI | `gpt-4o-mini`, `gpt-3.5-turbo` |
| `text-*` | OpenAI | `text-davinci-003` |
| `dall-e-*` | OpenAI | `dall-e-3` |
| `anthropic.*` | Bedrock | `anthropic.claude-3-haiku-20240307-v1:0` |
| `amazon.*` | Bedrock | `amazon.titan-text-express-v1` |
| `ai21.*`, `cohere.*`, `meta.*` | Bedrock | Various Bedrock models |
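
As an illustration of what this routing amounts to, the prefix matching can be reduced to a few lines; this is a sketch of the idea, not the server's actual implementation:

```python
OPENAI_PREFIXES = ("gpt-", "text-", "dall-e-")
BEDROCK_PREFIXES = ("anthropic.", "amazon.", "ai21.", "cohere.", "meta.")

def route_provider(model_id: str) -> str:
    """Map a model ID to a provider name using the patterns in the table above."""
    if model_id.startswith(OPENAI_PREFIXES):
        return "openai"
    if model_id.startswith(BEDROCK_PREFIXES):
        return "bedrock"
    raise ValueError(f"No provider matches model ID: {model_id}")

assert route_provider("gpt-4o-mini") == "openai"
assert route_provider("anthropic.claude-3-haiku-20240307-v1:0") == "bedrock"
```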

## Format Support

| Input Format | Output Format | Use Case | Streaming |
|---|---|---|---|
| OpenAI | OpenAI | Standard OpenAI usage | ✅ |
| OpenAI | Bedrock Claude | OpenAI clients → Claude responses | ✅ |
| OpenAI | Bedrock Titan | OpenAI clients → Titan responses | ✅ |
| Bedrock Claude | OpenAI | Claude clients → OpenAI responses | ✅ |
| Bedrock Claude | Bedrock Claude | Native Claude usage | ✅ |
| Bedrock Titan | OpenAI | Titan clients → OpenAI responses | ✅ |
| Bedrock Titan | Bedrock Titan | Native Titan usage | ✅ |

**OpenAI Format:**

```json
{
  "model": "gpt-4o-mini",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "max_tokens": 1000,
  "stream": false
}
```

**Bedrock Claude Format:**

```json
{
  "anthropic_version": "bedrock-2023-05-31",
  "model": "anthropic.claude-3-haiku-20240307-v1:0",
  "max_tokens": 1000,
  "messages": [{"role": "user", "content": "Hello!"}],
  "temperature": 0.7,
  "stream": false
}
```

**Bedrock Titan Format:**

```json
{
  "model": "amazon.titan-text-express-v1",
  "inputText": "User: Hello!\n\nBot:",
  "textGenerationConfig": {
    "maxTokenCount": 1000,
    "temperature": 0.7
  }
}
```
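
To make the conversions concrete, here is an illustrative sketch (not the server's actual adapter code) that flattens an OpenAI-style request into the Titan shape shown above:

```python
def openai_to_titan(openai_request: dict) -> dict:
    """Flatten an OpenAI-style messages list into a Titan inputText prompt."""
    role_labels = {"system": "System", "user": "User", "assistant": "Bot"}
    lines = [
        f"{role_labels[message['role']]}: {message['content']}"
        for message in openai_request["messages"]
    ]
    return {
        "inputText": "\n\n".join(lines) + "\n\nBot:",
        "textGenerationConfig": {
            "maxTokenCount": openai_request.get("max_tokens", 512),
            "temperature": openai_request.get("temperature", 0.7),
        },
    }
```

For the single-message example above, this produces exactly the `inputText` shown in the Titan request: `"User: Hello!\n\nBot:"`.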

## Installation

Prerequisites:

- Python 3.12+
- OpenAI API key (for OpenAI models)
- AWS credentials (for Bedrock models)

```bash
# Option 1: Development installation
git clone https://github.com/teabranch/amazon-chat-completions-server.git
cd amazon-chat-completions-server
uv pip install -e .

# Option 2: Direct from GitHub
pip install git+https://github.com/teabranch/amazon-chat-completions-server.git

# Option 3: Local wheel
pip install dist/amazon_chat_completions_server-*.whl
```

Verify the installation:

```bash
# Check the CLI is available
amazon-chat --help

# Check version and configuration
amazon-chat config show
```

## Configuration

Create a `.env` file with your configuration:

```bash
# Required: Server authentication
API_KEY=your-secret-api-key

# Required: OpenAI (if using OpenAI models)
OPENAI_API_KEY=sk-your-openai-api-key

# Required: AWS (if using Bedrock models)
# Option 1: Static credentials
AWS_ACCESS_KEY_ID=your-aws-access-key
AWS_SECRET_ACCESS_KEY=your-aws-secret-key
AWS_REGION=us-east-1

# Option 2: AWS profile (alternative to static credentials)
AWS_PROFILE=your-aws-profile
AWS_REGION=us-east-1

# Option 3: AWS role assumption (for cross-account access or enhanced security)
AWS_ROLE_ARN=arn:aws:iam::123456789012:role/MyBedrockRole
AWS_EXTERNAL_ID=your-external-id                       # Optional, for cross-account role assumption
AWS_ROLE_SESSION_NAME=amazon-chat-completions-session  # Optional
AWS_ROLE_SESSION_DURATION=3600                         # Optional, session duration in seconds
AWS_REGION=us-east-1

# Option 4: Web identity token (for OIDC/Kubernetes service accounts)
AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token
AWS_ROLE_ARN=arn:aws:iam::123456789012:role/MyWebIdentityRole
AWS_REGION=us-east-1

# Optional: Defaults
DEFAULT_OPENAI_MODEL=gpt-4o-mini
LOG_LEVEL=INFO
CHAT_SERVER_URL=http://localhost:8000
CHAT_API_KEY=your-secret-api-key
```

### AWS Authentication

The server supports multiple AWS authentication methods with automatic detection and fallback. Choose the method that best fits your deployment scenario.

#### Static Credentials

Direct AWS access keys, the simplest but least secure option:

```bash
AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
AWS_SESSION_TOKEN=your-session-token  # Optional, for temporary credentials
AWS_REGION=us-east-1
```

Use cases: Local development, testing, CI/CD pipelines.
Security: Long-lived credentials; the least secure option.

#### AWS Profile

Uses AWS CLI configured profiles:

```bash
AWS_PROFILE=your-aws-profile
AWS_REGION=us-east-1
```

Setup:

```bash
# Configure an AWS CLI profile
aws configure --profile your-aws-profile

# Or use AWS SSO
aws configure sso --profile your-aws-profile
```

Use cases: Local development with multiple AWS accounts.
Security: ✅ Credentials managed by the AWS CLI.

#### Role Assumption

Assume an IAM role using base credentials:

```bash
# Base credentials (required for role assumption)
AWS_PROFILE=base-profile
# OR
AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

# Role to assume
AWS_ROLE_ARN=arn:aws:iam::123456789012:role/BedrockAccessRole
AWS_EXTERNAL_ID=unique-external-id                     # Optional, for cross-account scenarios
AWS_ROLE_SESSION_NAME=amazon-chat-completions-session  # Optional
AWS_ROLE_SESSION_DURATION=3600                         # Optional, 900-43200 seconds (default: 3600)
AWS_REGION=us-east-1
```

Role trust policy example:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::SOURCE-ACCOUNT:user/username"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "unique-external-id"
        }
      }
    }
  ]
}
```

Use cases: Cross-account access, temporary elevated permissions, security compliance.
Security: ✅ Time-limited sessions, audit trail, least privilege.
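
For reference, role assumption with these variables corresponds roughly to the following boto3 calls, reusing the example values from above:

```python
import boto3

# Assume the Bedrock role using the base credentials, then build a
# Bedrock runtime client from the temporary session credentials.
sts = boto3.client("sts")
assumed = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/BedrockAccessRole",
    RoleSessionName="amazon-chat-completions-session",
    ExternalId="unique-external-id",  # only needed for cross-account setups
    DurationSeconds=3600,
)
credentials = assumed["Credentials"]
bedrock = boto3.client(
    "bedrock-runtime",
    region_name="us-east-1",
    aws_access_key_id=credentials["AccessKeyId"],
    aws_secret_access_key=credentials["SecretAccessKey"],
    aws_session_token=credentials["SessionToken"],
)
```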

#### Web Identity Token

For containerized environments with OIDC providers:

```bash
AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token
AWS_ROLE_ARN=arn:aws:iam::123456789012:role/EKSBedrockRole
AWS_ROLE_SESSION_NAME=amazon-chat-completions-session  # Optional
AWS_REGION=us-east-1
```

Kubernetes ServiceAccount example:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: amazon-chat-completions
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/EKSBedrockRole
```

Use cases: EKS, GitHub Actions, GitLab CI, other OIDC providers.
Security: ✅ No long-term credentials, automatic rotation.

#### Default Credential Chain

No configuration needed; uses the AWS default credential chain:

```bash
# Only the region is required
AWS_REGION=us-east-1
```

Credential chain order:

1. Environment variables (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`)
2. Shared credentials file (`~/.aws/credentials`)
3. Shared config file (`~/.aws/config`)
4. IAM instance profiles (EC2)
5. ECS task roles
6. EKS service account roles

Use cases: EC2 instances, ECS tasks, Lambda functions.
Security: ✅ No credential management, automatic rotation.
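
In code terms, the default chain is simply what boto3 does when it is handed no explicit credentials:

```python
import boto3

# No credentials are passed; boto3 resolves them from the environment,
# shared credential files, or instance/task/pod metadata, in that order.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
```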

### Authentication Priority

The server uses this priority order for authentication:

1. Static credentials (`AWS_ACCESS_KEY_ID` + `AWS_SECRET_ACCESS_KEY`)
2. AWS profile (`AWS_PROFILE`)
3. Role assumption (`AWS_ROLE_ARN` with base credentials)
4. Web identity token (`AWS_WEB_IDENTITY_TOKEN_FILE`)
5. Default credential chain (instance profiles, etc.)
β "Role assumption requires base AWS credentials"
# Problem: AWS_ROLE_ARN set but no base credentials
# Solution: Add base credentials
AWS_PROFILE=your-profile # OR
AWS_ACCESS_KEY_ID=your-key
AWS_SECRET_ACCESS_KEY=your-secret
β "The config profile (profile-name) could not be found"
# Problem: Profile doesn't exist
# Solution: Configure the profile
aws configure --profile profile-name
# Or check existing profiles
aws configure list-profiles
β "Access denied when assuming role"
# Problem: Role trust policy or permissions issue
# Solution: Check role trust policy allows your principal
# Verify role has bedrock:* permissions
β "Web identity token file not found"
# Problem: Token file path incorrect
# Solution: Verify the token file exists
ls -la /var/run/secrets/eks.amazonaws.com/serviceaccount/token
β "You must specify a region"
# Problem: AWS_REGION not set
# Solution: Set the region
AWS_REGION=us-east-1
# Test your AWS configuration
amazon-chat config test-aws
# Or manually test with AWS CLI
aws sts get-caller-identity --profile your-profile
aws bedrock list-foundation-models --region us-east-1

### Required IAM Permissions

Your AWS credentials need these minimum permissions for Bedrock:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream",
        "bedrock:ListFoundationModels"
      ],
      "Resource": "*"
    }
  ]
}
```

For role assumption, also add:

```json
{
  "Effect": "Allow",
  "Action": "sts:AssumeRole",
  "Resource": "arn:aws:iam::*:role/YourBedrockRole"
}
```

### Configuration Commands

```bash
# Run interactive setup
amazon-chat config set

# View current configuration (sensitive values masked)
amazon-chat config show

# Set specific values
amazon-chat config set --key OPENAI_API_KEY --value sk-your-key

# Test AWS authentication
amazon-chat config test-aws
```

For detailed AWS authentication documentation, see the AWS Authentication Guide.

## Usage Examples

```python
import httpx

# OpenAI format request
response = httpx.post(
    "http://localhost:8000/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer your-api-key",
    },
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
print(response.json())
```

```python
import httpx

# Send OpenAI format, get a Bedrock Claude response back
response = httpx.post(
    "http://localhost:8000/v1/chat/completions?target_format=bedrock_claude",
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer your-api-key",
    },
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)

# The response will be in Bedrock Claude format
claude_response = response.json()
print(claude_response["content"][0]["text"])
```

```python
import json

import httpx

# Streaming request
with httpx.stream(
    "POST",
    "http://localhost:8000/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer your-api-key",
    },
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "Tell me a story"}],
        "stream": True,
    },
) as response:
    for line in response.iter_lines():
        if line.startswith("data: "):
            data = line[6:]  # Remove the "data: " prefix
            if data != "[DONE]":
                chunk = json.loads(data)
                if chunk["choices"][0]["delta"].get("content"):
                    print(chunk["choices"][0]["delta"]["content"], end="")
```

```python
import httpx

# Tool calling with OpenAI format
response = httpx.post(
    "http://localhost:8000/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer your-api-key",
    },
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "What's the weather in London?"}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "description": "Get current weather",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "location": {"type": "string"}
                        },
                        "required": ["location"]
                    }
                }
            }
        ],
        "tool_choice": "auto",
    },
)

# Check for tool calls in the response
if response.json()["choices"][0]["message"].get("tool_calls"):
    print("The model wants to call a tool!")
```

## CLI Reference

```bash
# Interactive chat session
amazon-chat chat --model gpt-4o-mini

# Start the server
amazon-chat serve --host 0.0.0.0 --port 8000

# Configuration management
amazon-chat config set
amazon-chat config show

# List available models
amazon-chat models

# Get help
amazon-chat --help
amazon-chat COMMAND --help
```

```bash
# Start an interactive chat with streaming
amazon-chat chat --model gpt-4o-mini --stream

# Chat with custom settings
amazon-chat chat \
  --model anthropic.claude-3-haiku-20240307-v1:0 \
  --temperature 0.8 \
  --max-tokens 500

# Chat against a custom server
amazon-chat chat \
  --server-url https://my-server.com \
  --api-key my-key
```

```bash
# Development server with auto-reload
amazon-chat serve --reload --log-level debug

# Production server
amazon-chat serve \
  --host 0.0.0.0 \
  --port 8000 \
  --workers 4 \
  --env-file production.env
```

## API Documentation

When the server is running, visit:

- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
- OpenAPI Schema: http://localhost:8000/openapi.json

| Endpoint | Method | Description |
|---|---|---|
| `/v1/chat/completions` | POST | Unified chat completions endpoint |
| `/v1/models` | GET | List available models |
| `/health` | GET | General health check |
| `/v1/chat/completions/health` | GET | Unified endpoint health check |

All endpoints require authentication via an `Authorization` bearer token:

```bash
curl -H "Authorization: Bearer your-api-key" http://localhost:8000/v1/models
```
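
The same call from Python, listing the models the server exposes:

```python
import httpx

# List available models through the authenticated /v1/models endpoint.
response = httpx.get(
    "http://localhost:8000/v1/models",
    headers={"Authorization": "Bearer your-api-key"},
)
print(response.json())
```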

## Architecture

```mermaid
graph TD
    A[Client Request] --> B["/v1/chat/completions"]
    B --> C{Format Detection}
    C --> D1[OpenAI Format]
    C --> D2[Bedrock Claude Format]
    C --> D3[Bedrock Titan Format]
    D1 --> E{Model-Based Routing}
    D2 --> F[Format Conversion] --> E
    D3 --> F
    E --> G1[OpenAI Service]
    E --> G2[Bedrock Service]
    G1 --> H1[OpenAI API]
    G2 --> H2[AWS Bedrock API]
    H1 --> I[Response Processing]
    H2 --> I
    I --> J{Target Format?}
    J --> K1[OpenAI Response]
    J --> K2[Bedrock Claude Response]
    J --> K3[Bedrock Titan Response]
    K1 --> L[Client Response]
    K2 --> L
    K3 --> L
```

Key components:

- Format Detection: Automatically identifies the input format
- Model-Based Routing: Routes to the appropriate provider based on model ID
- Service Layer: Abstracts provider-specific implementations
- Adapter Layer: Handles format conversions
- Strategy Pattern: Manages different model families within providers

Design principles:

- Single Responsibility: Each component has a clear, focused purpose
- Open/Closed: Easy to extend with new providers or formats
- Dependency Inversion: High-level modules don't depend on low-level details
- Interface Segregation: Clean, minimal interfaces between components

## Testing

The project includes several categories of tests:

- Unit Tests: Fast, isolated tests that don't make external API calls
- Integration Tests: Tests that make real API calls to external services
- External API Tests: Tests marked with `external_api` that are skipped in CI

```bash
# Run all tests (excluding external API tests)
uv run pytest

# Run all tests including external API tests (requires credentials)
uv run pytest -m "not external_api or external_api"

# Run only unit tests (no external API calls)
uv run pytest -m "not external_api"

# Run only OpenAI integration tests (requires OPENAI_API_KEY)
uv run pytest -m "openai_integration"

# Run only AWS/Bedrock integration tests (requires AWS credentials)
uv run pytest -m "aws_integration"

# Run all external API tests (requires all credentials)
uv run pytest -m "external_api"

# Run with coverage
uv run pytest --cov=src --cov-report=html

# Run specific test categories
uv run pytest tests/api/   # API tests
uv run pytest tests/cli/   # CLI tests
uv run pytest tests/core/  # Core functionality tests
```

External API tests require real credentials and are automatically skipped when:

- OpenAI tests: the `OPENAI_API_KEY` environment variable is not set
- AWS/Bedrock tests: AWS authentication is not configured (no `AWS_REGION`, `AWS_PROFILE`, or `AWS_ACCESS_KEY_ID`/`AWS_SECRET_ACCESS_KEY`)
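
This conditional skipping is the kind of thing typically wired up in a pytest `conftest.py`; the following is an illustrative sketch, not necessarily the project's exact implementation:

```python
# conftest.py (illustrative sketch)
import os

import pytest

def pytest_collection_modifyitems(config, items):
    have_openai = bool(os.environ.get("OPENAI_API_KEY"))
    have_aws = any(
        os.environ.get(var)
        for var in ("AWS_REGION", "AWS_PROFILE", "AWS_ACCESS_KEY_ID")
    )
    for item in items:
        if "openai_integration" in item.keywords and not have_openai:
            item.add_marker(pytest.mark.skip(reason="OPENAI_API_KEY not set"))
        if "aws_integration" in item.keywords and not have_aws:
            item.add_marker(pytest.mark.skip(reason="AWS credentials not configured"))
```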

Run the OpenAI integration tests:

```bash
export OPENAI_API_KEY="sk-your-openai-api-key"
export TEST_OPENAI_MODEL="gpt-4o"  # Optional, defaults to gpt-4o

uv run pytest -m "openai_integration"
```

Run the AWS/Bedrock integration tests:

```bash
# Option 1: Using an AWS profile
export AWS_PROFILE="your-aws-profile"
export AWS_REGION="us-east-1"

# Option 2: Using access keys
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_REGION="us-east-1"

# Option 3: Using role assumption
export AWS_PROFILE="your-base-profile"
export AWS_ROLE_ARN="arn:aws:iam::123456789012:role/YourRole"
export AWS_EXTERNAL_ID="your-external-id"  # Optional
export AWS_REGION="us-east-1"

# Run the AWS tests
uv run pytest -m "aws_integration"
```

- Regular CI: Runs unit tests only (`pytest -m "not external_api"`)
- Integration Tests: A manual workflow for running external API tests with credentials
  - Can be triggered manually from the GitHub Actions tab
  - Supports running OpenAI tests, AWS tests, or both
  - Uses repository secrets for API keys and credentials

- ✅ 113+ tests passing with comprehensive coverage
- ✅ Format detection and conversion
- ✅ Model-based routing logic
- ✅ Streaming functionality
- ✅ Error handling and edge cases
- ✅ Authentication and security
- ✅ CLI commands and configuration
- ✅ External API integration (when credentials are available)

## Deployment

```bash
# Build the image
docker build -t amazon-chat-completions-server .

# Run the container
docker run -p 8000:8000 \
  -e API_KEY=your-api-key \
  -e OPENAI_API_KEY=sk-your-key \
  amazon-chat-completions-server
```

Production considerations:

- Environment Variables: Use secure secret management
- Load Balancing: Run multiple server instances behind a load balancer
- Monitoring: Implement health checks and metrics collection
- Rate Limiting: Configure appropriate rate limits for your use case
- SSL/TLS: Use HTTPS in production environments
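
For the monitoring point above, the health endpoints from the API table can be polled directly; a minimal probe in Python:

```python
import httpx

# Poll both health endpoints exposed by the server.
for path in ("/health", "/v1/chat/completions/health"):
    response = httpx.get(f"http://localhost:8000{path}", timeout=5.0)
    print(path, response.status_code)
```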

## Contributing

We welcome contributions! Please see our Contributing Guide for details.

```bash
# Clone and set up
git clone https://github.com/teabranch/amazon-chat-completions-server.git
cd amazon-chat-completions-server

# Install development dependencies
uv pip install -e ".[dev]"

# Run tests
python -m pytest

# Run linting
ruff check src tests
mypy src
```

To add a new provider (a hypothetical sketch of step 1 follows this list):

1. Implement the `AbstractLLMService` interface
2. Add model ID patterns to `LLMServiceFactory`
3. Create format conversion adapters
4. Add comprehensive tests
5. Update documentation
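
A rough, hypothetical sketch of step 1 (the real `AbstractLLMService` interface in this repository may differ):

```python
from abc import ABC, abstractmethod

class AbstractLLMService(ABC):
    """Hypothetical provider interface; consult the source for the real one."""

    @abstractmethod
    async def chat_completion(self, request: dict) -> dict:
        """Handle a single (non-streaming) chat completion request."""

class MyProviderService(AbstractLLMService):
    async def chat_completion(self, request: dict) -> dict:
        # Call your provider's API here, then convert the raw response
        # into the unified format via your conversion adapters.
        raise NotImplementedError
```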

## License

This project is licensed under the MIT License; see the LICENSE file for details.

## Documentation & Support

- Documentation: docs/
- API Reference: docs/api-reference.md
- CLI Reference: docs/cli-reference.md
- Architecture Guide: docs/architecture.md
- Issues: GitHub Issues

Amazon Chat Completions Server - Unifying LLM providers through intelligent format detection and seamless conversion.