ADR-035: Container Deployment Strategy
Status: Accepted
Date: 2024-10-15
Deciders: Architecture Team, DevOps Team
Related: ADR-034 (Apps Folder Reorganization), ADR-009 (Azure Container Apps)
Context
The Loan Defenders system needed a production-ready containerization strategy that:
- Supports local development testing
- Enables Azure Container Apps deployment
- Maintains security best practices
- Optimizes for build speed and image size
- Provides proper health checking and monitoring
Initial challenges:
- Mixed development and production container needs
- Import path issues in containerized Python code
- Authentication complexity (local vs production)
- Service networking and discovery
- Health check configuration
Decision
Implement a comprehensive container strategy with:
1. Multi-stage Docker builds for all services
2. Security-first design with non-root users
3. Proper health checks for all containers
4. Service mesh networking via Docker Compose
5. Azure Container Apps deployment readiness
Container Architecture
5 Containerized Services:
┌─────────────────────────────────────────────────────────────┐
│ API Container (Python 3.11-slim)                            │
│   - FastAPI backend + Agent orchestration                   │
│   - Port 8000                                               │
│   - Health: /health endpoint                                │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ UI Container (Nginx Alpine)                                 │
│   - React SPA served by Nginx                               │
│   - Port 8080                                               │
│   - Health: wget spider check                               │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ MCP Server Containers (x3) (Python 3.11-slim)               │
│   - application_verification (Port 8010)                    │
│   - document_processing (Port 8011)                         │
│   - financial_calculations (Port 8012)                      │
│   - Health: /health endpoint                                │
└─────────────────────────────────────────────────────────────┘
Implementation Details
1. Multi-Stage Builds
Python Services (API + MCP Servers):
FROM python:3.11-slim AS builder
# Install dependencies with uv (10-100x faster than pip)
RUN pip install uv
COPY pyproject.toml uv.lock ./
RUN uv sync --frozen --no-dev
FROM python:3.11-slim
# Copy only runtime dependencies
COPY --from=builder /.venv /.venv
# Application code
COPY app/ /app/
Benefits:
- Smaller final image (no build tools)
- Faster builds with layer caching
- Reproducible with locked dependencies
UI Service:
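FROM node:20-alpine AS builder
WORKDIR /app
# Install dependencies first so this layer stays cached until the lockfile changes
# (paths assume the build context is the UI project root)
COPY package.json package-lock.json ./
RUN npm ci
# Build React app
COPY . .
RUN npm run build
FROM nginx:alpine
# Copy only built static files
COPY --from=builder /app/dist /usr/share/nginx/html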
Benefits:
- 82MB final image (vs 1GB+ with Node)
- Fast serving with Nginx
- Minimal attack surface
2. Security Hardening
Non-Root Users:
Why UID 1000?
- Matches common developer machine UIDs
- Prevents root access in container
- Required for security compliance
Applied to all containers:
- API: apiuser:apiuser
- UI: uiuser:uiuser
- MCP Servers: mcpuser:mcpuser
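As a complement to the image-level users listed above, a Python service could also refuse to start when it finds itself running as root. The following is a minimal sketch, not part of the documented implementation; the function name `require_non_root` is hypothetical.

```python
import os
from typing import Optional


def require_non_root(uid: Optional[int] = None) -> int:
    """Return the effective UID, raising if it is root (UID 0).

    Pass `uid` explicitly for testing; by default the process UID is used.
    """
    effective_uid = os.geteuid() if uid is None else uid
    if effective_uid == 0:
        raise PermissionError(
            "refusing to start as root; run the container as a non-root user"
        )
    return effective_uid
```

Calling `require_non_root()` at service startup would turn a misconfigured container user into an immediate, visible failure rather than a silent security regression.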
3. Health Checks
Configuration:
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1
Parameters:
- start-period: 40s - Allow Python import and server startup
- interval: 30s - Check every 30 seconds
- timeout: 10s - Reasonable for HTTP check
- retries: 3 - Grace period for transient issues
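Why 40s start period?
- Python imports take ~10-15s
- Azure AI client initialization ~10-15s
- Agent framework loading ~10-15s
- Buffer for container orchestrator
These estimates sum to roughly 30-45s. Failed checks during the start period do not count toward the retry limit, so 40s covers a typical start and the 3 retries absorb slower ones.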
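Slim base images typically do not ship curl, so a health probe can also be written with the Python standard library that is already in the image. The sketch below is one possible approach, not the project's actual health check; `is_healthy` is a hypothetical name, and the default timeout simply mirrors the 10s value above.

```python
import http.client
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 10.0) -> bool:
    """Return True when the endpoint answers with a 2xx status in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, non-2xx status, or a malformed URL
        # all count as "not healthy".
        return False
```

A health check script could call `is_healthy("http://localhost:8000/health")` and exit with status 0 or 1 accordingly.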
4. Service Networking
Docker Compose:
services:
  api:
    environment:
      MCP_APPLICATION_VERIFICATION_URL: http://mcp-application-verification:8010/mcp
      MCP_DOCUMENT_PROCESSING_URL: http://mcp-document-processing:8011/mcp
      MCP_FINANCIAL_CALCULATIONS_URL: http://mcp-financial-calculations:8012/mcp
Key decisions:
- DNS-based service discovery (by service name)
- Internal network (no external exposure for MCP servers)
- /mcp endpoint path required by agent framework
Azure Container Apps mapping:
- Same DNS pattern, with .internal. inserted after the app name
- Internal ingress for MCP servers
- External ingress only for UI
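Because discovery is driven by environment variables, the same application code can run under Docker Compose and Azure Container Apps with only the URLs changing. The sketch below shows one way the API might read them; `resolve_mcp_urls` and `DEFAULT_MCP_URLS` are hypothetical names, and falling back to the Compose values is an assumption made for illustration.

```python
import os
from typing import Mapping, Optional

# Defaults mirror the Docker Compose service names and ports from this ADR.
DEFAULT_MCP_URLS = {
    "MCP_APPLICATION_VERIFICATION_URL": "http://mcp-application-verification:8010/mcp",
    "MCP_DOCUMENT_PROCESSING_URL": "http://mcp-document-processing:8011/mcp",
    "MCP_FINANCIAL_CALCULATIONS_URL": "http://mcp-financial-calculations:8012/mcp",
}


def resolve_mcp_urls(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the MCP endpoint URLs, preferring environment overrides."""
    source = os.environ if env is None else env
    urls = {}
    for name, default in DEFAULT_MCP_URLS.items():
        url = source.get(name, default).rstrip("/")
        # The ADR states that the agent framework requires the /mcp path.
        if not url.endswith("/mcp"):
            raise ValueError(f"{name} must end with /mcp, got {url!r}")
        urls[name] = url
    return urls
```

Validating the `/mcp` path at startup surfaces a misconfigured URL immediately instead of at the first agent call.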
5. Authentication Strategy
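Local Development (az login):
- az login credentials are not available inside containers
- Solution: Service Principal with credentials in .env
Docker Testing:
- Service Principal credentials are injected from .env as AZURE_TENANT_ID, AZURE_CLIENT_ID, and AZURE_CLIENT_SECRET
Azure Production:
- Managed Identity (no credentials needed)
- Container Apps system-assigned identity
- RBAC: "Cognitive Services OpenAI User"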
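The per-environment split can be made explicit in code. The following sketch uses only the standard library and decides which mode applies from the environment variables named in the Compose configuration; `select_auth_mode` and the returned labels are hypothetical, and a real service would hand the result to its Azure credential setup.

```python
import os
from typing import Mapping, Optional

SERVICE_PRINCIPAL_VARS = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")


def select_auth_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick 'service_principal' when all three variables are set,
    'managed_identity' when none are, and fail on a partial set."""
    source = os.environ if env is None else env
    present = [name for name in SERVICE_PRINCIPAL_VARS if source.get(name)]
    if len(present) == len(SERVICE_PRINCIPAL_VARS):
        return "service_principal"
    if present:
        missing = sorted(set(SERVICE_PRINCIPAL_VARS) - set(present))
        raise ValueError(f"incomplete Service Principal config, missing: {missing}")
    return "managed_identity"
```

Rejecting a partial set of variables avoids silently falling back to Managed Identity when a local .env file is only half filled in.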
Build Optimization
Layer Caching Strategy
Order matters:
1. COPY pyproject.toml uv.lock # Changes rarely
2. RUN uv sync # Cached if deps unchanged
3. COPY apps/shared # Changes occasionally
4. COPY apps/api # Changes frequently
5. RUN final setup # Reruns whenever an earlier layer changes
Image Sizes
| Container | Strategy | Size |
|---|---|---|
| API | Multi-stage + slim base | 1.29GB |
| UI | Multi-stage + nginx:alpine | 82MB |
| MCP Servers | Multi-stage + slim base | 338MB each |
Total: ~2.4GB for all 5 containers
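As a sanity check, the table rows can be summed directly. This sketch assumes decimal units (1GB = 1000MB) and counts the MCP server image three times, once per service.

```python
# Sizes in MB, taken from the table above (1.29GB written as 1290MB).
IMAGE_SIZES_MB = {
    "api": 1290,
    "ui": 82,
    "mcp-application-verification": 338,
    "mcp-document-processing": 338,
    "mcp-financial-calculations": 338,
}

total_mb = sum(IMAGE_SIZES_MB.values())
total_gb = total_mb / 1000  # assumption: decimal gigabytes

print(f"{len(IMAGE_SIZES_MB)} images, {total_mb} MB total ({total_gb:.2f} GB)")
```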
Build Performance
First build: 5-10 minutes (downloading dependencies)
Cached build: 1-2 minutes (only changed layers)
No-cache build: 5-10 minutes (forced rebuild)
Docker Compose Configuration
version: '3.8'
services:
  mcp-application-verification:
    build:
      context: .
      dockerfile: apps/mcp_servers/application_verification/Dockerfile
    ports:
      - "8010:8010"
    environment:
      - AZURE_TENANT_ID
      - AZURE_CLIENT_ID
      - AZURE_CLIENT_SECRET
    healthcheck:
      test: ["CMD", "/healthcheck.sh"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
Benefits:
- Single command startup: docker-compose up
- Environment variable injection from .env
- Service orchestration and dependencies
- Easy local integration testing
Consequences
Positive
✅ Production-ready: Containers mirror Azure deployment
✅ Security-first: Non-root users, minimal images
✅ Fast builds: Multi-stage builds with layer caching
✅ Health monitoring: Comprehensive health checks
✅ Local testing: Full stack in Docker Compose
✅ Azure-ready: Direct mapping to Container Apps
Negative
⚠️ Build complexity: Multi-stage builds more complex
⚠️ Auth complexity: Different auth per environment
⚠️ Image size: Could be optimized further (Alpine Python)
⚠️ Startup time: 40s health check start period
Mitigations
- Comprehensive documentation for build process
- Clear authentication guides per environment
- Automated build testing in CI/CD
- Health check tuning based on actual metrics
Testing Strategy
Local Docker Testing
# Build all images
docker-compose build
# Start all services
docker-compose up -d
# Check health
docker-compose ps
# View logs
docker-compose logs -f api
# Test application
curl http://localhost:8080
CI/CD Testing
GitHub Actions:
1. Build all Docker images
2. Run docker-compose up
3. Wait for health checks
4. Run integration tests
5. Push images to registry (if tests pass)
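The "wait for health checks" step is essentially a bounded polling loop. Below is a minimal sketch of that loop, independent of any particular CI system; `wait_for_health` is a hypothetical helper, and the probe is passed in as a callable so the logic can be exercised without running containers.

```python
import time
from typing import Callable


def wait_for_health(
    probe: Callable[[], bool],
    attempts: int = 10,
    delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` up to `attempts` times, pausing `delay` seconds between
    tries. Return True on the first success, False if every attempt fails."""
    for attempt in range(1, attempts + 1):
        if probe():
            return True
        if attempt < attempts:
            sleep(delay)  # no pause after the final attempt
    return False
```

In a pipeline, the probe could be any function that checks a service's /health endpoint, and a False result would fail the job before integration tests run.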
Azure Container Apps Deployment
Mapping:
Docker Compose          →  Azure Container Apps
────────────────────────────────────────────────
service: api            →  Container App: api
  port: 8000               - Internal ingress
service: ui             →  Container App: ui
  port: 8080               - External ingress (HTTPS)
service: mcp-*          →  Container Apps: mcp-*
  ports: 8010-8012         - Internal ingress only
Key differences:
- Azure handles TLS/SSL (not in containers)
- Managed Identity replaces Service Principal
- Internal DNS: app-name.internal.{env}.azurecontainerapps.io
- Auto-scaling based on HTTP requests
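The internal DNS pattern above can be expressed as a small helper for generating hostnames per service. This is a sketch only: `internal_fqdn` is a hypothetical name, and `example-env.eastus` is a placeholder for the real environment-specific part of the domain.

```python
def internal_fqdn(app_name: str, environment_domain: str) -> str:
    """Build app-name.internal.{env}.azurecontainerapps.io from its parts."""
    if not app_name or not environment_domain:
        raise ValueError("app_name and environment_domain are required")
    return f"{app_name}.internal.{environment_domain}.azurecontainerapps.io"


# Placeholder environment domain; substitute the value for your environment.
ENV_DOMAIN = "example-env.eastus"
MCP_APPS = (
    "mcp-application-verification",
    "mcp-document-processing",
    "mcp-financial-calculations",
)
mcp_hosts = {app: internal_fqdn(app, ENV_DOMAIN) for app in MCP_APPS}
```

The resulting hostnames would replace the Compose service names in the MCP URL environment variables when deploying to Container Apps.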
References
- ADR-034: Apps Folder Reorganization
- ADR-009: Azure Container Apps Deployment
- ADR-008: Azure Authentication for Local Docker Testing
- Docker Development Guide
Status
Implemented: October 2024
Current State: All 5 services containerized and deployed
Docker Images: Built and tested
Azure Deployment: Validated in dev environment
Future Enhancements
- Image optimization: Explore Alpine Python for smaller images
- Build caching: Implement remote build cache for CI/CD
- Health check tuning: Reduce start period based on metrics
- Multi-architecture: Support ARM64 for M1/M2 Macs