ADR-035: Container Deployment Strategy

Status: Accepted
Date: 2024-10-15
Deciders: Architecture Team, DevOps Team
Related: ADR-034 (Apps Folder Reorganization), ADR-009 (Azure Container Apps)

Context

The Loan Defenders system needed a production-ready containerization strategy that:

  • Supports local development testing
  • Enables Azure Container Apps deployment
  • Maintains security best practices
  • Optimizes for build speed and image size
  • Provides proper health checking and monitoring

Initial challenges:

  • Mixed development and production container needs
  • Import path issues in containerized Python code
  • Authentication complexity (local vs production)
  • Service networking and discovery
  • Health check configuration

Decision

Implement a comprehensive container strategy with:

  1. Multi-stage Docker builds for all services
  2. Security-first design with non-root users
  3. Proper health checks for all containers
  4. Service networking via Docker Compose (DNS-based service discovery)
  5. Azure Container Apps deployment readiness

Container Architecture

5 Containerized Services:
┌─────────────────────────────────────────────────────────────┐
│ API Container (Python 3.11-slim)                            │
│ - FastAPI backend + Agent orchestration                     │
│ - Port 8000                                                  │
│ - Health: /health endpoint                                   │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│ UI Container (Nginx Alpine)                                 │
│ - React SPA served by Nginx                                 │
│ - Port 8080                                                  │
│ - Health: wget spider check                                  │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│ MCP Server Containers (x3) (Python 3.11-slim)              │
│ - application_verification (Port 8010)                      │
│ - document_processing (Port 8011)                           │
│ - financial_calculations (Port 8012)                        │
│ - Health: /health endpoint                                   │
└─────────────────────────────────────────────────────────────┘

Implementation Details

1. Multi-Stage Builds

Python Services (API + MCP Servers):

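FROM python:3.11-slim AS builder
WORKDIR /app
# Install dependencies with uv (10-100x faster than pip)
RUN pip install --no-cache-dir uv
COPY pyproject.toml uv.lock ./
# --frozen installs exactly what uv.lock pins; --no-dev skips dev dependencies
RUN uv sync --frozen --no-dev

FROM python:3.11-slim
WORKDIR /app
# Copy only runtime dependencies (same path as in the builder stage)
COPY --from=builder /app/.venv /app/.venv
# Put the virtual environment first on PATH
ENV PATH="/app/.venv/bin:$PATH"
# Application code
COPY app/ /app/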

Benefits:

  • Smaller final image (no build tools)
  • Faster builds with layer caching
  • Reproducible with locked dependencies

UI Service:

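FROM node:20-alpine AS builder
WORKDIR /app
# Install dependencies first so this layer stays cached until the lockfile changes
COPY package.json package-lock.json ./
RUN npm ci
# Build React app (assumes the build context is the UI project directory)
COPY . .
RUN npm run build

FROM nginx:alpine
# Copy only built static files
COPY --from=builder /app/dist /usr/share/nginx/html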

Benefits:

  • 82MB final image (vs 1GB+ with Node)
  • Fast serving with Nginx
  • Minimal attack surface

2. Security Hardening

Non-Root Users:

# Create dedicated user (UID 1000)
RUN useradd -m -u 1000 apiuser
USER apiuser

Why UID 1000?

  • Matches common developer machine UIDs
  • Prevents root access in container
  • Required for security compliance

Applied to all containers:

  • API: apiuser:apiuser
  • UI: uiuser:uiuser
  • MCP Servers: mcpuser:mcpuser
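A runtime guard can complement the USER directive by failing fast if a container is ever started as root (for example with an overridden user at run time). The following is a minimal sketch rather than part of the documented design; the function name assert_not_root is hypothetical.

```python
import os
from typing import Optional


def assert_not_root(uid: Optional[int] = None) -> int:
    """Raise if the process runs as root (UID 0); otherwise return the UID.

    `uid` can be passed explicitly for testing. By default the effective
    UID of the current process is used (POSIX only).
    """
    effective = os.geteuid() if uid is None else uid
    if effective == 0:
        raise PermissionError(
            "refusing to run as root; expected a non-root user such as UID 1000"
        )
    return effective
```

Calling assert_not_root() once at service startup would be enough; the check costs nothing at request time.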

3. Health Checks

Configuration:

# Requires curl in the image; python:3.11-slim does not include it by default
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
  CMD curl -f http://localhost:8000/health || exit 1
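Where adding curl to the image is undesirable, the same probe can be run with the Python interpreter that is already present. This is a sketch using only the standard library; the names is_healthy and main, and the idea of shipping it as a script, are assumptions and not taken from the project.

```python
import http.client
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 5.0) -> bool:
    """Return True if `url` answers with a 2xx status within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, HTTP 4xx/5xx, or a malformed URL.
        return False


def main(argv) -> int:
    """Return a process exit code: 0 when healthy, 1 otherwise."""
    # The default mirrors the API container's endpoint from this ADR.
    target = argv[1] if len(argv) > 1 else "http://localhost:8000/health"
    return 0 if is_healthy(target) else 1
```

Saved as a hypothetical healthcheck.py that ends with raise SystemExit(main(sys.argv)), this could replace the curl command in the HEALTHCHECK instruction, since Docker only inspects the exit code.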

Parameters:

  • start-period: 40s - Allow Python import and server startup
  • interval: 30s - Check every 30 seconds
  • timeout: 10s - Reasonable for HTTP check
  • retries: 3 - Grace period for transient issues

Why 40s start period?

  • Python imports take ~10-15s
  • Azure AI client initialization ~10-15s
  • Agent framework loading ~10-15s
  • Buffer for container orchestrator
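The arithmetic behind these numbers can be checked with a short sketch. The phase names are labels chosen here for illustration. The detection bound assumes every probe runs to its full timeout and that each probe starts one interval after the previous one completes.

```python
# Estimated startup phases in seconds (low, high), taken from the list above.
STARTUP_PHASES = {
    "python_imports": (10, 15),
    "azure_ai_client_init": (10, 15),
    "agent_framework_loading": (10, 15),
}


def startup_range(phases):
    """Sum the low and high estimates across all startup phases."""
    low = sum(lo for lo, _ in phases.values())
    high = sum(hi for _, hi in phases.values())
    return low, high


def worst_case_detection(interval, timeout, retries):
    """Upper bound (seconds) from a failure to the container being marked unhealthy."""
    return retries * (interval + timeout)


print(startup_range(STARTUP_PHASES))    # (30, 45)
print(worst_case_detection(30, 10, 3))  # 120
```

The estimated startup range of 30-45s brackets the 40s start period. A start at the slow end still has the three retries available before the container is marked unhealthy, because failures during the start period do not count toward the retry limit.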

4. Service Networking

Docker Compose:

services:
  api:
    environment:
      MCP_APPLICATION_VERIFICATION_URL: http://mcp-application-verification:8010/mcp
      MCP_DOCUMENT_PROCESSING_URL: http://mcp-document-processing:8011/mcp
      MCP_FINANCIAL_CALCULATIONS_URL: http://mcp-financial-calculations:8012/mcp

Key decisions:

  • DNS-based service discovery (by service name)
  • Internal network (no external exposure for MCP servers)
  • /mcp endpoint path required by agent framework

Azure Container Apps mapping:

  • Same DNS pattern, with .internal. in the hostname
  • Internal ingress for MCP servers
  • External ingress only for UI
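On the consuming side, the API can read these URLs from the environment and validate them at startup, so that the same code runs under Docker Compose and Azure Container Apps. The sketch below assumes the environment variable names shown in the Compose snippet; the function name load_mcp_urls and the validation rules are illustrative only.

```python
import os
from urllib.parse import urlparse

# Defaults mirror the Docker Compose service names and ports in this ADR.
DEFAULT_MCP_URLS = {
    "MCP_APPLICATION_VERIFICATION_URL": "http://mcp-application-verification:8010/mcp",
    "MCP_DOCUMENT_PROCESSING_URL": "http://mcp-document-processing:8011/mcp",
    "MCP_FINANCIAL_CALCULATIONS_URL": "http://mcp-financial-calculations:8012/mcp",
}


def load_mcp_urls(env=None):
    """Return the MCP server URLs, preferring environment overrides over defaults."""
    env = os.environ if env is None else env
    urls = {}
    for name, default in DEFAULT_MCP_URLS.items():
        url = env.get(name, default)
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.hostname:
            raise ValueError(f"{name} is not a valid HTTP URL: {url!r}")
        # The agent framework expects the /mcp endpoint path.
        if not parsed.path.rstrip("/").endswith("/mcp"):
            raise ValueError(f"{name} must end with the /mcp path: {url!r}")
        urls[name] = url
    return urls
```

Failing at startup on a malformed URL surfaces configuration mistakes in the container logs instead of at the first agent call.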

5. Authentication Strategy

Local Development (az login):

  • az login credentials are not available inside containers
  • Solution: Service Principal with credentials in .env

Docker Testing:

AZURE_TENANT_ID=xxx
AZURE_CLIENT_ID=xxx
AZURE_CLIENT_SECRET=xxx

Azure Production:

  • Managed Identity (no credentials needed)
  • Container Apps system-assigned identity
  • RBAC: "Cognitive Services OpenAI User"
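The per-environment choice can be made explicit in code, so that a half-configured .env fails loudly. This is a sketch under the assumption that the three Service Principal variables are the only switch; the function name select_auth_mode and the returned labels are hypothetical. The azure-identity package's DefaultAzureCredential applies a comparable ordering (environment variables before managed identity), though this ADR does not state whether the project uses it.

```python
import os

SERVICE_PRINCIPAL_VARS = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")


def select_auth_mode(env=None) -> str:
    """Return "service_principal" or "managed_identity" based on the environment."""
    env = os.environ if env is None else env
    present = [name for name in SERVICE_PRINCIPAL_VARS if env.get(name)]
    if len(present) == len(SERVICE_PRINCIPAL_VARS):
        return "service_principal"
    if present:
        # The ADR uses a system-assigned identity, so a partial set of
        # variables is treated as a misconfiguration, not a valid mode.
        missing = sorted(set(SERVICE_PRINCIPAL_VARS) - set(present))
        raise RuntimeError(
            "incomplete Service Principal configuration, missing: " + ", ".join(missing)
        )
    return "managed_identity"
```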

Build Optimization

Layer Caching Strategy

Order matters:

1. COPY pyproject.toml uv.lock     # Changes rarely
2. RUN uv sync                      # Cached if deps unchanged
3. COPY apps/shared                 # Changes occasionally  
4. COPY apps/api                    # Changes frequently
5. RUN final setup                  # Reruns whenever an earlier layer changes
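Why the order matters can be shown with a small model of the build cache: once one layer's inputs change, that layer and every layer after it must be rebuilt. The sketch below is a simplification (real cache keys also depend on the instruction text and file checksums), and the function name layers_to_rebuild is hypothetical.

```python
# Each layer lists the build inputs it depends on, in Dockerfile order.
LAYERS = [
    ("COPY pyproject.toml uv.lock", {"pyproject.toml", "uv.lock"}),
    ("RUN uv sync", set()),
    ("COPY apps/shared", {"apps/shared"}),
    ("COPY apps/api", {"apps/api"}),
    ("RUN final setup", set()),
]


def layers_to_rebuild(layers, changed):
    """Return the instructions that rerun: the first invalidated layer and all after it."""
    for index, (_, inputs) in enumerate(layers):
        if inputs & changed:
            return [instruction for instruction, _ in layers[index:]]
    return []  # Nothing changed, so every layer is served from cache.


print(layers_to_rebuild(LAYERS, {"apps/api"}))
# ['COPY apps/api', 'RUN final setup']
```

A change to apps/api leaves the dependency installation cached, while a change to uv.lock invalidates all five layers. This is why the rarely changing files are copied first.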

Image Sizes

Container     Strategy                        Size
API           Multi-stage + slim base         1.29GB
UI            Multi-stage + nginx:alpine      82MB
MCP Servers   Multi-stage + slim base         338MB each

Total: ~2.4GB for all 5 containers (1.29GB + 82MB + 3 × 338MB)
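Summing the per-image sizes from the table gives the combined footprint. The sketch assumes 1GB = 1000MB and counts the MCP server image three times, once per server.

```python
# Per-image sizes in MB, from the table above (1.29GB taken as 1290MB).
IMAGE_SIZES_MB = {"api": 1290, "ui": 82, "mcp_server": 338}
MCP_SERVER_COUNT = 3

total_mb = (
    IMAGE_SIZES_MB["api"]
    + IMAGE_SIZES_MB["ui"]
    + MCP_SERVER_COUNT * IMAGE_SIZES_MB["mcp_server"]
)
print(f"{total_mb} MB = {total_mb / 1000:.1f} GB")  # 2386 MB = 2.4 GB
```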

Build Performance

First build: 5-10 minutes (downloading dependencies)
Cached build: 1-2 minutes (only changed layers)
No-cache build: 5-10 minutes (forced rebuild)

Docker Compose Configuration

version: '3.8'
services:
  mcp-application-verification:
    build:
      context: .
      dockerfile: apps/mcp_servers/application_verification/Dockerfile
    ports:
      - "8010:8010"
    environment:
      - AZURE_TENANT_ID
      - AZURE_CLIENT_ID
      - AZURE_CLIENT_SECRET
    healthcheck:
      test: ["CMD", "/healthcheck.sh"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

Benefits:

  • Single command startup: docker-compose up
  • Environment variable injection from .env
  • Service orchestration and dependencies
  • Easy local integration testing

Consequences

Positive

Production-ready: Containers mirror Azure deployment
Security-first: Non-root users, minimal images
Fast builds: Multi-stage builds with layer caching
Health monitoring: Comprehensive health checks
Local testing: Full stack in Docker Compose
Azure-ready: Direct mapping to Container Apps

Negative

⚠️ Build complexity: Multi-stage builds are more complex than single-stage builds
⚠️ Auth complexity: Different authentication mechanism per environment
⚠️ Image size: Could be optimized further (Alpine Python)
⚠️ Startup time: 40s health check start period

Mitigations

  • Comprehensive documentation for build process
  • Clear authentication guides per environment
  • Automated build testing in CI/CD
  • Health check tuning based on actual metrics

Testing Strategy

Local Docker Testing

# Build all images
docker-compose build

# Start all services
docker-compose up -d

# Check health
docker-compose ps

# View logs
docker-compose logs -f api

# Test application
curl http://localhost:8080

CI/CD Testing

GitHub Actions:

  1. Build all Docker images
  2. Run docker-compose up
  3. Wait for health checks
  4. Run integration tests
  5. Push images to registry (if tests pass)
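The "wait for health checks" step can be sketched as a polling loop with a deadline. The function name wait_until_healthy and its parameters are hypothetical; the probe is injected so that the loop can wrap whatever check the pipeline uses, such as an HTTP GET against each service's /health endpoint.

```python
import time
from typing import Callable, Iterable, List


def wait_until_healthy(
    services: Iterable[str],
    probe: Callable[[str], bool],
    timeout: float = 120.0,
    poll_interval: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll until every service passes `probe` or `timeout` seconds elapse.

    Returns the services that are still unhealthy (empty list on success).
    """
    pending = list(services)
    deadline = clock() + timeout
    while pending:
        # Keep only the services whose probe still fails.
        pending = [name for name in pending if not probe(name)]
        if not pending or clock() >= deadline:
            break
        sleep(poll_interval)
    return pending
```

A CI step would fail the job when the returned list is non-empty, and could print those service names to show which container logs to inspect.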

Azure Container Apps Deployment

Mapping:

Docker Compose          → Azure Container Apps
────────────────────────────────────────────────
service: api            → Container App: api
  port: 8000              - Internal ingress

service: ui             → Container App: ui
  port: 8080              - External ingress (HTTPS)

service: mcp-*          → Container Apps: mcp-*
  ports: 8010-8012        - Internal ingress only

Key differences:

  • Azure handles TLS/SSL (not in containers)
  • Managed Identity replaces Service Principal
  • Internal DNS: app-name.internal.{env}.azurecontainerapps.io
  • Auto-scaling based on HTTP requests
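The mapping and the internal DNS pattern can be expressed as a small lookup. This is a sketch that follows the pattern exactly as written in this ADR. The INGRESS table restates the mapping above, the function name internal_hostname is hypothetical, and the {env} value must come from the actual Container Apps environment.

```python
# Ingress type per Container App, restating the mapping in this ADR.
INGRESS = {
    "api": "internal",
    "ui": "external",
    "mcp-application-verification": "internal",
    "mcp-document-processing": "internal",
    "mcp-financial-calculations": "internal",
}


def internal_hostname(app_name: str, env: str) -> str:
    """Build app-name.internal.{env}.azurecontainerapps.io for an internal app."""
    if INGRESS.get(app_name) != "internal":
        raise ValueError(f"{app_name!r} is not an internal-ingress app")
    return f"{app_name}.internal.{env}.azurecontainerapps.io"


# "example-env" is a placeholder, not a real environment identifier.
print(internal_hostname("mcp-document-processing", "example-env"))
# mcp-document-processing.internal.example-env.azurecontainerapps.io
```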

Status

Implemented: October 2024
Current State: All 5 services containerized and deployed
Docker Images: Built and tested
Azure Deployment: Validated in dev environment

Future Enhancements

  • Image optimization: Explore Alpine Python for smaller images
  • Build caching: Implement remote build cache for CI/CD
  • Health check tuning: Reduce start period based on metrics
  • Multi-architecture: Support ARM64 for M1/M2 Macs