Azure Deployment Architecture

⚠️ HISTORICAL REFERENCE: This document describes the Container Apps architecture before migration to ACI.

Current implementation (as of 2025-10-24):
- Compute: Azure Container Instances (ACI) - see ADR-049
- Layer names: Foundation, Substrate, AI Models, Apps - see ADR-052
- Dev access: Azure Bastion + Jump Box VM - see ADR-050
- Secrets: Key Vault for deployment outputs - see ADR-048
- Identity: User-assigned Managed Identity - see ADR-047

For current deployment, see:
- Getting Started: Azure Deployment
- 4-Layer Deployment Cake (visual guide with current architecture)

Complete technical architecture for Loan Defenders Azure deployment (Historical - Container Apps era)


Table of Contents

  1. Overview
  2. System Architecture
  3. High-Level Architecture
  4. Component Breakdown
  5. 4-Layer Deployment Architecture
  6. Deployment Layers Overview
  7. Layer 1: Foundation Infrastructure
  8. Layer 2: Container Platform
  9. Layer 3: Application Services
  10. Layer 4: AI Models
  11. Dual Deployment Paths
  12. Deployment Path Architecture
  13. Path 1: Direct Azure Deployment
  14. Path 2: GitHub Actions CI/CD
  15. Network Architecture
  16. VNet Design
  17. CIDR Allocation Strategy
  18. Network Security Groups (NSGs)
  19. VPN Gateway (Dev Environment)
  20. Security Architecture
  21. Zero Trust Security Model
  22. Security Features
  23. Container Apps Architecture
  24. Auto-Scaling Configuration
  25. Container App Configuration
  26. AI Services
  27. AI Foundry Integration
  28. Model Deployment
  29. Monitoring & Observability
  30. Monitoring Architecture
  31. Key Metrics Tracked
  32. Deployment Process Flow
  33. Cost Analysis
  34. Infrastructure as Code (IaC)
  35. Environments
  36. Best Practices
  37. Troubleshooting
  38. Related Documentation

Overview

Loan Defenders deploys to Azure using a 4-layer deployment architecture with production-grade security, auto-scaling, and monitoring. This document provides comprehensive technical details of the deployment architecture.

Related Documents:
- Getting Started: Azure Deployment - Choose your deployment path
- Direct Azure Deployment - Deploy using scripts and Bicep
- GitHub CI/CD Deployment - Deploy using GitHub Actions

Architecture Decisions:
- ADR-041: 4-Layer Deployment Architecture
- ADR-042: Dual Deployment Paths
- ADR-009: Container Apps Deployment
- ADR-039: MCP Servers Deployment


System Architecture

High-Level Architecture

graph TB
    subgraph "Internet"
        Users[End Users]
    end

    subgraph "Azure Cloud - VNet (10.0.0.0/16)"
        subgraph "Public Subnet (10.0.1.0/24)"
            UI[ACI Container Group<br/>UI, API, 3 MCP Servers]
        end

        subgraph "MCP Subnet (10.0.3.0/24)"
            MCP1[MCP: Application Verification<br/>Port 8010]
            MCP2[MCP: Document Processing<br/>Port 8011]
            MCP3[MCP: Financial Calculations<br/>Port 8012]
        end

        subgraph "AI Subnet (10.0.4.0/24)"
            AI[Azure AI Services<br/>Private Endpoint Only]
        end

        subgraph "Infrastructure Subnet (10.0.0.0/24)"
            ACR[Azure Container Registry<br/>Private]
            MI[Managed Identities]
            KV[Key Vault]
        end

        subgraph "Monitoring"
            AppInsights[Application Insights]
            LogAnalytics[Log Analytics]
        end
    end

    Users -->|HTTPS| UI
    UI -->|Internal HTTPS| API
    API -->|Internal HTTPS| MCP1
    API -->|Internal HTTPS| MCP2
    API -->|Internal HTTPS| MCP3
    API -->|Private Link| AI
    MCP1 -->|Private Link| AI
    MCP2 -->|Private Link| AI
    MCP3 -->|Private Link| AI

    UI -.->|Telemetry| AppInsights
    API -.->|Telemetry| AppInsights
    MCP1 -.->|Telemetry| AppInsights
    MCP2 -.->|Telemetry| AppInsights
    MCP3 -.->|Telemetry| AppInsights
    AppInsights -.->|Logs| LogAnalytics

    ACR -->|Pull Images| UI
    ACR -->|Pull Images| API
    ACR -->|Pull Images| MCP1
    ACR -->|Pull Images| MCP2
    ACR -->|Pull Images| MCP3

    MI -->|Identity| UI
    MI -->|Identity| API
    MI -->|Identity| MCP1
    MI -->|Identity| MCP2
    MI -->|Identity| MCP3

    style UI fill:#4CAF50
    style API fill:#2196F3
    style MCP1 fill:#FF9800
    style MCP2 fill:#FF9800
    style MCP3 fill:#FF9800
    style AI fill:#9C27B0

Component Breakdown

| Component | Type | Access | Purpose |
|---|---|---|---|
| ACI Container Group | Azure Container Instances | Public (UI on port 80) | Single container group with UI, API, 3 MCP servers |
| UI Container | React Frontend | Public (HTTPS) | User interface for loan applications |
| API Container | FastAPI Backend | Internal (localhost) | Business logic and orchestration |
| MCP: Application Verification | FastAPI | Internal Only | Identity, employment, credit verification |
| MCP: Document Processing | FastAPI | Internal Only | OCR, classification, data extraction |
| MCP: Financial Calculations | FastAPI | Internal Only | DTI, affordability, risk scoring |
| Azure AI Services | Managed Service | Private Endpoint | GPT-4.1/GPT-5 or newer models |
| Azure Container Registry | Container Registry | Private | Container image storage |
| Application Insights | Monitoring | Azure Backbone | Telemetry and diagnostics |
| Log Analytics | Logging | Azure Backbone | Centralized log storage |

4-Layer Deployment Architecture

The deployment is structured in 4 independent layers for modularity, speed, and maintainability.

Deployment Layers Overview

graph TB
    subgraph "Foundation Layer"
        L1[Networking + Security + Bastion + Key Vault<br/>Frequency: Rarely<br/>Time: 10-15 min]
    end

    subgraph "Substrate Layer"
        L2[Container Registry + AI Foundry Hub & Project<br/>Frequency: Occasionally<br/>Time: 5-7 min]
    end

    subgraph "AI Models Layer"
        L4[AI Model Deployments<br/>Frequency: As Needed<br/>Time: 3-5 min]
    end

    subgraph "Apps Layer"
        L3["ACI Container Group (UI, API, 3 MCP servers)<br/>Frequency: Very Often<br/>Time: 1-2 min"]
    end

    L1 --> L2
    L2 --> L4
    L2 --> L3
    L4 --> L3

    style L1 fill:#E3F2FD
    style L2 fill:#F3E5F5
    style L3 fill:#E8F5E9
    style L4 fill:#FFF3E0

Foundation Layer

Resources Deployed:
- Virtual Network (VNet) with subnets
- Network Security Groups (NSGs)
- Azure Bastion for browser-based VNet access (replaces VPN Gateway)
- Windows Server 2022 Jump Box VM
- User-Assigned Managed Identity (shared by all containers)
- Key Vault (for deployment outputs storage)
- Application Insights & Log Analytics

Deployment Frequency: Rarely (infrastructure changes/upgrades)

Deployment Time: 10-15 minutes

Bicep Modules:
- infrastructure/bicep/modules/networking.bicep
- infrastructure/bicep/modules/security.bicep
- infrastructure/bicep/modules/bastion-vm.bicep

Scripts:
- Direct: infrastructure/scripts/deploy-foundation.sh
- GitHub: .github/workflows/deploy-foundation.yml

Key Changes (ADR-050, ADR-052):
- Azure Bastion replaces VPN Gateway (~$150/month savings)
- AI Foundry moved to Substrate layer for logical consistency
- Key Vault added for deployment output storage (ADR-048)

See: ADR-052: Layer Renaming


Substrate Layer

Resources Deployed:
- Azure Container Registry (ACR)
- AI Foundry Hub
- AI Foundry Project with Private Endpoint

Deployment Frequency: Occasionally (platform changes)

Deployment Time: 5-7 minutes

Bicep Modules:
- infrastructure/bicep/modules/container-registry.bicep
- infrastructure/bicep/modules/ai-foundry.bicep

Scripts:
- Direct: infrastructure/scripts/deploy-substrate.sh
- GitHub: .github/workflows/deploy-substrate.yml

Dependencies: Foundation (requires VNet, managed identities)

Key Changes (ADR-052):
- AI Foundry moved from Foundation to Substrate for logical grouping with ACR
- Both are platform services that applications consume


AI Models Layer

Resources Deployed:
- AI model deployments via Azure AI Foundry
- Model configuration (GPT-4o, GPT-4o-mini, or newer)

Deployment Frequency: As needed (model updates, capacity changes)

Deployment Time: 3-5 minutes

Bicep Modules:
- Deployed via Azure AI Foundry (not Bicep)

Scripts:
- Direct: Manual via Azure AI Foundry portal or script
- GitHub: .github/workflows/deploy-ai-models.yml

Dependencies: Substrate (requires AI Foundry Project)

See: ADR-033: AI Models Deployment Automation


Apps Layer

Resources Deployed:
- Azure Container Instances (ACI)
- Single container group containing:
  - UI container (port 80, public access)
  - API container (port 8000, internal)
  - MCP Application Verification (port 8010, internal)
  - MCP Document Processing (port 8011, internal)
  - MCP Financial Calculations (port 8012, internal)

Deployment Frequency: Very often (every code change)

Deployment Time: 1-2 minutes

Bicep Modules:
- infrastructure/bicep/modules/container-group-aci.bicep

Scripts:
- Direct: infrastructure/scripts/deploy-apps.sh
- GitHub: .github/workflows/deploy-apps.yml

Dependencies: Substrate (requires ACR, AI Foundry), AI Models (requires model deployment)

Key Changes (ADR-049):
- Using ACI instead of Container Apps for simpler deployment
- Single container group with localhost communication between containers
- Faster deployment (1-2 min vs 5-8 min)
- Lower cost (~$95/month vs ~$115/month for Container Apps)

See:
- ADR-049: ACI vs Container Apps
- ADR-047: Layer-Specific RBAC
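
The "Dependencies" entries for the four layers imply a deployment order. The following is a minimal sketch of that ordering, not part of the repository's scripts: the dictionary keys and both function names are hypothetical, and only the dependency edges come from the layer descriptions above.

```python
from graphlib import TopologicalSorter

# Each layer maps to the layers it depends on, taken from the
# "Dependencies" entries of the layer descriptions. Key names are illustrative.
LAYER_DEPENDENCIES = {
    "foundation": set(),
    "substrate": {"foundation"},
    "ai-models": {"substrate"},
    "apps": {"substrate", "ai-models"},
}


def deployment_order(dependencies=LAYER_DEPENDENCIES):
    """Return the layers in an order that satisfies every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


def layers_affected_by(changed_layer, dependencies=LAYER_DEPENDENCIES):
    """Return the changed layer plus every layer downstream of it, in deploy order."""
    order = deployment_order(dependencies)
    affected = {changed_layer}
    # Walking in topological order makes one pass enough to pick up
    # transitive dependents.
    for layer in order:
        if dependencies[layer] & affected:
            affected.add(layer)
    return [layer for layer in order if layer in affected]


if __name__ == "__main__":
    print(deployment_order())
    print(layers_affected_by("ai-models"))
```

Under these assumptions, a change to the AI Models layer only touches that layer and Apps, which is why the frequently changing Apps layer sits last in the chain.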


Dual Deployment Paths

The architecture supports two deployment paths using the same Bicep modules to ensure consistency.

Deployment Path Architecture

graph TB
    subgraph "Infrastructure as Code (Single Source of Truth)"
        Bicep[infrastructure/bicep/modules/*.bicep]
    end

    subgraph "Path 1: Direct Azure Deployment"
        Scripts[Bash/PowerShell Scripts<br/>infrastructure/scripts/*.sh]
        AzCLI[Azure CLI + Bicep CLI]
        Local[Local Machine / Manual]
    end

    subgraph "Path 2: GitHub Actions CI/CD"
        Workflows[GitHub Actions Workflows<br/>.github/workflows/*.yml]
        OIDC[OIDC Authentication]
        Automated[Automated / On-Demand]
    end

    Bicep --> Scripts
    Bicep --> Workflows

    Scripts --> AzCLI --> Local
    Workflows --> OIDC --> Automated

    style Bicep fill:#4CAF50
    style Scripts fill:#2196F3
    style Workflows fill:#FF9800

Path 1: Direct Azure Deployment

Use Cases:
- Quick evaluation and testing
- Development environments
- No GitHub account required
- Manual deployments preferred

Authentication: Azure CLI (az login)

Characteristics:
- ✅ Faster initial setup (no OIDC)
- ✅ Full control via command line
- ✅ No GitHub dependencies
- ❌ Manual updates only
- ❌ No deployment history

Deployment Tools:
- Bash/PowerShell scripts orchestrate Bicep deployments
- Azure CLI for authentication and deployment
- Direct calls to Bicep modules for each layer

See: Direct Azure Deployment Guide for complete step-by-step instructions, script usage, and manual Bicep deployment.


Path 2: GitHub Actions CI/CD

Use Cases:
- Production deployments
- Team collaboration
- Automated deployments
- Multi-environment management

Authentication: OIDC (Service Principal, passwordless)

Characteristics:
- ✅ Automated deployments (optional)
- ✅ Deployment history and rollback
- ✅ OIDC passwordless auth
- ✅ Multi-environment support
- ⚠️ Requires OIDC setup (~15 min)

Deployment Tools:
- GitHub Actions workflows orchestrate deployments
- OIDC for secure, passwordless authentication
- PowerShell scripts called by workflows
- Same Bicep modules as direct deployment

See: GitHub CI/CD Deployment Guide for OIDC setup, workflow configuration, and automated deployment procedures.

Architecture Decision: ADR-042: Dual Deployment Paths


Network Architecture

VNet Design

graph TB
    subgraph "Virtual Network: 10.0.0.0/16"
        subgraph "Infrastructure Subnet: 10.0.0.0/24"
            ACR[Azure Container Registry]
            MI[Managed Identities]
            KV[Key Vault]
        end

        subgraph "Public Subnet: 10.0.1.0/24"
            UI[UI Container App<br/>Public Ingress]
        end

        subgraph "Application Subnet: 10.0.2.0/24"
            API[API Container App<br/>Internal Ingress]
        end

        subgraph "MCP Subnet: 10.0.3.0/24"
            MCP1[MCP: App Verification]
            MCP2[MCP: Doc Processing]
            MCP3[MCP: Calculations]
        end

        subgraph "AI Subnet: 10.0.4.0/24"
            PE[Private Endpoint]
        end

        subgraph "Gateway Subnet: 10.0.5.0/27"
            VPN[VPN Gateway<br/>Optional]
        end
    end

    subgraph "Azure AI Services (Private)"
        AI[Azure AI Services]
    end

    Internet[Internet] -->|HTTPS| UI
    UI -->|Internal| API
    API -->|Internal| MCP1
    API -->|Internal| MCP2
    API -->|Internal| MCP3
    API -->|Private Link| PE
    MCP1 -->|Private Link| PE
    MCP2 -->|Private Link| PE
    MCP3 -->|Private Link| PE
    PE -->|Private| AI
    VPN -.->|Dev Access| Infrastructure

    style UI fill:#4CAF50
    style API fill:#2196F3
    style MCP1 fill:#FF9800
    style MCP2 fill:#FF9800
    style MCP3 fill:#FF9800
    style AI fill:#9C27B0
    style VPN fill:#9E9E9E,stroke-dasharray: 5 5
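
The subnet layout in the diagram above can be sanity-checked with Python's standard `ipaddress` module. This is a sketch only: the dictionary keys and function names are hypothetical, the address ranges are copied from the diagram, and the figure of five reserved addresses per subnet is Azure's documented behaviour rather than something stated in this document.

```python
import ipaddress

VNET = ipaddress.ip_network("10.0.0.0/16")

# Address ranges copied from the VNet design diagram; the keys are illustrative.
SUBNETS = {
    "infrastructure": ipaddress.ip_network("10.0.0.0/24"),
    "public": ipaddress.ip_network("10.0.1.0/24"),
    "application": ipaddress.ip_network("10.0.2.0/24"),
    "mcp": ipaddress.ip_network("10.0.3.0/24"),
    "ai": ipaddress.ip_network("10.0.4.0/24"),
    "gateway": ipaddress.ip_network("10.0.5.0/27"),
}

# Azure reserves five addresses in every subnet (network, gateway,
# two for DNS mapping, and broadcast).
AZURE_RESERVED_PER_SUBNET = 5


def validate_layout(vnet, subnets):
    """Return a list of human-readable problems; an empty list means the layout is consistent."""
    problems = []
    names = list(subnets)
    for name in names:
        if not subnets[name].subnet_of(vnet):
            problems.append(f"{name} ({subnets[name]}) is outside {vnet}")
    for index, first in enumerate(names):
        for second in names[index + 1:]:
            if subnets[first].overlaps(subnets[second]):
                problems.append(f"{first} overlaps {second}")
    return problems


def usable_addresses(subnet):
    """Addresses left for workloads after Azure's per-subnet reservation."""
    return subnet.num_addresses - AZURE_RESERVED_PER_SUBNET


if __name__ == "__main__":
    print(validate_layout(VNET, SUBNETS) or "layout OK")
    for name, subnet in SUBNETS.items():
        print(f"{name}: {subnet} -> {usable_addresses(subnet)} usable")
```

The /24 subnets leave 251 usable addresses each, and the /27 gateway subnet leaves 27.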

Network Security Groups (NSGs)

Public Subnet NSG (UI):

Inbound Rules:
- Allow HTTPS (443) from Internet
- Allow HTTP (80) from Internet (redirect to HTTPS)
- Deny all other inbound

Outbound Rules:
- Allow to Application Subnet (API)
- Allow to Azure backbone (monitoring)
- Deny all other outbound

Application Subnet NSG (API):

Inbound Rules:
- Allow HTTPS from Public Subnet (UI)
- Deny all other inbound

Outbound Rules:
- Allow to MCP Subnet
- Allow to AI Subnet (private endpoint)
- Allow to Azure backbone (monitoring)
- Deny all other outbound

MCP Subnet NSG:

Inbound Rules:
- Allow HTTPS from Application Subnet (API)
- Deny all other inbound

Outbound Rules:
- Allow to AI Subnet (private endpoint)
- Allow to Azure backbone (monitoring)
- Deny all other outbound

AI Subnet NSG:

Inbound Rules:
- Allow from Application Subnet
- Allow from MCP Subnet
- Deny all other inbound

Outbound Rules:
- Deny all (private endpoint only)
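
The rule sets above can be read as a small reachability model: a connection is possible only if the source subnet's outbound rules and the destination subnet's inbound rules both allow it. The sketch below encodes that reading. It is a simplification that ignores ports, rule priorities, and service tags; the zone names and the `is_allowed` helper are hypothetical.

```python
# Who may initiate connections INTO each subnet (from the inbound rules above).
INBOUND = {
    "public": {"internet"},
    "application": {"public"},
    "mcp": {"application"},
    "ai": {"application", "mcp"},
}

# Where each subnet may initiate connections TO (from the outbound rules above).
OUTBOUND = {
    "public": {"application", "azure-backbone"},
    "application": {"mcp", "ai", "azure-backbone"},
    "mcp": {"ai", "azure-backbone"},
    "ai": set(),  # "Deny all (private endpoint only)"
}


def is_allowed(source, destination):
    """True if the source may open a connection to the destination.

    Zones without an entry (for example "internet" as a source or
    "azure-backbone" as a destination) are treated as unrestricted on that
    side, because this document defines no NSG for them. NSGs are stateful,
    so only connection initiation is modelled here, not reply traffic.
    """
    outbound_ok = source not in OUTBOUND or destination in OUTBOUND[source]
    inbound_ok = destination not in INBOUND or source in INBOUND[destination]
    return outbound_ok and inbound_ok


if __name__ == "__main__":
    for pair in [("internet", "public"), ("public", "mcp"), ("mcp", "ai")]:
        print(pair, is_allowed(*pair))
```

In this model the UI subnet cannot reach the MCP servers or the AI endpoint directly; every such call has to go through the API subnet.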


Security Architecture

Zero Trust Security Model

graph TB
    subgraph "Security Layers"
        Identity[Layer 1: Identity<br/>Managed Identities + RBAC]
        Network[Layer 2: Network<br/>Private VNet + NSGs]
        Application[Layer 3: Application<br/>Internal-only APIs]
        Data[Layer 4: Data<br/>Encryption in transit + at rest]
    end

    subgraph "Authentication Flow"
        Request[Incoming Request]
        WAF[Web Application Firewall<br/>Optional]
        TLS[TLS Termination]
        Auth[Entra ID Auth<br/>Optional for MCP]
        MI_Auth[Managed Identity Auth]
        Service[Service Execution]
    end

    Request --> WAF
    WAF --> TLS
    TLS --> Auth
    Auth --> MI_Auth
    MI_Auth --> Service

    Identity --> MI_Auth
    Network --> TLS
    Application --> Auth
    Data --> Service

    style Identity fill:#4CAF50
    style Network fill:#2196F3
    style Application fill:#FF9800
    style Data fill:#9C27B0

Security Features

Identity & Access Management:
- ✅ Managed Identities: No passwords or secrets in code
- ✅ RBAC: Least-privilege access control
- ✅ Key Vault: Secure secret storage
- ✅ Entra ID (Azure AD): Optional OAuth2 for MCP servers

Network Security:
- ✅ Private VNet: Isolated network (10.0.0.0/16)
- ✅ NSGs: Firewall rules per subnet
- ✅ Private Endpoints: AI Services not exposed to internet
- ✅ Internal Ingress: API and MCP servers internal-only
- ✅ VPN Gateway: Optional secure developer access

Application Security:
- ✅ HTTPS Only: TLS 1.2+ for all communication
- ✅ No Public APIs: Only UI is public-facing
- ✅ Health Checks: Automatic restart on failures
- ✅ Container Scanning: ACR vulnerability scanning

Data Protection:
- ✅ Encryption in Transit: HTTPS/TLS everywhere
- ✅ Encryption at Rest: Azure Storage encryption
- ✅ Private AI Access: No public internet for AI services
- ✅ Audit Logging: Application Insights telemetry

See: MCP Deployment Security


ACI Container Architecture

Container Group Design

ACI Container Group - Single deployment unit containing all 5 containers with localhost communication:

graph TB
    subgraph "ACI Container Group"
        UI[UI Container<br/>Port 80]
        API[API Container<br/>Port 8000]
        MCP1[MCP Verification<br/>Port 8010]
        MCP2[MCP Documents<br/>Port 8011]
        MCP3[MCP Financial<br/>Port 8012]
    end

    Internet[Internet] -->|HTTPS| UI
    UI -->|localhost| API
    API -->|localhost| MCP1
    API -->|localhost| MCP2
    API -->|localhost| MCP3

    API -->|Private Endpoint| AIFoundry[AI Foundry]
    MCP1 -->|Private Endpoint| AIFoundry
    MCP2 -->|Private Endpoint| AIFoundry
    MCP3 -->|Private Endpoint| AIFoundry

    style UI fill:#4CAF50
    style API fill:#2196F3
    style MCP1 fill:#FF9800
    style MCP2 fill:#FF9800
    style MCP3 fill:#FF9800
    style AIFoundry fill:#9C27B0

Key Benefits (ADR-049):
- Single container group = simpler deployment
- Localhost communication = faster, no network overhead
- VNet integration for private endpoint access
- Faster startup compared to Container Apps
- Lower cost (~$95/month vs ~$115/month for Container Apps)

Container Configuration

ACI Container Group:

Resource: Azure Container Instances (ACI)
Container Group: ldfdev-aci
Network Profile: VNet integrated
DNS Name: ldfdev-aci.eastus.azurecontainer.io

Containers:
  ui:
    Image: ACR/loan-defenders-ui:latest
    CPU: 0.5 cores (dev) | 1-2 cores (prod)
    Memory: 1 GB (dev) | 2-4 GB (prod)
    Ports: [80]
    Environment Variables:
      - REACT_APP_API_URL: http://localhost:8000

  api:
    Image: ACR/loan-defenders-api:latest
    CPU: 0.5 cores (dev) | 1-2 cores (prod)
    Memory: 1 GB (dev) | 2-4 GB (prod)
    Ports: [8000]
    Environment Variables:
      - MCP_APP_VERIFICATION_URL: http://localhost:8010
      - MCP_DOC_PROCESSING_URL: http://localhost:8011
      - MCP_FINANCIAL_CALC_URL: http://localhost:8012
      - AZURE_OPENAI_ENDPOINT: (from managed identity)

  mcp-app-verification:
    Image: ACR/mcp-application-verification:latest
    CPU: 0.25 cores (dev) | 0.5-1 cores (prod)
    Memory: 0.5 GB (dev) | 1-2 GB (prod)
    Ports: [8010]
    Environment Variables:
      - AZURE_OPENAI_ENDPOINT: (from managed identity)
      - MCP_SERVER_NAME: application-verification

  mcp-document-processing:
    Image: ACR/mcp-document-processing:latest
    CPU: 0.25 cores (dev) | 0.5-1 cores (prod)
    Memory: 0.5 GB (dev) | 1-2 GB (prod)
    Ports: [8011]
    Environment Variables:
      - AZURE_OPENAI_ENDPOINT: (from managed identity)
      - MCP_SERVER_NAME: document-processing

  mcp-financial-calculations:
    Image: ACR/mcp-financial-calculations:latest
    CPU: 0.25 cores (dev) | 0.5-1 cores (prod)
    Memory: 0.5 GB (dev) | 1-2 GB (prod)
    Ports: [8012]
    Environment Variables:
      - AZURE_OPENAI_ENDPOINT: (from managed identity)
      - MCP_SERVER_NAME: financial-calculations

Total Resources (Dev):
  CPU: 1.75 cores
  Memory: 3.5 GB
  Cost: ~$95/month

Total Resources (Prod):
  CPU: 3.5-7 cores
  Memory: 7-14 GB
  Cost: ~$300-450/month

See: ADR-049: Container Deployment - ACI vs Container Apps
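
Because all five containers share one container group and talk over localhost, each needs its own port, and the group's resource request is the sum of the per-container requests. The sketch below checks both properties for the dev sizing listed above. The `Container` dataclass and both helper functions are hypothetical; the names, ports, and sizes are copied from the configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Container:
    name: str
    port: int
    cpu_cores: float
    memory_gb: float


# Dev sizing copied from the container configuration above.
DEV_CONTAINERS = [
    Container("ui", 80, 0.5, 1.0),
    Container("api", 8000, 0.5, 1.0),
    Container("mcp-app-verification", 8010, 0.25, 0.5),
    Container("mcp-document-processing", 8011, 0.25, 0.5),
    Container("mcp-financial-calculations", 8012, 0.25, 0.5),
]


def group_totals(containers):
    """Sum CPU and memory requests across the container group."""
    cpu = sum(c.cpu_cores for c in containers)
    memory = sum(c.memory_gb for c in containers)
    return cpu, memory


def port_conflicts(containers):
    """Return (first_owner, second_owner, port) for every port claimed twice.

    Containers in one group share a network namespace, so two containers
    cannot listen on the same localhost port.
    """
    owners = {}
    conflicts = []
    for c in containers:
        if c.port in owners:
            conflicts.append((owners[c.port], c.name, c.port))
        else:
            owners[c.port] = c.name
    return conflicts


if __name__ == "__main__":
    cpu, memory = group_totals(DEV_CONTAINERS)
    print(f"dev group requests {cpu} cores and {memory} GB")
    print("port conflicts:", port_conflicts(DEV_CONTAINERS))
```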


Monitoring & Observability

Monitoring Architecture

graph TB
    subgraph "Application Telemetry"
        UI[UI Telemetry]
        API[API Telemetry]
        MCP1[MCP 1 Telemetry]
        MCP2[MCP 2 Telemetry]
        MCP3[MCP 3 Telemetry]
    end

    subgraph "Azure Monitor"
        AppInsights[Application Insights]
        LogAnalytics[Log Analytics Workspace]
        Metrics[Azure Metrics]
    end

    subgraph "Visualization & Alerts"
        Dashboards[Azure Dashboards]
        Workbooks[Azure Workbooks]
        Alerts[Alert Rules]
        Email[Email Notifications]
    end

    UI --> AppInsights
    API --> AppInsights
    MCP1 --> AppInsights
    MCP2 --> AppInsights
    MCP3 --> AppInsights

    AppInsights --> LogAnalytics
    AppInsights --> Metrics

    LogAnalytics --> Dashboards
    Metrics --> Dashboards
    LogAnalytics --> Workbooks
    Metrics --> Alerts
    Alerts --> Email

    style AppInsights fill:#4CAF50
    style LogAnalytics fill:#2196F3
    style Dashboards fill:#FF9800

Key Metrics Tracked

Performance Metrics:
- Request duration (P50, P95, P99)
- Request rate (requests/second)
- Error rate (percentage)
- CPU and memory usage per container
- Container start time

Business Metrics:
- Loan applications processed
- Agent processing time
- MCP tool call latency
- AI model response time
- Decision approval rate

Security Metrics:
- Failed authentication attempts
- Unauthorized access attempts
- SSL/TLS version compliance
- Managed identity usage

Availability Metrics:
- Container health check status
- Replica count per service
- Auto-scaling events
- Container restart events

See: Monitoring Setup
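
For readers unfamiliar with the P50/P95/P99 notation used in the performance metrics, here is a minimal sketch of how such figures can be derived from raw request durations. It uses the nearest-rank definition, which is one of several common conventions and may differ from what Application Insights computes. The function names are hypothetical, and counting only 5xx responses as errors is an assumption.

```python
def percentile(values, p):
    """Nearest-rank percentile for an integer p in 1..100.

    Returns the smallest sample such that at least p percent of all
    samples are less than or equal to it.
    """
    if not values:
        raise ValueError("percentile of an empty sample is undefined")
    if not isinstance(p, int) or not 1 <= p <= 100:
        raise ValueError("p must be an integer between 1 and 100")
    ordered = sorted(values)
    # Integer ceiling of p * n / 100, avoiding floating-point rounding.
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]


def error_rate(status_codes):
    """Percentage of responses with a 5xx status (assumed definition of 'error')."""
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if code >= 500)
    return 100.0 * errors / len(status_codes)


if __name__ == "__main__":
    durations_ms = list(range(1, 101))  # synthetic sample: 1 ms .. 100 ms
    for p in (50, 95, 99):
        print(f"P{p} = {percentile(durations_ms, p)} ms")
    print(f"error rate = {error_rate([200, 200, 500, 404])}%")
```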


Deployment Process Flow

Full Deployment Flow (First Time)

sequenceDiagram
    participant Dev as Developer
    participant Azure as Azure Cloud
    participant ACR as Container Registry
    participant Apps as ACI Container Group

    Note over Dev,Apps: Layer 1: Foundation (15-30 min)
    Dev->>Azure: Deploy networking.bicep
    Dev->>Azure: Deploy security.bicep
    Dev->>Azure: Deploy ai-services.bicep
    Azure-->>Dev: VNet, NSGs, AI Services ready

    Note over Dev,Apps: Layer 2: Platform (5-10 min)
    Dev->>Azure: Deploy container-platform.bicep
    Azure-->>Dev: ACR + Environment ready

    Note over Dev,Apps: Layer 3a: MCP Servers (5-8 min)
    Dev->>ACR: Build & push MCP images
    Dev->>Apps: Deploy container-apps-mcp-servers.bicep
    Apps-->>Dev: MCP Servers running

    Note over Dev,Apps: Layer 3b: UI + API (5-8 min)
    Dev->>ACR: Build & push UI/API images
    Dev->>Apps: Deploy container-app-ui.bicep
    Dev->>Apps: Deploy container-app-api.bicep
    Apps-->>Dev: UI + API running

    Note over Dev,Apps: Layer 4: AI Models (5-10 min)
    Dev->>Azure: Deploy AI models via AI Foundry
    Azure-->>Dev: Models ready

    Note over Dev,Apps: Total: 35-66 minutes

Incremental Update Flow (Applications Only)

sequenceDiagram
    participant Dev as Developer
    participant ACR as Container Registry
    participant Apps as ACI Container Group

    Note over Dev,Apps: Update Application Containers (3-5 min)
    Dev->>ACR: Build & push updated images
    Dev->>Apps: Deploy container-group-aci.bicep
    Apps-->>Dev: Updated ACI container group running

    Note over Dev,Apps: 90% faster than full deployment!
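
The time saving can be checked by summing the per-phase estimates of the full flow and comparing them with the incremental update. The sketch below does that arithmetic. The phase durations are copied from the two sequence diagrams; the variable and function names are hypothetical.

```python
# (min, max) minutes per phase, copied from the full deployment flow.
FULL_DEPLOYMENT_MINUTES = {
    "Layer 1: Foundation": (15, 30),
    "Layer 2: Platform": (5, 10),
    "Layer 3a: MCP Servers": (5, 8),
    "Layer 3b: UI + API": (5, 8),
    "Layer 4: AI Models": (5, 10),
}

# (min, max) minutes for the applications-only update flow.
INCREMENTAL_UPDATE_MINUTES = (3, 5)


def total_range(phases):
    """Sum the lower and upper bounds of every phase."""
    low = sum(bounds[0] for bounds in phases.values())
    high = sum(bounds[1] for bounds in phases.values())
    return low, high


def time_saved_percent(update, full):
    """Return (worst_case, best_case) percentage of time saved by an update.

    Worst case pairs the slowest update with the fastest full deployment;
    best case pairs the fastest update with the slowest full deployment.
    """
    worst = (1 - update[1] / full[0]) * 100
    best = (1 - update[0] / full[1]) * 100
    return worst, best


if __name__ == "__main__":
    full = total_range(FULL_DEPLOYMENT_MINUTES)
    worst, best = time_saved_percent(INCREMENTAL_UPDATE_MINUTES, full)
    print(f"full deployment: {full[0]}-{full[1]} min")
    print(f"time saved by incremental update: {worst:.0f}%-{best:.0f}%")
```

With these estimates the saving lands between roughly 86% and 95%, which brackets the "90% faster" figure.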

Cost Analysis

Development Environment Cost Breakdown

| Resource | Configuration | Monthly Cost | Annual Cost |
|---|---|---|---|
| ACI Container Group (5 containers) | 1.75 CPU, 3.5 GB total | $95-120 | $1,140-1,440 |
| Azure OpenAI | Usage-based (varies by model) | $20-100 | $240-1,200 |
| Application Insights | Standard tier | $10-15 | $120-180 |
| Log Analytics | Pay-as-you-go (5 GB/month) | $5-10 | $60-120 |
| Virtual Network | Standard VNet + NSGs | Minimal (~$5) | ~$60 |
| Private Endpoints | 2 endpoints (AI Foundry, ACR) | $15 | $180 |
| Container Registry | Basic tier | $5 | $60 |
| Storage Account | Standard LRS (minimal) | $1-3 | $12-36 |
| Azure Bastion | Standard SKU | $140 | $1,680 |
| Jump Box VM | Standard_D2s_v3 (can stop when not in use) | $70 | $840 |
| Total (dev environment) | | $366-483/month | $4,392-5,796/year |

Cost Optimization Tips:
- Stop Jump Box VM when not in use (~$70/month savings)
- Delete Bastion during extended breaks (~$140/month savings, redeploys in 10 min)
- Use ACI instead of Container Apps (saves ~$20/month)
- Delete environment when not in use (stop all billing)
- Monitor usage with Azure Cost Management
- Use reserved instances for production (30-70% savings)

Bastion vs VPN Gateway:
- Bastion + VM: ~$210/month
- VPN Gateway: ~$360/month
- Savings: ~$150/month (ADR-050)
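
The dev totals are the sum of the line items, and the optimization tips amount to dropping items from that sum. The sketch below recomputes the monthly and annual ranges and the Bastion versus VPN Gateway saving. The figures are copied from the development cost breakdown; the dictionary and function names are hypothetical, and the estimates themselves are the document's, not current Azure prices.

```python
# (low, high) monthly cost in USD, copied from the development cost breakdown.
DEV_MONTHLY_COSTS = {
    "ACI Container Group": (95, 120),
    "Azure OpenAI": (20, 100),
    "Application Insights": (10, 15),
    "Log Analytics": (5, 10),
    "Virtual Network": (5, 5),
    "Private Endpoints": (15, 15),
    "Container Registry": (5, 5),
    "Storage Account": (1, 3),
    "Azure Bastion": (140, 140),
    "Jump Box VM": (70, 70),
}

VPN_GATEWAY_MONTHLY = 360  # figure quoted for the replaced VPN Gateway


def monthly_total(costs, excluded=()):
    """Sum the low and high monthly estimates, skipping any excluded resources."""
    kept = [bounds for name, bounds in costs.items() if name not in excluded]
    return sum(b[0] for b in kept), sum(b[1] for b in kept)


def annual(monthly_range):
    """Convert a (low, high) monthly range into an annual range."""
    return monthly_range[0] * 12, monthly_range[1] * 12


if __name__ == "__main__":
    total = monthly_total(DEV_MONTHLY_COSTS)
    print("monthly:", total, "annual:", annual(total))

    # Effect of stopping the Jump Box VM and deleting Bastion during a break.
    idle = monthly_total(DEV_MONTHLY_COSTS, excluded=("Azure Bastion", "Jump Box VM"))
    print("monthly without Bastion and Jump Box:", idle)

    bastion_and_vm = (
        DEV_MONTHLY_COSTS["Azure Bastion"][0] + DEV_MONTHLY_COSTS["Jump Box VM"][0]
    )
    print("saving vs VPN Gateway:", VPN_GATEWAY_MONTHLY - bastion_and_vm)
```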

Production Environment Cost Estimate

| Resource | Configuration | Monthly Cost | Annual Cost |
|---|---|---|---|
| ACI Container Group (5 containers) | 2-4 CPU, 4-8 GB, HA configuration | $300-450 | $3,600-5,400 |
| Azure OpenAI | Higher usage | $200-500+ | $2,400-6,000+ |
| Application Insights | Enterprise tier + retention | $30-50 | $360-600 |
| Log Analytics | 30 GB/month | $20-40 | $240-480 |
| Virtual Network | Standard VNet + NSGs | Minimal (~$5) | ~$60 |
| Private Endpoints | 2-3 endpoints | $15-22.50 | $180-270 |
| Container Registry | Standard tier | $20 | $240 |
| Storage Account | Standard LRS | $5-10 | $60-120 |
| Azure Bastion | Standard SKU | $140 | $1,680 |
| Jump Box VM | Standard_D2s_v3 (for break-glass access) | $70 | $840 |
| Load Balancer | Standard SKU (optional) | $18-40 | $216-480 |
| Total Estimated | | $823-1,347.50/month | $9,876-16,170/year |

Note: Production costs vary significantly based on:
- Traffic volume (more requests = more AI usage)
- Container resource allocation and scaling
- Log retention policy
- Backup and disaster recovery requirements


Infrastructure as Code (IaC)

Bicep Module Structure

infrastructure/bicep/
├── foundation.bicep                    # Foundation layer orchestrator
├── substrate.bicep                     # Substrate layer orchestrator
├── ai-models.bicep                     # AI Models layer orchestrator
├── apps.bicep                          # Apps layer orchestrator
├── modules/
│   ├── networking.bicep                # Foundation: VNet, NSGs, subnets
│   ├── security.bicep                  # Foundation: Managed Identity, Key Vault
│   ├── bastion-vm.bicep                # Foundation: Bastion + Jump Box VM
│   ├── container-registry.bicep        # Substrate: ACR
│   ├── ai-foundry.bicep                # Substrate: AI Foundry Hub + Project
│   ├── ai-model-deployment.bicep       # AI Models: Model deployments
│   └── container-group-aci.bicep       # Apps: ACI container group
└── environments/
    ├── dev-foundation.parameters.json
    ├── dev-substrate.parameters.json
    ├── dev-ai-models.parameters.json
    ├── dev-apps.parameters.json
    ├── staging-*.parameters.json
    └── prod-*.parameters.json
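
The tree above suggests a naming convention: one orchestrator template per layer and one parameter file per environment and layer. The sketch below resolves both paths and assembles an Azure CLI command from them. It makes several assumptions: the `*` in the staging and prod file names is taken to stand for the layer name, deployments are assumed to be resource-group scoped, the resource group name `example-rg` is a placeholder, and the function names are hypothetical. The repository's own scripts may invoke the CLI differently.

```python
from pathlib import PurePosixPath

BICEP_ROOT = PurePosixPath("infrastructure/bicep")
LAYERS = ("foundation", "substrate", "ai-models", "apps")
ENVIRONMENTS = ("dev", "staging", "prod")


def layer_files(environment, layer):
    """Return (template_path, parameter_path) for one environment and layer."""
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment}")
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer}")
    template = BICEP_ROOT / f"{layer}.bicep"
    parameters = BICEP_ROOT / "environments" / f"{environment}-{layer}.parameters.json"
    return template, parameters


def az_deploy_command(environment, layer, resource_group):
    """Build (but do not run) an 'az deployment group create' command line."""
    template, parameters = layer_files(environment, layer)
    return [
        "az", "deployment", "group", "create",
        "--resource-group", resource_group,
        "--template-file", str(template),
        "--parameters", f"@{parameters}",
    ]


if __name__ == "__main__":
    # "example-rg" is a placeholder resource group name.
    print(" ".join(az_deploy_command("dev", "apps", "example-rg")))
```

Keeping path resolution in one place like this is one way to make sure the direct scripts and the GitHub workflows point at the same template and parameter files.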

Bicep Module Dependencies

graph TB
    Network[networking.bicep]
    Security[security.bicep]
    Bastion[bastion-vm.bicep]
    ACR[container-registry.bicep]
    AIFoundry[ai-foundry.bicep]
    AIModels[ai-model-deployment.bicep]
    ACI[container-group-aci.bicep]

    Network --> Bastion
    Network --> ACR
    Network --> AIFoundry
    Security --> ACR
    Security --> ACI
    ACR --> ACI
    AIFoundry --> AIModels
    AIModels --> ACI

    style Network fill:#E3F2FD
    style Security fill:#F3E5F5
    style Bastion fill:#FFF3E0
    style ACR fill:#E8F5E9
    style AIFoundry fill:#FCE4EC
    style AIModels fill:#FFF9C4
    style ACI fill:#C8E6C9
**Key Principles:**
- ✅ Bicep is the single source of truth (not scripts)
- ✅ Modules are reusable across deployment paths
- ✅ Parameters control environment-specific configuration
- ✅ Scripts orchestrate module deployment, not define infrastructure
- ✅ No duplication between direct and CI/CD paths

**See:** [ADR-021: Azure Verified Modules Adoption](../architecture/decisions/adr-021-azure-verified-modules-adoption.md)

---

## Environments

### Environment Configuration

The deployment supports multiple environments with different configurations:

**Development (dev):**
- Smaller resources (0.25-0.5 CPU, 0.5-1 GB)
- Single replicas
- No VPN Gateway by default (optional)
- Cost-optimized
- Fast iteration

**Staging (staging):**
- Medium resources (0.5-1 CPU, 1-2 GB)
- 1-3 replicas
- VPN Gateway included
- Production-like testing
- Pre-production validation

**Production (prod):**
- Large resources (1-2 CPU, 2-4 GB)
- 2-10 replicas
- VPN Gateway included
- High availability
- Full monitoring and alerting

### Environment Parameters

Environment-specific configuration is managed via Bicep parameter files:

```bicep
// infrastructure/bicep/parameters/dev.bicepparam
using '../main-avm.bicep'

param environment = 'dev'
param location = 'eastus'
param vnetAddressPrefix = '10.0.0.0/16'
param containerAppCpu = '0.5'
param containerAppMemory = '1Gi'
param minReplicas = 1
param maxReplicas = 3
param deployVpnGateway = false  // Cost optimization
```

See: ADR-037: Environment-Based Configuration


Deployment Guides

Architecture Details

Operational Guides

Architecture Decisions


Next Steps:
- Choose Your Deployment Path
- Deploy Directly to Azure
- Deploy with GitHub CI/CD