4-Layer Deployment Architecture - "The Layered Cake"

📝 Updated (2025-10-24): This guide now uses semantic layer names following ADR-052:
- Foundation (formerly Layer 1)
- Substrate (formerly Layer 2)
- AI Models (formerly Layer 3)
- Apps (formerly Layer 4)

Visual Concept: Each layer builds upon the previous, like a layered cake. Each layer has a specific purpose, deployment frequency, and time investment.

The Layered Cake Visualization

                    🍰 4-Layer Deployment Cake 🍰

    ╔═══════════════════════════════════════════════════════════╗
    ║                    🎂 APPS LAYER                          ║
    ║                     (1-2 minutes)                         ║
    ║  ┌─────────────────────────────────────────────────────┐  ║
    ║  │  ACI Container Group (Single Unit)                  │  ║
    ║  │  ├─ UI (port 80) - Public                           │  ║
    ║  │  ├─ API (port 8000) - Internal (localhost)          │  ║
    ║  │  ├─ MCP Verification (port 8010) - Internal         │  ║
    ║  │  ├─ MCP Documents (port 8011) - Internal            │  ║
    ║  │  └─ MCP Financial (port 8012) - Internal            │  ║
    ║  └─────────────────────────────────────────────────────┘  ║
    ║  Changes: VERY FREQUENTLY (daily/hourly) - CODE          ║
    ╚═══════════════════════════════════════════════════════════╝
                              ▲
                              │ Requires
                              │
    ╔═══════════════════════════════════════════════════════════╗
    ║               🧁 AI MODELS LAYER                          ║
    ║                   (3-5 minutes)                           ║
    ║  ┌──────────────────────────────────────────────────────┐║
    ║  │   gpt-4o         │  gpt-4o-mini   │   future models │║
    ║  │ 10K TPM          │  20K TPM       │   configurable  │║
    ║  │ GlobalStandard   │ Standard       │ GlobalStandard  │║
    ║  └──────────────────────────────────────────────────────┘║
    ║  Changes: AS NEEDED (weekly/monthly) - MODEL CONFIG      ║
    ╚═══════════════════════════════════════════════════════════╝
                              ▲
                              │ Requires
                              │
    ╔═══════════════════════════════════════════════════════════╗
    ║              🍪 SUBSTRATE LAYER                           ║
    ║                  (5-7 minutes)                            ║
    ║  ┌──────────────────────────────────────────────────────┐║
    ║  │  Azure Container Registry (ACR)                       │║
    ║  │  ├─ loan-defenders-ui:latest                          │║
    ║  │  ├─ loan-defenders-api:latest                         │║
    ║  │  └─ mcp-*:latest (3 servers)                          │║
    ║  │                                                        │║
    ║  │  AI Foundry Hub & Project                             │║
    ║  │  ├─ Hub: ldfdev-ai-foundry-hub                        │║
    ║  │  ├─ Project: ldfdev-ai-foundry-project                │║
    ║  │  └─ Private Endpoint (VNet integrated)                │║
    ║  └──────────────────────────────────────────────────────┘║
    ║  Changes: OCCASIONALLY (monthly) - PLATFORM              ║
    ╚═══════════════════════════════════════════════════════════╝
                              ▲
                              │ Requires
                              │
    ╔═══════════════════════════════════════════════════════════╗
    ║            🎂 FOUNDATION LAYER                            ║
    ║                  (10-15 minutes)                          ║
    ║  ┌──────────────────────────────────────────────────────┐║
    ║  │  Networking                                           │║
    ║  │  ├─ Virtual Network (ldfdev-vnet)                     │║
    ║  │  ├─ Subnets (compute, data, bastion)                  │║
    ║  │  └─ NSG Rules                                         │║
    ║  │                                                        │║
    ║  │  Developer Access (replaces VPN Gateway)              │║
    ║  │  ├─ Azure Bastion (browser-based RDP)                 │║
    ║  │  └─ Windows Server 2022 Jump Box VM                   │║
    ║  │                                                        │║
    ║  │  Security & Identity                                  │║
    ║  │  ├─ Key Vault (for deployment outputs)                │║
    ║  │  ├─ User-Assigned Managed Identity (shared)           │║
    ║  │  └─ RBAC Assignments                                  │║
    ║  │                                                        │║
    ║  │  Monitoring                                           │║
    ║  │  ├─ Log Analytics (ldfdev-logs)                       │║
    ║  │  └─ Application Insights (ldfdev-insights)            │║
    ║  └──────────────────────────────────────────────────────┘║
    ║  Changes: RARELY (quarterly/yearly) - INFRASTRUCTURE     ║
    ╚═══════════════════════════════════════════════════════════╝

Layer Characteristics

Layer	Name	Time	Frequency	What It Deploys	Why Separate?
Foundation	Core Infrastructure	10-15 min	Rarely (months)	VNet, Bastion + Jump Box VM, Security, Key Vault	Foundational infrastructure changes infrequently
Substrate	Platform Services	5-7 min	Occasionally (weeks)	ACR, AI Foundry Hub & Project	Platform services configuration changes periodically
AI Models	Model Deployments	3-5 min	As needed (days/weeks)	Model deployments, capacity	Models change for testing and optimization
Apps	Applications	1-2 min	Very Frequently (hours/days)	ACI Container Group (all 5 containers)	Code changes constantly during development

The Cake Analogy

🎂 Foundation Layer: The Cake Base

Like: The bottom layer and plate - provides stable foundation
Characteristics: Heavy, time-consuming to prepare, rarely changed
Azure: VNet, Bastion + Jump Box VM, Key Vault, Managed Identity
Time: 10-15 minutes (like baking a cake from scratch)
Change Frequency: Rarely - infrastructure is stable foundation
Key Change (ADR-050): Bastion replaces VPN Gateway (~$150/month savings)

🍪 Substrate Layer: The Frosting Layer

Like: Frosting between cake layers - provides smooth surface
Characteristics: Medium effort, provides platform for toppings
Azure: ACR (image storage), AI Foundry Hub & Project
Time: 5-7 minutes (like making frosting)
Change Frequency: Occasionally - platform updates needed sometimes
Key Change (ADR-052): AI Foundry moved here from Foundation for logical consistency

🧁 AI Models Layer: The Filling

Like: Fruit filling or cream - adds flavor variety
Characteristics: Quick to swap, experiment with different flavors
Azure: AI model deployments (GPT-4o, GPT-4o-mini, etc.)
Time: 3-5 minutes (like adding/changing filling)
Change Frequency: As needed - test different models for cost/quality

🍰 Apps Layer: The Decorations

Like: Cake decorations and toppings - frequently changed for occasions
Characteristics: Quickest to modify, most visible to users
Azure: ACI Container Group (UI, API, 3 MCP servers in single unit)
Time: 1-2 minutes (like adding new decorations)
Change Frequency: Very Frequently - code changes daily
Key Change (ADR-049): Using ACI for simpler deployment, localhost communication

Deployment Flow: Building the Cake

Initial Deployment (First Time)

# Step 1: Bake the cake base (Foundation)
./deploy-foundation.sh dev

# Step 2: Add the frosting (Substrate)
./deploy-substrate.sh dev

# Step 3: Add the filling (AI Models)
./deploy-ai-models.sh dev

# Step 4: Add the decorations (Apps)
./deploy-apps.sh dev

# Total: 20-30 minutes for complete deployment
# 🎂 Cake is ready to serve!
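
The four steps above could be wrapped in a small helper that runs the layers bottom-up and stops at the first failure, so a broken layer never has later layers stacked on top of it. This is only a sketch: `deploy_all`, `DRY_RUN`, and `SCRIPT_DIR` are hypothetical names introduced here, not part of the existing scripts.

```shell
#!/usr/bin/env bash
# Sketch: deploy every layer in dependency order, stopping on the first failure.
# DRY_RUN and SCRIPT_DIR are hypothetical knobs added for illustration.

LAYERS="foundation substrate ai-models apps"

deploy_all() {
    local env_name="${1:-dev}"
    local script_dir="${SCRIPT_DIR:-.}"
    local layer script
    for layer in $LAYERS; do
        script="$script_dir/deploy-$layer.sh"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Print the command instead of running it.
            echo "$script $env_name"
            continue
        fi
        echo "==> Deploying $layer layer ($env_name)"
        if ! "$script" "$env_name"; then
            echo "Layer '$layer' failed; skipping the layers above it." >&2
            return 1
        fi
    done
}
```

Running `DRY_RUN=1 deploy_all dev` prints the four commands in order without touching Azure, which is a cheap way to check the ordering before a real run.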

Daily Development (Code Changes)

# Change only Apps layer (decorations)
# No need to rebuild the entire cake!

./deploy-apps.sh dev

# 15x faster than redeploying everything!
# Before: 25+ minutes (all layers)
# After: 1-2 minutes (Apps only)

Model Testing (Cost Optimization)

# Change only AI Models layer (filling)
# Base cake (Foundation, Substrate) stays intact

# Test different model
./deploy-ai-models.sh dev  # Add gpt-4o-mini

# Switch apps to use new model
# Edit: dev-apps.parameters.json → "gpt-4o-mini"
./deploy-apps.sh dev

# Total: 5-7 minutes to test model change
# Can test 5 different models in 35 minutes!

Benefits of Layered Cake Architecture

1. Speed: Deploy Only What Changed

Full Cake (4-layer):     20-30 minutes
Decorations Only (Apps): 1-2 minutes   ⚡ 15x faster!
Filling Change (Models+Apps): 5-7 minutes   ⚡ 4x faster!
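
The figures above are sums of the per-layer time ranges. A minimal sketch of that arithmetic follows; `layer_minutes` and `deploy_time_range` are hypothetical helper names, and the minute values are copied from the layer table.

```shell
#!/usr/bin/env bash
# Per-layer deployment time ranges in minutes ("min max"), from the layer table.
layer_minutes() {
    case "$1" in
        foundation) echo "10 15" ;;
        substrate)  echo "5 7" ;;
        ai-models)  echo "3 5" ;;
        apps)       echo "1 2" ;;
        *) echo "unknown layer: $1" >&2; return 1 ;;
    esac
}

# Sum the minimum and maximum durations of the layers passed as arguments.
deploy_time_range() {
    local min_total=0 max_total=0 range lo hi layer
    for layer in "$@"; do
        range="$(layer_minutes "$layer")" || return 1
        lo="${range% *}"   # text before the space
        hi="${range#* }"   # text after the space
        min_total=$((min_total + lo))
        max_total=$((max_total + hi))
    done
    echo "${min_total}-${max_total} min"
}

deploy_time_range foundation substrate ai-models apps   # prints "19-29 min"
deploy_time_range ai-models apps                        # prints "4-7 min"
deploy_time_range apps                                  # prints "1-2 min"
```

The exact sums are 19-29 and 4-7 minutes; the guide's 20-30 and 5-7 figures are these values rounded.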

2. Risk: Smaller Changes = Lower Risk

Changing decorations (Apps): Low risk - easy to fix
Changing filling (Models):   Medium risk - affects AI behavior
Changing frosting (Substrate): Medium risk - affects platform
Changing base (Foundation):  High risk - infrastructure structural

3. Experimentation: Rapid Iteration

Test 5 different model configurations:
  Old way: 5 × 25 min = 125 minutes (full redeploy each time)
  New way: 5 × 2 min = 10 minutes (models already deployed; Apps parameter switch only) ⚡ 12.5x faster!

4. Cost Optimization: Rapid Testing

Deploy gpt-4o, gpt-4o-mini (AI Models: 5 min)
Test each with apps (Apps: 2 min × 2 = 4 min)
Total: 9 minutes to test 2 models
Result: Found 95% quality at 1/10th cost → $8K/month saved

5. Rollback: Quick Recovery

Issue in production?
Rollback Apps: 1-2 minutes
Rollback Models: 3-5 minutes
Rollback Substrate: 5-7 minutes
Rollback Foundation: 10-15 minutes

Most changes are Apps → fastest rollback!

Layer Dependencies

                    Apps Layer
                          ↓
                    Depends on all below
                          ↓
    ┌──────────────────────────────────────────┐
    │        AI Models Layer                   │
    │          ↓                                │
    │   Depends on Substrate                   │
    └──────────────────────────────────────────┘
                          ↓
    ┌──────────────────────────────────────────┐
    │    Substrate Layer                       │
    │          ↓                                │
    │   Depends on Foundation only             │
    └──────────────────────────────────────────┘
                          ↓
    ┌──────────────────────────────────────────┐
    │   Foundation Layer                       │
    │          ↓                                │
    │   No dependencies (the base!)            │
    └──────────────────────────────────────────┘

Validation: Each deployment script validates its dependencies exist before proceeding.
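
A dependency check of this kind might look like the sketch below. It is not the actual validation code from the deployment scripts: `layer_dependencies`, `layer_is_deployed`, `validate_dependencies`, and the marker-file convention under `STATE_DIR` are all assumptions made for illustration. A real script would more likely query Azure for the resources each layer needs.

```shell
#!/usr/bin/env bash
# Direct dependencies of each layer, following the diagram above.
layer_dependencies() {
    case "$1" in
        foundation) echo "" ;;
        substrate)  echo "foundation" ;;
        ai-models)  echo "substrate" ;;
        apps)       echo "substrate ai-models" ;;
        *) echo "unknown layer: $1" >&2; return 1 ;;
    esac
}

# Stand-in check (assumption): one marker file per deployed layer.
# Swap this for a real query against the target environment.
layer_is_deployed() {
    [ -f "${STATE_DIR:-.deploy-state}/$1.done" ]
}

# Fail with a clear message if any dependency of the layer is missing.
validate_dependencies() {
    local layer="$1" deps dep missing=""
    deps="$(layer_dependencies "$layer")" || return 1
    for dep in $deps; do
        layer_is_deployed "$dep" || missing="$missing $dep"
    done
    if [ -n "$missing" ]; then
        echo "Cannot deploy $layer: missing layer(s):$missing" >&2
        return 1
    fi
}
```

Calling `validate_dependencies apps` at the top of a deploy script would abort early with the names of the missing layers instead of failing partway through a deployment.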

Key Changes (ADR-052):
- AI Models depends on Substrate (AI Foundry moved from Foundation)
- Apps depends on Substrate (ACR) and AI Models (deployments)
- Foundation is truly foundational (no AI services here)

Deployment Time Comparison

Scenario 1: Fix API Bug

Before (3-layer):

Deploy All → 20-25 minutes

After (4-layer with ACI):

Deploy Apps Only → 1-2 minutes ⚡ 15x faster

Scenario 2: Test New AI Model

Before (3-layer):

Redeploy Foundation (includes AI Foundry) → 10-15 minutes
Redeploy Apps → 2-3 minutes
Total: 12-18 minutes

After (4-layer):

Deploy AI Models → 3-5 minutes
Deploy Apps → 1-2 minutes
Total: 5-7 minutes ⚡ 3x faster

Scenario 3: Scale Container Platform

Before (3-layer):

Redeploy Substrate → 5-7 minutes

After (4-layer):

Deploy Substrate → 5-7 minutes (same, but no Container Apps Environment complexity)

Scenario 4: Add Bastion/Jump Box

Before (3-layer with VPN):

Redeploy Foundation → 10-15 minutes (includes VPN Gateway setup)

After (4-layer with Bastion):

Deploy Foundation → 10-15 minutes (same time, but $150/month cheaper)

Key Insight: Most frequent changes (Apps code, AI Models) are now 3-15x faster!

Architecture Benefits (ADRs 049-052):
- ACI deployment is simpler and faster than Container Apps
- Bastion is easier to manage than VPN Gateway
- AI Foundry in Substrate layer makes logical sense with ACR

When to Deploy Each Layer

Foundation: Core Infrastructure

Deploy when:
- ✅ First time setup
- ✅ Adding new VNet subnets
- ✅ Changing Bastion or Jump Box configuration
- ✅ Modifying Key Vault or security settings
- ✅ Network topology changes

Don't deploy for:
- ❌ Code changes
- ❌ Model testing
- ❌ Container scaling
- ❌ Application configuration changes

Substrate: Platform Services

Deploy when:
- ✅ First time setup
- ✅ ACR configuration changes
- ✅ AI Foundry Hub or Project configuration changes
- ✅ Platform monitoring changes

Don't deploy for:
- ❌ Code changes
- ❌ Model testing
- ❌ Application configuration changes

AI Models: Model Deployments

Deploy when:
- ✅ First time setup
- ✅ Adding new model versions
- ✅ Changing model capacity (TPM)
- ✅ Testing different models
- ✅ Model upgrades

Don't deploy for:
- ❌ Code changes
- ❌ Application configuration changes

Note: You can redeploy this layer to switch models, but when the target model is already deployed, an Apps layer parameter change is faster.

Apps: Applications

Deploy when:
- ✅ First time setup
- ✅ Code changes (API, UI, MCP)
- ✅ Application configuration changes
- ✅ Environment variable updates
- ✅ Switching AI models (via parameters)
- ✅ Container image updates

Most frequent deployment - this is where development happens!
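
The rules in this section reduce to a lookup from the kind of change to the lowest layer that has to be redeployed. The sketch below encodes that lookup; the function name `layer_for_change` and the change-type keywords are hypothetical labels chosen for this example, not identifiers from the deployment scripts.

```shell
#!/usr/bin/env bash
# Map a kind of change to the layer that should be redeployed.
# The change-type keywords are illustrative labels only.
layer_for_change() {
    case "$1" in
        code|app-config|env-vars|image|model-switch) echo "apps" ;;
        model-add|model-capacity|model-upgrade)      echo "ai-models" ;;
        acr|ai-foundry)                              echo "substrate" ;;
        network|key-vault|security|bastion)          echo "foundation" ;;
        *) echo "unknown change type: $1" >&2; return 1 ;;
    esac
}

layer_for_change code            # prints "apps"
layer_for_change model-capacity  # prints "ai-models"
layer_for_change network         # prints "foundation"
```

Combined with the deploy scripts, something like `./deploy-$(layer_for_change code).sh dev` would pick the right script for a code change.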

Summary: The Power of Layers

Before (3-Layer)

Layer 1: Everything, AI services included (20-25 min)
Layer 2: Container Apps Platform (5-7 min)
Layer 3: Applications (2-3 min)

Problem: AI models bundled with infrastructure
Result: 10-15 min infrastructure redeploy (plus an app redeploy) just to test a new model

After (4-Layer Cake with ADR Updates)

Foundation: Core Infrastructure (10-15 min)    🎂 Base - rarely changes
Substrate: Platform Services (5-7 min)         🍪 Frosting - occasionally
AI Models: Model Deployments (3-5 min)         🧁 Filling - as needed
Apps: ACI Container Group (1-2 min)            🍰 Decorations - constantly

Benefits:
- 1-2 min to deploy code changes (15x faster!)
- 5-7 min to test new model (3x faster!)
- Bastion instead of VPN (~$150/month savings)
- ACI instead of Container Apps (simpler, faster)
- AI Foundry logically grouped with ACR

The Metaphor

"You don't rebuild the entire cake when you want to change the decorations. You don't remake the frosting when you want to try a different filling. Each layer has a purpose, and you only change what needs changing."

This is the essence of the 4-Layer Deployment Architecture.

ADR References:
- ADR-052: Layer Renaming & AI Foundry
- ADR-051: Infrastructure Simplification
- ADR-050: Bastion Replaces VPN Gateway
- ADR-049: ACI vs Container Apps
- ADR-048: Key Vault for Outputs
- ADR-047: Layer-Specific RBAC

Quick Reference

Deployment Commands

# Full deployment (first time)
./deploy-foundation.sh dev  # 10-15 min
./deploy-substrate.sh dev   # 5-7 min
./deploy-ai-models.sh dev   # 3-5 min
./deploy-apps.sh dev        # 1-2 min

# Daily development (most common)
./deploy-apps.sh dev        # 1-2 min

# Model testing
./deploy-ai-models.sh dev   # 3-5 min (add models)
./deploy-apps.sh dev        # 1-2 min (switch apps to new model)

# Platform updates (rare)
./deploy-substrate.sh dev   # 5-7 min

# Infrastructure changes (very rare)
./deploy-foundation.sh dev  # 10-15 min

Layer Files

infrastructure/
├── bicep/
│   ├── foundation.bicep              (10-15 min)
│   ├── substrate.bicep               (5-7 min)
│   ├── ai-models.bicep               (3-5 min)
│   └── apps.bicep                    (1-2 min)
├── scripts/
│   ├── deploy-foundation.sh
│   ├── deploy-substrate.sh
│   ├── deploy-ai-models.sh
│   └── deploy-apps.sh
└── bicep/environments/
    ├── dev-foundation.parameters.json
    ├── dev-substrate.parameters.json
    ├── dev-ai-models.parameters.json
    └── dev-apps.parameters.json
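
The naming convention in this tree is regular enough that the template and parameter file for any layer can be derived from the layer and environment names. A minimal sketch, assuming the layout shown above; `layer_files` and `INFRA_ROOT` are hypothetical names.

```shell
#!/usr/bin/env bash
# Print the Bicep template and parameter file paths for a layer/environment pair,
# following the directory layout shown above.
layer_files() {
    local layer="$1" env_name="${2:-dev}" root="${INFRA_ROOT:-infrastructure}"
    case "$layer" in
        foundation|substrate|ai-models|apps) ;;
        *) echo "unknown layer: $layer" >&2; return 1 ;;
    esac
    echo "${root}/bicep/${layer}.bicep"
    echo "${root}/bicep/environments/${env_name}-${layer}.parameters.json"
}

layer_files apps dev
# infrastructure/bicep/apps.bicep
# infrastructure/bicep/environments/dev-apps.parameters.json
```

A deploy script could pass these two paths to `az deployment group create` through its `--template-file` and `--parameters` options. Whether the existing scripts do exactly that is not stated in this guide.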

Remember: The cake is built once, but decorated often. Layer your deployments to match your change frequency!

🎂 Happy Deployment! 🍰

Updated (2025-10-24) - Following ADRs 047-052 for semantic naming, Bastion access, ACI deployment, and AI Foundry reorganization.