ADR-033: AI Models Deployment Automation
Status: Accepted
Date: 2024-10-09
Deciders: Architecture Team, DevOps Team
Related: ADR-023 (PowerShell for Azure Deployments), ADR-021 (Azure Verified Modules)
Context
We need automated deployment of AI models to Azure AI Foundry (AI Services) to support the multi-agent loan processing system. The deployment must:
- Deploy OpenAI models (GPT-4o) to Azure AI Services
- Provide reliable endpoints for agent framework integration
- Support multiple environments (dev, staging, prod)
- Be idempotent and safe for repeated executions
- Integrate with existing GitHub Actions CI/CD pipeline
- Follow enterprise best practices for security and reliability
Decision
1. Deployment Architecture
Implemented automated AI model deployment using:
- GitHub Actions Workflow: .github/workflows/deploy-ai-models.yml
- Bicep Module: infrastructure/bicep/modules/ai-model-deployments.bicep
- Parameter Files: infrastructure/bicep/environments/{env}-models.parameters.json
- Manual Script: infrastructure/scripts/deploy-models.sh (for local testing)
2. Key Design Decisions
A. OIDC Authentication (Passwordless)
- Uses the id-token: write permission in GitHub Actions
- Azure OIDC login with federated credentials
- No secrets stored in repository or GitHub
- Follows Microsoft security best practices
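The relevant workflow configuration looks roughly like the sketch below; the job layout and variable names are illustrative, not copied from the actual deploy-ai-models.yml:

```yaml
# Hypothetical excerpt; the real workflow is .github/workflows/deploy-ai-models.yml
permissions:
  id-token: write   # required for the OIDC token exchange
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Azure OIDC login
        uses: azure/login@v2
        with:
          client-id: ${{ vars.AZURE_CLIENT_ID }}
          tenant-id: ${{ vars.AZURE_TENANT_ID }}
          subscription-id: ${{ vars.AZURE_SUBSCRIPTION_ID }}
```

Note that client, tenant, and subscription IDs are identifiers, not credentials, which is why no secrets need to live in the repository.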
B. PowerShell Deployment (Per ADR-023)
- Uses PowerShell for all Azure deployments
- Avoids Azure CLI bugs with Bicep deployments
- Better error messages and diagnostics
- Pinned module versions for reliability
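A minimal sketch of what pinning looks like in the workflow's module-install step; the module names and version numbers here are assumptions, not taken from the repository:

```powershell
# Hypothetical pinned installs; the actual modules/versions live in the workflow
Install-Module -Name Az.Accounts  -RequiredVersion 3.0.4 -Force -Scope CurrentUser
Install-Module -Name Az.Resources -RequiredVersion 7.5.0 -Force -Scope CurrentUser
```

Pinning with -RequiredVersion prevents a new module release from silently changing deployment behavior between CI runs.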
C. Idempotent Bicep Module
resource aiServicesAccount 'Microsoft.CognitiveServices/accounts@2024-10-01' existing = {
  name: aiServicesName
}

@batchSize(1) // model deployments on a single account must run sequentially
resource modelDeployment 'Microsoft.CognitiveServices/accounts/deployments@2024-10-01' = [
  for deployment in modelDeployments: {
    parent: aiServicesAccount
    name: deployment.name
    sku: deployment.sku
    properties: {
      model: deployment.model // updates if exists, creates if not
    }
  }
]
Benefits:
- Safe to run multiple times
- Updates existing deployments
- Creates new deployments as needed
- No manual cleanup required
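Per ADR-023, the module is invoked with the Az PowerShell cmdlets rather than Azure CLI. A hedged sketch of the call, with the resource group and file paths assumed from this ADR:

```powershell
# Sketch: deploy the model module to the dev resource group
New-AzResourceGroupDeployment `
  -ResourceGroupName 'ldfdev-rg' `
  -TemplateFile 'infrastructure/bicep/modules/ai-model-deployments.bicep' `
  -TemplateParameterFile 'infrastructure/bicep/environments/dev-models.parameters.json' `
  -Verbose
```

Because ARM deployments are themselves idempotent, rerunning this command converges the account to the parameter file's state rather than erroring on existing deployments.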
D. Environment-Specific Configuration
Resource Naming Pattern: ldf{environment}-{resource}
- Dev: ldfdev-rg, ldfdev-ai
- Staging: ldfstaging-rg, ldfstaging-ai
- Prod: ldfprod-rg, ldfprod-ai
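The naming convention can be expressed as a small helper; this is a sketch of the pattern, not code from the actual deployment script:

```shell
#!/bin/sh
# Build resource names from the ldf{environment}-{resource} pattern
resource_name() {
  env="$1"
  resource="$2"
  printf 'ldf%s-%s\n' "$env" "$resource"
}

resource_name dev rg      # -> ldfdev-rg
resource_name staging ai  # -> ldfstaging-ai
```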
Parameter Files:
{
  "aiServicesName": { "value": "ldfdev-ai" },
  "modelDeployments": {
    "value": [
      {
        "name": "gpt-4o",
        "model": {
          "format": "OpenAI",
          "name": "gpt-4o",
          "version": "2024-08-06"
        },
        "sku": {
          "name": "GlobalStandard",
          "capacity": 10
        }
      }
    ]
  }
}
E. Model Selection
Default Model: GPT-4o (gpt-4o)
- Version: 2024-08-06 (verified stable)
- SKU: GlobalStandard
- Capacity: 10K TPM (dev), scalable in prod
Rationale:
- Latest stable GPT-4 model
- Widely available across Azure regions
- Supports function calling (required for agent framework)
- Good balance of performance and cost
3. Deployment Workflow
GitHub Actions Trigger:
- Manual dispatch via Actions UI
- Input: environment selection (dev/staging/prod)
- Input: optional model override (defaults to parameter file)
Workflow Steps:
1. Checkout repository
2. Azure OIDC login
3. Install PowerShell modules
4. Validate prerequisites (RG, AI Services exist)
5. Deploy models via Bicep
6. Display deployment results
7. Create GitHub step summary
Error Handling:
- Validates all prerequisites before deployment
- Fails fast with clear error messages
- Shows detailed deployment errors
- Proper exit codes for CI/CD integration
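The fail-fast validation style can be sketched as below; the helper and prerequisite names are hypothetical, not the real workflow's code:

```shell
#!/bin/sh
# Sketch: validate prerequisites with CI-friendly exit codes before deploying
require() {
  name="$1"
  value="$2"
  if [ -z "$value" ]; then
    # Clear error message to stderr, non-zero status for the CI runner
    echo "ERROR: missing prerequisite: $name" >&2
    return 1
  fi
  echo "OK: $name=$value"
}

require RESOURCE_GROUP "ldfdev-rg" || exit 1
require AI_SERVICES "ldfdev-ai" || exit 1
```

Running every check before the first deployment call means a misconfigured environment fails in seconds instead of mid-deployment.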
4. Application Integration
Endpoint Information Provided:
Base Endpoint: https://ldfdev-ai.cognitiveservices.azure.com/
Deployment Name: gpt-4o
API Version: 2024-12-01-preview
Full Chat Completions URL:
POST https://ldfdev-ai.cognitiveservices.azure.com/openai/deployments/gpt-4o/chat/completions?api-version=2024-12-01-preview
Environment Variables for Container Apps:
AZURE_OPENAI_ENDPOINT=https://ldfdev-ai.cognitiveservices.azure.com/
AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4o
AZURE_OPENAI_API_VERSION=2024-12-01-preview
Authentication: Managed Identity (DefaultAzureCredential)
- No API keys in environment variables
- Container Apps use system-assigned managed identity
- RBAC role: "Cognitive Services OpenAI User"
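The full chat-completions URL above is just the three environment variables composed; a sketch:

```shell
#!/bin/sh
# Compose the chat-completions URL from the documented environment variables
AZURE_OPENAI_ENDPOINT="https://ldfdev-ai.cognitiveservices.azure.com/"
AZURE_OPENAI_DEPLOYMENT_NAME="gpt-4o"
AZURE_OPENAI_API_VERSION="2024-12-01-preview"

CHAT_URL="${AZURE_OPENAI_ENDPOINT}openai/deployments/${AZURE_OPENAI_DEPLOYMENT_NAME}/chat/completions?api-version=${AZURE_OPENAI_API_VERSION}"
echo "$CHAT_URL"

# A smoke test would then POST to it with a managed-identity bearer token, e.g.:
# curl -s -X POST "$CHAT_URL" \
#   -H "Authorization: Bearer $TOKEN" \
#   -H "Content-Type: application/json" \
#   -d '{"messages":[{"role":"user","content":"ping"}]}'
```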
5. Testing Strategy
Local Testing (before CI/CD): run the manual script infrastructure/scripts/deploy-models.sh against the dev environment before promoting changes through the workflow.
CI/CD Testing:
1. Run workflow via GitHub Actions UI
2. Verify deployment in Azure Portal
3. Test endpoint with curl/SDK
4. Validate in application integration tests
Validation Checklist:
- [ ] Model deploys successfully
- [ ] Endpoint is accessible
- [ ] Model responds to test queries
- [ ] Managed identity has correct permissions
- [ ] Environment variables set in Container Apps
- [ ] Agent framework can use deployed model
Implementation Status
✅ Completed
- GitHub Actions workflow created
- Bicep module for model deployments
- Parameter files for all environments
- Manual deployment script
- OIDC authentication configured
- PowerShell deployment implementation
- Error handling and validation
- Deployment outputs and summaries
🔧 Configuration Fixes Applied
- Fixed resource naming pattern (ldf{env} format)
- Updated parameter files with correct AI Services names
- Verified model availability (gpt-4o)
- Added proper SKU configuration
- Validated API versions
📝 Documentation Created
- This ADR documenting the decision
- Workflow documentation in comments
- Script usage examples
- Integration guide for application teams
Consequences
Positive
- Automated Deployments: No manual model deployment via Portal
- Consistency: Same process for all environments
- Idempotent: Safe to run multiple times
- Secure: OIDC authentication, no secrets in code
- Reliable: PowerShell avoids Azure CLI bugs
- Transparent: Clear outputs and error messages
- Testable: Can test locally before CI/CD
- Documented: Clear integration guide for apps
Negative
- Initial Setup: Required OIDC configuration in Azure
- PowerShell Dependency: Team needs PowerShell knowledge
- Model Versioning: Must manually update versions in parameters
- Regional Limits: Some models not available in all regions
Neutral
- Manual Trigger: Deployment on-demand vs automatic
- Parameter Files: Need to maintain per-environment configs
- Cost: Model deployment costs charged by Azure
Alternatives Considered
1. Azure CLI Deployment
Rejected: Azure CLI has known bug with "content already consumed" errors in Bicep deployments (per ADR-023).
2. Terraform
Rejected: Team already standardized on Bicep for Azure infrastructure. Adding Terraform increases complexity.
3. Manual Portal Deployment
Rejected: Not repeatable, error-prone, no audit trail, not suitable for multiple environments.
4. Azure Bicep Registry Modules
Considered: Using AVM (Azure Verified Modules) for model deployments. Status: No official AVM module exists for model deployments yet. Custom module is appropriate.
Related Documentation
- Workflow: .github/workflows/deploy-ai-models.yml
- Bicep Module: infrastructure/bicep/modules/ai-model-deployments.bicep
- Parameter Files: infrastructure/bicep/environments/*-models.parameters.json
- Deployment Script: infrastructure/scripts/deploy-models.sh
- ADR-021: Azure Verified Modules Adoption
- ADR-023: PowerShell for Azure Deployments
Maintenance
Regular Tasks
- Review model versions quarterly (new GPT models released)
- Update API versions as Azure releases new versions
- Monitor model capacity and adjust as needed
- Review costs monthly and optimize capacity
When to Update
- New model versions released by OpenAI/Microsoft
- API version deprecation notices
- Capacity needs change (scaling up/down)
- New models needed for additional features
Approval: Implemented and deployed to dev environment
Next Review: 2025-Q1 (quarterly model version review)