ADR-033: AI Models Deployment Automation
Status: Accepted
Date: 2024-10-09
Deciders: Architecture Team, DevOps Team
Related: ADR-023 (PowerShell for Azure Deployments), ADR-021 (Azure Verified Modules)
Context
We need automated deployment of AI models to Azure AI Foundry (AI Services) to support the multi-agent loan processing system. The deployment must:
- Deploy OpenAI models (GPT-4o) to Azure AI Services
- Provide reliable endpoints for agent framework integration
- Support multiple environments (dev, staging, prod)
- Be idempotent and safe for repeated executions
- Integrate with existing GitHub Actions CI/CD pipeline
- Follow enterprise best practices for security and reliability
Decision
1. Deployment Architecture
Implemented automated AI model deployment using:
- GitHub Actions Workflow: .github/workflows/deploy-ai-models.yml
- Bicep Module: infrastructure/bicep/modules/ai-model-deployments.bicep
- Parameter Files: infrastructure/bicep/environments/{env}-models.parameters.json
- Manual Script: infrastructure/scripts/deploy-models.sh (for local testing)
2. Key Design Decisions
A. OIDC Authentication (Passwordless)
- Uses the id-token: write permission in GitHub Actions
- Azure OIDC login with federated credentials
- No secrets stored in repository or GitHub
- Follows Microsoft security best practices
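The relevant workflow configuration looks roughly like the sketch below; the job layout and variable names are illustrative, not copied from the actual deploy-ai-models.yml:

```yaml
# Hypothetical excerpt; the real workflow is .github/workflows/deploy-ai-models.yml
permissions:
  id-token: write   # required for the OIDC token exchange
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Azure OIDC login
        uses: azure/login@v2
        with:
          client-id: ${{ vars.AZURE_CLIENT_ID }}
          tenant-id: ${{ vars.AZURE_TENANT_ID }}
          subscription-id: ${{ vars.AZURE_SUBSCRIPTION_ID }}
```

Note that client, tenant, and subscription IDs are identifiers, not credentials, which is why no secrets need to live in the repository.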
B. PowerShell Deployment (Per ADR-023)
- Uses PowerShell for all Azure deployments
- Avoids Azure CLI bugs with Bicep deployments
- Better error messages and diagnostics
- Pinned module versions for reliability
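A minimal sketch of what pinning looks like in the workflow's module-install step; the module names and version numbers here are assumptions, not taken from the repository:

```powershell
# Hypothetical pinned installs; the actual modules/versions live in the workflow
Install-Module -Name Az.Accounts  -RequiredVersion 3.0.4 -Force -Scope CurrentUser
Install-Module -Name Az.Resources -RequiredVersion 7.5.0 -Force -Scope CurrentUser
```

Pinning with -RequiredVersion prevents a new module release from silently changing deployment behavior between CI runs.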
C. Idempotent Bicep Module
resource aiServicesAccount 'Microsoft.CognitiveServices/accounts@2024-10-01' existing = {
  name: aiServicesName
}

@batchSize(1) // model deployments on a single account must run sequentially
resource modelDeployment 'Microsoft.CognitiveServices/accounts/deployments@2024-10-01' = [
  for deployment in modelDeployments: {
    parent: aiServicesAccount
    name: deployment.name
    sku: deployment.sku
    properties: {
      model: deployment.model // updates if exists, creates if not
    }
  }
]
Benefits:
- Safe to run multiple times
- Updates existing deployments
- Creates new deployments as needed
- No manual cleanup required
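Per ADR-023, the module is invoked with the Az PowerShell cmdlets rather than Azure CLI. A hedged sketch of the call, with the resource group and file paths assumed from this ADR:

```powershell
# Sketch: deploy the model module to the dev resource group
New-AzResourceGroupDeployment `
  -ResourceGroupName 'ldfdev-rg' `
  -TemplateFile 'infrastructure/bicep/modules/ai-model-deployments.bicep' `
  -TemplateParameterFile 'infrastructure/bicep/environments/dev-models.parameters.json' `
  -Verbose
```

Because ARM deployments are themselves idempotent, rerunning this command converges the account to the parameter file's state rather than erroring on existing deployments.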
D. Environment-Specific Configuration
Resource Naming Pattern: ldf{environment}-{resource}
- Dev: ldfdev-rg, ldfdev-ai
- Staging: ldfstaging-rg, ldfstaging-ai
- Prod: ldfprod-rg, ldfprod-ai
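The naming convention can be expressed as a small helper; this is a sketch of the pattern, not code from the actual deployment script:

```shell
#!/bin/sh
# Build resource names from the ldf{environment}-{resource} pattern
resource_name() {
  env="$1"
  resource="$2"
  printf 'ldf%s-%s\n' "$env" "$resource"
}

resource_name dev rg      # -> ldfdev-rg
resource_name staging ai  # -> ldfstaging-ai
```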
Parameter Files:
{
  "aiServicesName": { "value": "ldfdev-ai" },
  "modelDeployments": {
    "value": [
      {
        "name": "gpt-4o",
        "model": {
          "format": "OpenAI",
          "name": "gpt-4o",
          "version": "2024-08-06"
        },
        "sku": {
          "name": "GlobalStandard",
          "capacity": 10
        }
      }
    ]
  }
}
E. Model Selection
Default Model: GPT-4o (gpt-4o)
- Version: 2024-08-06 (verified stable)
- SKU: GlobalStandard
- Capacity: 10K TPM (dev), scalable in prod
Rationale:
- Latest stable GPT-4 model
- Widely available across Azure regions
- Supports function calling (required for agent framework)
- Good balance of performance and cost
3. Deployment Workflow
GitHub Actions Trigger:
- Manual dispatch via Actions UI
- Input: environment selection (dev/staging/prod)
- Input: optional model override (defaults to parameter file)
Workflow Steps:
1. Checkout repository
2. Azure OIDC login
3. Install PowerShell modules
4. Validate prerequisites (RG, AI Services exist)
5. Deploy models via Bicep
6. Display deployment results
7. Create GitHub step summary
Error Handling:
- Validates all prerequisites before deployment
- Fails fast with clear error messages
- Shows detailed deployment errors
- Proper exit codes for CI/CD integration
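The fail-fast validation style can be sketched as below; the helper and prerequisite names are hypothetical, not the real workflow's code:

```shell
#!/bin/sh
# Sketch: validate prerequisites with CI-friendly exit codes before deploying
require() {
  name="$1"
  value="$2"
  if [ -z "$value" ]; then
    # Clear error message to stderr, non-zero status for the CI runner
    echo "ERROR: missing prerequisite: $name" >&2
    return 1
  fi
  echo "OK: $name=$value"
}

require RESOURCE_GROUP "ldfdev-rg" || exit 1
require AI_SERVICES "ldfdev-ai" || exit 1
```

Running every check before the first deployment call means a misconfigured environment fails in seconds instead of mid-deployment.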
4. Application Integration
Endpoint Information Provided:
Base Endpoint: https://ldfdev-ai.cognitiveservices.azure.com/
Deployment Name: gpt-4o
API Version: 2024-12-01-preview
Full Chat Completions URL:
POST https://ldfdev-ai.cognitiveservices.azure.com/openai/deployments/gpt-4o/chat/completions?api-version=2024-12-01-preview
Environment Variables for Container Apps:
AZURE_OPENAI_ENDPOINT=https://ldfdev-ai.cognitiveservices.azure.com/
AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4o
AZURE_OPENAI_API_VERSION=2024-12-01-preview
Authentication: Managed Identity (DefaultAzureCredential)
- No API keys in environment variables
- Container Apps use system-assigned managed identity
- RBAC role: "Cognitive Services OpenAI User"
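The full chat-completions URL above is just the three environment variables composed; a sketch:

```shell
#!/bin/sh
# Compose the chat-completions URL from the documented environment variables
AZURE_OPENAI_ENDPOINT="https://ldfdev-ai.cognitiveservices.azure.com/"
AZURE_OPENAI_DEPLOYMENT_NAME="gpt-4o"
AZURE_OPENAI_API_VERSION="2024-12-01-preview"

CHAT_URL="${AZURE_OPENAI_ENDPOINT}openai/deployments/${AZURE_OPENAI_DEPLOYMENT_NAME}/chat/completions?api-version=${AZURE_OPENAI_API_VERSION}"
echo "$CHAT_URL"

# A smoke test would then POST to it with a managed-identity bearer token, e.g.:
# curl -s -X POST "$CHAT_URL" \
#   -H "Authorization: Bearer $TOKEN" \
#   -H "Content-Type: application/json" \
#   -d '{"messages":[{"role":"user","content":"ping"}]}'
```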
5. Testing Strategy
Local Testing (before CI/CD): run the manual script infrastructure/scripts/deploy-models.sh against the dev environment before promoting changes through the workflow.
CI/CD Testing:
1. Run workflow via GitHub Actions UI
2. Verify deployment in Azure Portal
3. Test endpoint with curl/SDK
4. Validate in application integration tests
Validation Checklist:
- [ ] Model deploys successfully
- [ ] Endpoint is accessible
- [ ] Model responds to test queries
- [ ] Managed identity has correct permissions
- [ ] Environment variables set in Container Apps
- [ ] Agent framework can use deployed model
Implementation Status
✅ Completed
- GitHub Actions workflow created
- Bicep module for model deployments
- Parameter files for all environments
- Manual deployment script
- OIDC authentication configured
- PowerShell deployment implementation
- Error handling and validation
- Deployment outputs and summaries
🔧 Configuration Fixes Applied
- Fixed resource naming pattern (ldf{env} format)
- Updated parameter files with correct AI Services names
- Verified model availability (gpt-4o)
- Added proper SKU configuration
- Validated API versions
📝 Documentation Created
- This ADR documenting the decision
- Workflow documentation in comments
- Script usage examples
- Integration guide for application teams
Consequences
Positive
- Automated Deployments: No manual model deployment via Portal
- Consistency: Same process for all environments
- Idempotent: Safe to run multiple times
- Secure: OIDC authentication, no secrets in code
- Reliable: PowerShell avoids Azure CLI bugs
- Transparent: Clear outputs and error messages
- Testable: Can test locally before CI/CD
- Documented: Clear integration guide for apps
Negative
- Initial Setup: Required OIDC configuration in Azure
- PowerShell Dependency: Team needs PowerShell knowledge
- Model Versioning: Must manually update versions in parameters
- Regional Limits: Some models not available in all regions
Neutral
- Manual Trigger: Deployment on-demand vs automatic
- Parameter Files: Need to maintain per-environment configs
- Cost: Model deployment costs charged by Azure
Alternatives Considered
1. Azure CLI Deployment
Rejected: Azure CLI has known bug with "content already consumed" errors in Bicep deployments (per ADR-023).
2. Terraform
Rejected: Team already standardized on Bicep for Azure infrastructure. Adding Terraform increases complexity.
3. Manual Portal Deployment
Rejected: Not repeatable, error-prone, no audit trail, not suitable for multiple environments.
4. Azure Bicep Registry Modules
Considered: Using AVM (Azure Verified Modules) for model deployments. Status: No official AVM module exists for model deployments yet. Custom module is appropriate.
Related Documentation
- Workflow: .github/workflows/deploy-ai-models.yml
- Bicep Module: infrastructure/bicep/modules/ai-model-deployments.bicep
- Parameter Files: infrastructure/bicep/environments/*-models.parameters.json
- Deployment Script: infrastructure/scripts/deploy-models.sh
- ADR-021: Azure Verified Modules Adoption
- ADR-023: PowerShell for Azure Deployments
Maintenance
Regular Tasks
- Review model versions quarterly (new GPT models released)
- Update API versions as Azure releases new versions
- Monitor model capacity and adjust as needed
- Review costs monthly and optimize capacity
When to Update
- New model versions released by OpenAI/Microsoft
- API version deprecation notices
- Capacity needs change (scaling up/down)
- New models needed for additional features
Approval: Implemented and deployed to dev environment
Next Review: 2025-Q1 (quarterly model version review)