ADR-024: AI Foundry Integration with Individual AVM Resources
- Status: Accepted
- Date: 2025-01-11
- Decision Makers: Development Team
- Related ADRs: ADR-021: Azure Verified Modules Adoption
Context
We need to provide AI model deployment and management capabilities through Azure AI Foundry portal (ai.azure.com) while maintaining our staged deployment architecture and Zero Trust security principles.
Initial Investigation: AI Foundry Pattern Module
We initially evaluated the avm/ptn/ai-ml/ai-foundry pattern module, which promises:
- Complete AI platform deployment
- Pre-configured AI Foundry workspace
- Model deployment management
- Integrated monitoring and security
Pattern Module Limitations Discovered
Critical Issues with Pattern Module:
1. Can't reuse existing resources: Parameter includeAssociatedResources: false prevents creating Key Vault/Storage, but NO option exists to pass existing resource IDs
2. Resource duplication: Setting includeAssociatedResources: true creates NEW Key Vault and Storage, duplicating our security stage resources
3. Limited outputs: Module only outputs resource names, not the resource IDs and endpoints needed for API consumption
4. All-or-nothing approach: Pattern module is designed for greenfield deployments, not our staged/modular architecture
Conclusion: AI Foundry pattern module is incompatible with our staged deployment approach where we reuse existing security infrastructure.
Requirements
- AI Foundry Portal Access: Deploy resources visible in Azure AI Foundry portal (ai.azure.com)
- Model Deployment Flexibility: Deploy any AI models from catalog via portal UI (not hardcoded in Bicep)
- Secure Endpoints: Private, Zero Trust endpoints for API consumption
- Staged Deployment: Reuse existing Key Vault, Storage, VNet from earlier stages
- Individual AVM Control: Full control over each resource configuration
Decision
We will use individual AVM resource modules with AI Foundry Hub and Project resources for portal integration.
Architecture
```text
Individual AVM Modules + AI Foundry Resources:
├── Log Analytics Workspace (avm/res/operational-insights/workspace)
├── Application Insights (avm/res/insights/component)
├── Azure AI Services (avm/res/cognitive-services/account)
│   ├── Kind: "AIServices" (multi-service)
│   ├── Private Endpoint (no public access)
│   └── RBAC for Managed Identity
├── Private DNS Zone (avm/res/network/private-dns-zone)
├── AI Foundry Hub (Microsoft.MachineLearningServices/workspaces)
│   └── Kind: "Hub"
└── AI Foundry Project (Microsoft.MachineLearningServices/workspaces)
    ├── Kind: "Project"
    └── Links to: Hub, AI Services, App Insights
```
How It Works
1. Individual AVM Resources
Deploy using proven AVM modules with full control:
- Log Analytics: Centralized logging
- Application Insights: APM and telemetry
- AI Services: Cognitive Services (multi-service, includes OpenAI)
- Private DNS Zone: Name resolution for private endpoints
2. AI Foundry Hub
Lightweight wrapper resource that:
- Registers AI Services with AI Foundry portal
- Provides workspace for project management
- Links to Application Insights for monitoring
- Zero Trust: publicNetworkAccess: 'Disabled'
```bicep
resource aiFoundryHub 'Microsoft.MachineLearningServices/workspaces@2024-04-01' = {
  name: aiFoundryHubName
  location: location
  kind: 'Hub'
  identity: {
    type: 'SystemAssigned' // required for workspace resources
  }
  properties: {
    description: 'AI Foundry Hub for Loan Defenders'
    publicNetworkAccess: 'Disabled'
    applicationInsights: appInsights.outputs.resourceId
    keyVault: existingKeyVaultResourceId // reused from the security stage (illustrative param name)
    storageAccount: existingStorageAccountResourceId // reused from the security stage (illustrative param name)
  }
}
```
3. AI Foundry Project
Project resource within Hub that:
- Provides UI/portal interface in ai.azure.com
- Links to Hub for resource management
- Enables model deployment via portal
- Zero Trust: publicNetworkAccess: 'Disabled'
```bicep
resource aiFoundryProject 'Microsoft.MachineLearningServices/workspaces@2024-04-01' = {
  name: '${aiFoundryHubName}-project'
  location: location
  kind: 'Project'
  identity: {
    type: 'SystemAssigned' // required for workspace resources
  }
  properties: {
    description: 'AI project for Loan Defenders multi-agent system'
    hubResourceId: aiFoundryHub.id
    publicNetworkAccess: 'Disabled'
  }
}
```
Security Configuration
Private Endpoints:
```bicep
privateEndpoints: [
  {
    name: '${aiServicesName}-pe'
    subnetResourceId: privateEndpointsSubnetId
    privateDnsZoneGroup: {
      privateDnsZoneGroupConfigs: [
        {
          name: 'cognitive-services-dns'
          privateDnsZoneResourceId: privateDnsZone.outputs.resourceId
        }
      ]
    }
  }
]
```
RBAC (Managed Identity access):
```bicep
roleAssignments: [
  {
    principalId: managedIdentityPrincipalId
    principalType: 'ServicePrincipal'
    roleDefinitionIdOrName: 'Cognitive Services OpenAI User'
  }
  {
    principalId: managedIdentityPrincipalId
    principalType: 'ServicePrincipal'
    roleDefinitionIdOrName: 'Cognitive Services User'
  }
]
```
API Endpoint Usage
Secure Endpoint for API Consumption:
```bicep
// Output from ai-services.bicep
output aiServicesEndpoint string = aiServices.outputs.endpoint
output aiServicesPrivateEndpointFqdn string = '${aiServicesName}.cognitiveservices.azure.com'
```
API Configuration (apps/api/loan_defenders/api/config.py):

```python
# Use the private endpoint FQDN (resolves to a private IP via the DNS zone)
azure_ai_endpoint: str = "https://<ai-services-name>.cognitiveservices.azure.com"
```
How API Connects:
1. API container uses Managed Identity (no keys)
2. DNS query for <name>.cognitiveservices.azure.com
3. Private DNS zone resolves to private IP (10.0.3.x)
4. Traffic stays within VNet (Azure backbone)
5. Managed Identity authenticated by Azure AD
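The URL the API ends up calling can be sketched with a small helper that assembles the data-plane route from the account name, deployment, and API version. The `/openai/deployments/{deployment}/chat/completions` path is the standard Azure OpenAI REST route; the account name in the example is hypothetical:

```python
def chat_completions_url(account_name: str, deployment: str, api_version: str) -> str:
    """Build the chat-completions URL for an Azure AI Services account.

    Inside the VNet, this FQDN resolves to the private endpoint (10.0.3.x),
    so the same URL works with zero public network exposure.
    """
    return (
        f"https://{account_name}.cognitiveservices.azure.com"
        f"/openai/deployments/{deployment}/chat/completions"
        f"?api-version={api_version}"
    )

# Example with a hypothetical account name:
print(chat_completions_url("loan-defenders-ai", "gpt-4.1", "2024-12-01-preview"))
```

Authentication itself is handled by the Managed Identity token (step 5), so no key ever appears in the URL or headers beyond the bearer token.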
Model Deployment Strategy
Hybrid Deployment Approach (production via Bicep, experimentation via the portal):
1. Production Models → Bicep (Infrastructure as Code):
```bicep
// Dev environment: GPT-4.1 deployed automatically
param modelDeployments array = [
  {
    name: 'gpt-4.1'
    model: {
      format: 'OpenAI'
      name: 'gpt-4.1'
      version: '2025-04-14'
    }
    sku: { name: 'GlobalStandard', capacity: 10 } // Global deployment for better availability
  }
]

// Prod environment: empty by default - admin configures based on architecture
param modelDeployments array = []
```
2. Experimental Models → Portal (Testing and Validation):
- Access the portal at https://ai.azure.com
- Portal → "Deployments" → "Create new deployment"
- Select a model from the catalog (test newer versions)
- Configure: name, version, capacity
- Portal deploys to the AI Services resource
- No Bicep changes needed - fully dynamic
API Configuration:
```python
# Use the deployed model with the correct API version
azure_ai_deployment_name: str = "gpt-4.1"
azure_openai_api_version: str = "2024-12-01-preview"
azure_ai_endpoint: str = "https://<name>.cognitiveservices.azure.com"
```
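A minimal, dependency-free sketch of how these settings might be loaded from the environment; the env-var names are assumptions (the real config.py may use pydantic settings instead):

```python
import os
from dataclasses import dataclass, field


def _env(name: str, default: str) -> str:
    """Read a setting from the environment, falling back to the documented default."""
    return os.getenv(name, default)


@dataclass(frozen=True)
class AzureAISettings:
    # Defaults mirror the values above; the env-var names are hypothetical
    azure_ai_endpoint: str = field(
        default_factory=lambda: _env(
            "AZURE_AI_ENDPOINT", "https://<name>.cognitiveservices.azure.com"))
    azure_ai_deployment_name: str = field(
        default_factory=lambda: _env("AZURE_AI_DEPLOYMENT_NAME", "gpt-4.1"))
    azure_openai_api_version: str = field(
        default_factory=lambda: _env("AZURE_OPENAI_API_VERSION", "2024-12-01-preview"))
```

Because the values come from the environment, a model deployed via the portal can be adopted by the API with a config change only, no redeploy of infrastructure.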
Benefits:
- ✅ Bicep for stable production deployments (version controlled)
- ✅ Portal for testing and experimentation (dynamic, flexible)
- ✅ No hardcoding models in infrastructure
- ✅ Monitor usage and performance in the portal
- ✅ Future: Prompt flow, RAG, fine-tuning, Azure AI Search
Consequences
Benefits
- ✅ Full Control: Individual AVM modules give precise configuration control
- ✅ Staged Deployment: Reuses existing Key Vault, Storage, VNet, Managed Identity
- ✅ AI Foundry Portal: Full portal access for model management at https://ai.azure.com
- ✅ Model Flexibility: Deploy any models dynamically via portal (not hardcoded)
- ✅ Secure Endpoints: Private endpoints with Zero Trust architecture
- ✅ RBAC Automated: Managed Identity permissions configured in Bicep
- ✅ Future-Ready: Easy to add Azure AI Search, RAG, prompt flow when needed
- ✅ Cost Efficient: Only deploy resources we actually use
Trade-offs
- ⚠️ More Resources: Hub + Project resources added (lightweight, minimal cost ~$5/month)
- ⚠️ Manual Hub/Project: AI Foundry Hub and Project not available as AVM modules yet
- ⚠️ Two Deployment Approaches: Individual AVMs for resources + custom resources for Hub/Project
Comparison to Pattern Module
| Aspect | Pattern Module | Our Approach |
|---|---|---|
| Resource Reuse | ❌ Creates new KeyVault/Storage | ✅ Reuses existing from security stage |
| Control | ⚠️ Limited config options | ✅ Full control per resource |
| Staged Deployment | ❌ All-or-nothing | ✅ Works with staged approach |
| Portal Access | ✅ Included | ✅ Included (Hub + Project) |
| Model Deployment | ⚠️ Configured in Bicep | ✅ Dynamic via portal |
| Endpoints | ⚠️ Limited outputs | ✅ Full endpoint details |
| Zero Trust | ✅ Supported | ✅ Fully implemented |
Implementation
Module Structure
File: infrastructure/bicep/modules/ai-services.bicep
```bicep
// Individual AVM modules
module logAnalytics 'br/public:avm/res/operational-insights/workspace:0.12.0'
module appInsights 'br/public:avm/res/insights/component:0.6.0'
module aiServices 'br/public:avm/res/cognitive-services/account:0.13.2'
module privateDnsZone 'br/public:avm/res/network/private-dns-zone:0.7.0'

// Optional: Azure AI Search for RAG (disabled by default)
module aiSearch 'br/public:avm/res/search/search-service:0.8.0' = if (deployAISearch)

// Custom resources (no AVM available yet)
resource aiFoundryHub 'Microsoft.MachineLearningServices/workspaces@2024-04-01'
resource aiFoundryProject 'Microsoft.MachineLearningServices/workspaces@2024-04-01'

// Model deployments (optional - can also deploy via portal)
resource modelDeployment 'Microsoft.CognitiveServices/accounts/deployments@2023-05-01' = [for deployment in modelDeployments: {...}]
```
Configuration Parameters:
- modelDeployments: Array of models to deploy via Bicep (empty for portal-only deployment)
- deployAISearch: Boolean to enable Azure AI Search for future RAG capabilities (default: false)
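The model-deployment loop elided above might look like the following sketch. The `existing` account reference and the `@batchSize(1)` decorator are assumptions for illustration; serializing the loop matters because deployments on a single Cognitive Services account cannot be created concurrently:

```bicep
// Reference the AI Services account created earlier in this module
resource aiServicesAccount 'Microsoft.CognitiveServices/accounts@2023-05-01' existing = {
  name: aiServicesName
}

@batchSize(1) // create deployments one at a time on the same account
resource modelDeployment 'Microsoft.CognitiveServices/accounts/deployments@2023-05-01' = [for d in modelDeployments: {
  parent: aiServicesAccount
  name: d.name
  sku: d.sku
  properties: {
    model: d.model // format/name/version, e.g. the gpt-4.1 entry shown earlier
  }
}]
```

With an empty `modelDeployments` array (the prod default), the loop deploys nothing and the portal remains the only deployment path.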
Deployment Sequence
```text
Stage 1: Foundation (networking)
        ↓
Stage 2: Security (Key Vault, Storage, Identity)
        ↓
Stage 3: AI Services (AI Services, Hub, Project, private endpoints, model deployments)
        ↓
Stage 4: Apps (Container Apps)
```
AI Stage deploys:
1. Log Analytics Workspace
2. Application Insights
3. Private DNS Zone for Cognitive Services
4. Azure AI Services with private endpoint
5. AI Foundry Hub (links to App Insights)
6. AI Foundry Project (links to Hub)
7. Model Deployments (optional - configurable via parameters)
8. Azure AI Search (optional - disabled by default)
Default Configuration:
Dev Environment (environments/dev.parameters.json):
- GPT-4.1 (April 2025 version) deployed automatically
- Model version: 2025-04-14
- API version: 2024-12-01-preview
- 10K tokens per minute capacity
- Azure AI Search: Disabled (can be enabled when needed)
Prod Environment (environments/prod.parameters.json):
- Empty model deployments array (administrator must configure)
- Azure AI Search: Disabled
- Administrator configures based on architectural requirements
Validation
- ✅ Bicep Compilation: Template compiles successfully with no errors
- ✅ Private Endpoints: AI Services configured with private endpoint
- ✅ RBAC: Managed Identity has Cognitive Services roles
- ✅ DNS Resolution: Private DNS zone linked to VNet
- ✅ Portal Integration: Hub and Project resources created
- ✅ Secure Endpoints: Public network access disabled
- ✅ Model Deployments: GPT-4.1 deployed automatically in dev environment
- ✅ Azure AI Search: Module added but disabled by default (ready for RAG)
References
- Azure AI Foundry: Microsoft Documentation
- AVM Cognitive Services: Module Documentation
- Machine Learning Workspaces API: Azure Resource Manager
- Related ADR: ADR-021: Azure Verified Modules Adoption
Decision Log
- Evaluated AI Foundry pattern module - Found incompatible with staged deployment
- Discovered Hub + Project approach - Lightweight resources provide portal access
- Tested individual AVMs + custom resources - Successful compilation and validation
- Accepted hybrid approach - Individual AVMs where available, custom resources where needed
Notes
- AI Foundry Hub and Project are NOT pattern modules - they're simple resource wrappers
- Models can be deployed via Bicep (automated) or portal (dynamic) - hybrid approach recommended
- Dev environment: GPT-4.1 deployed automatically for immediate use
- Prod environment: Empty by default - administrator configures based on architecture
- API version 2024-12-01-preview is required for GPT-4.1 deployments
- Models deployed via the portal appear immediately in AI Services (no infrastructure changes)
- Portal URL https://ai.azure.com works with ANY Azure AI Services (not just pattern module deployments)
- Azure AI Search included but disabled by default (ready for future RAG implementation)
- Future AVM modules may become available for Hub and Project resources
Related Documentation
- Model Deployment Guide: AI Model Deployment Guide
- Architecture Overview: Bicep Architecture