ADR-024: AI Foundry Integration with Individual AVM Resources

Status: Accepted
Date: 2025-01-11
Decision Makers: Development Team
Related ADRs: ADR-021: Azure Verified Modules Adoption

Context

We need to provide AI model deployment and management capabilities through Azure AI Foundry portal (ai.azure.com) while maintaining our staged deployment architecture and Zero Trust security principles.

Initial Investigation: AI Foundry Pattern Module

We initially evaluated the avm/ptn/ai-ml/ai-foundry pattern module, which promises:

  • Complete AI platform deployment
  • Pre-configured AI Foundry workspace
  • Model deployment management
  • Integrated monitoring and security

Pattern Module Limitations Discovered

Critical Issues with Pattern Module:

  1. Cannot reuse existing resources: Setting includeAssociatedResources: false prevents the module from creating a Key Vault and Storage account, but no option exists to pass in existing resource IDs.
  2. Resource duplication: Setting includeAssociatedResources: true creates a new Key Vault and Storage account, duplicating our security-stage resources.
  3. Limited outputs: The module outputs only resource names, not the resource IDs and endpoints needed for API consumption.
  4. All-or-nothing approach: The pattern module is designed for greenfield deployments, not our staged/modular architecture.

Conclusion: AI Foundry pattern module is incompatible with our staged deployment approach where we reuse existing security infrastructure.

Requirements

  1. AI Foundry Portal Access: Deploy resources visible in Azure AI Foundry portal (ai.azure.com)
  2. Model Deployment Flexibility: Deploy any AI models from catalog via portal UI (not hardcoded in Bicep)
  3. Secure Endpoints: Private, Zero Trust endpoints for API consumption
  4. Staged Deployment: Reuse existing Key Vault, Storage, VNet from earlier stages
  5. Individual AVM Control: Full control over each resource configuration

Decision

We will use individual AVM resource modules with AI Foundry Hub and Project resources for portal integration.

Architecture

Individual AVM Modules + AI Foundry Resources:
├── Log Analytics Workspace (avm/res/operational-insights/workspace)
├── Application Insights (avm/res/insights/component)
├── Azure AI Services (avm/res/cognitive-services/account)
│   ├── Kind: "AIServices" (multi-service)
│   ├── Private Endpoint (no public access)
│   └── RBAC for Managed Identity
├── Private DNS Zone (avm/res/network/private-dns-zone)
├── AI Foundry Hub (Microsoft.MachineLearningServices/workspaces)
│   └── Kind: "Hub"
└── AI Foundry Project (Microsoft.MachineLearningServices/workspaces)
    ├── Kind: "Project"
    └── Links to: Hub, AI Services, App Insights

How It Works

1. Individual AVM Resources

Deploy using proven AVM modules with full control:

  • Log Analytics: Centralized logging
  • Application Insights: APM and telemetry
  • AI Services: Cognitive Services (multi-service, includes OpenAI)
  • Private DNS Zone: Name resolution for private endpoints

2. AI Foundry Hub

Lightweight wrapper resource that:

  • Registers AI Services with the AI Foundry portal
  • Provides a workspace for project management
  • Links to Application Insights for monitoring
  • Zero Trust: publicNetworkAccess: 'Disabled'

resource aiFoundryHub 'Microsoft.MachineLearningServices/workspaces@2024-04-01' = {
  name: aiFoundryHubName
  location: location
  kind: 'Hub'
  identity: {
    type: 'SystemAssigned' // required for access to associated resources
  }
  properties: {
    description: 'AI Foundry Hub for Loan Defenders'
    publicNetworkAccess: 'Disabled'
    // Reuse existing security-stage resources (parameter names illustrative)
    keyVault: keyVaultResourceId
    storageAccount: storageAccountResourceId
    applicationInsights: appInsights.outputs.resourceId
  }
}

3. AI Foundry Project

Project resource within the Hub that:

  • Provides the UI/portal interface in ai.azure.com
  • Links to the Hub for resource management
  • Enables model deployment via the portal
  • Zero Trust: publicNetworkAccess: 'Disabled'

resource aiFoundryProject 'Microsoft.MachineLearningServices/workspaces@2024-04-01' = {
  name: '${aiFoundryHubName}-project'
  location: location
  kind: 'Project'
  identity: {
    type: 'SystemAssigned'
  }
  properties: {
    description: 'AI project for Loan Defenders multi-agent system'
    hubResourceId: aiFoundryHub.id
    publicNetworkAccess: 'Disabled'
  }
}

Security Configuration

Private Endpoints:

privateEndpoints: [
  {
    name: '${aiServicesName}-pe'
    subnetResourceId: privateEndpointsSubnetId
    privateDnsZoneGroup: {
      privateDnsZoneGroupConfigs: [
        {
          name: 'cognitive-services-dns'
          privateDnsZoneResourceId: privateDnsZone.outputs.resourceId
        }
      ]
    }
  }
]

RBAC (Managed Identity access):

roleAssignments: [
  {
    principalId: managedIdentityPrincipalId
    principalType: 'ServicePrincipal'
    roleDefinitionIdOrName: 'Cognitive Services OpenAI User'
  }
  {
    principalId: managedIdentityPrincipalId
    principalType: 'ServicePrincipal'
    roleDefinitionIdOrName: 'Cognitive Services User'
  }
]

API Endpoint Usage

Secure Endpoint for API Consumption:

// Output from ai-services.bicep
output aiServicesEndpoint string = aiServices.outputs.endpoint
output aiServicesPrivateEndpointFqdn string = '${aiServicesName}.cognitiveservices.azure.com'

API Configuration (apps/api/loan_defenders/api/config.py):

# Use private endpoint FQDN (resolves to private IP via DNS zone)
azure_ai_endpoint: str = "https://<ai-services-name>.cognitiveservices.azure.com"

How API Connects:

  1. API container uses Managed Identity (no keys)
  2. DNS query for <name>.cognitiveservices.azure.com
  3. Private DNS zone resolves to a private IP (10.0.3.x)
  4. Traffic stays within the VNet (Azure backbone)
  5. Managed Identity is authenticated by Azure AD
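The connection flow above can be sketched in Python. This is a minimal sketch assuming the azure-identity and openai packages; the helper names (private_fqdn, make_client) are illustrative, not taken from the actual codebase:

```python
def private_fqdn(ai_services_name: str) -> str:
    """Private-endpoint FQDN; the linked private DNS zone resolves it to a
    private IP (10.0.3.x), so traffic never leaves the VNet."""
    return f"https://{ai_services_name}.cognitiveservices.azure.com"


def make_client(ai_services_name: str):
    """Build an AzureOpenAI client authenticated with Managed Identity.

    No API keys anywhere: DefaultAzureCredential picks up the container's
    Managed Identity when running in Azure (or developer credentials
    locally), and Azure AD validates the token on each request.
    """
    from azure.identity import DefaultAzureCredential, get_bearer_token_provider
    from openai import AzureOpenAI

    token_provider = get_bearer_token_provider(
        DefaultAzureCredential(),
        "https://cognitiveservices.azure.com/.default",
    )
    return AzureOpenAI(
        azure_endpoint=private_fqdn(ai_services_name),
        azure_ad_token_provider=token_provider,
        api_version="2024-12-01-preview",
    )
```

Because the FQDN resolves privately inside the VNet, the same client code works unchanged whether the resolved IP is public or private.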

Model Deployment Strategy

Hybrid Deployment Approach (production via Bicep, experimentation via the portal):

1. Production Models → Bicep (Infrastructure as Code):

// Dev environment: GPT-4.1 deployed automatically
param modelDeployments array = [
  {
    name: 'gpt-4.1'
    model: {
      format: 'OpenAI'
      name: 'gpt-4.1'
      version: '2025-04-14'
    }
    sku: { name: 'GlobalStandard', capacity: 10 }  // Global deployment for better availability
  }
]

// Prod environment: Empty by default - admin configures based on architecture
param modelDeployments array = []

2. Experimental Models → Portal (Testing and Validation):

  • Access the portal at https://ai.azure.com
  • Portal → "Deployments" → "Create new deployment"
  • Select a model from the catalog (test newer versions)
  • Configure: name, version, capacity
  • Portal deploys to the AI Services resource
  • No Bicep changes needed - fully dynamic

API Configuration:

# Use deployed model with correct API version
azure_ai_deployment_name: str = "gpt-4.1"
azure_openai_api_version: str = "2024-12-01-preview"
azure_ai_endpoint: str = "https://<name>.cognitiveservices.azure.com"
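A stdlib-only sketch of how these settings could be loaded with environment-variable overrides (the environment-variable names are assumptions, not taken from the actual config.py):

```python
import os
from dataclasses import dataclass, field


def _env(name: str, default: str) -> str:
    # Environment variable wins; otherwise fall back to the documented default.
    return os.environ.get(name, default)


@dataclass
class AISettings:
    """AI endpoint settings with the defaults documented above."""

    azure_ai_deployment_name: str = field(
        default_factory=lambda: _env("AZURE_AI_DEPLOYMENT_NAME", "gpt-4.1"))
    azure_openai_api_version: str = field(
        default_factory=lambda: _env("AZURE_OPENAI_API_VERSION",
                                     "2024-12-01-preview"))
    azure_ai_endpoint: str = field(
        default_factory=lambda: _env("AZURE_AI_ENDPOINT",
                                     "https://<name>.cognitiveservices.azure.com"))
```

Keeping the deployment name in configuration rather than code is what lets models deployed via the portal be consumed without any infrastructure or code change.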

Benefits:

  • ✅ Bicep for stable production deployments (version controlled)
  • ✅ Portal for testing and experimentation (dynamic, flexible)
  • ✅ No hardcoded models in infrastructure
  • ✅ Usage and performance monitoring in the portal
  • ✅ Future: prompt flow, RAG, fine-tuning, Azure AI Search

Consequences

Benefits

  1. ✅ Full Control: Individual AVM modules give precise configuration control
  2. ✅ Staged Deployment: Reuses existing Key Vault, Storage, VNet, Managed Identity
  3. ✅ AI Foundry Portal: Full portal access for model management at https://ai.azure.com
  4. ✅ Model Flexibility: Deploy any models dynamically via portal (not hardcoded)
  5. ✅ Secure Endpoints: Private endpoints with Zero Trust architecture
  6. ✅ RBAC Automated: Managed Identity permissions configured in Bicep
  7. ✅ Future-Ready: Easy to add Azure AI Search, RAG, prompt flow when needed
  8. ✅ Cost Efficient: Only deploy resources we actually use

Trade-offs

  1. ⚠️ More Resources: Hub + Project resources added (lightweight, minimal cost ~$5/month)
  2. ⚠️ Manual Hub/Project: AI Foundry Hub and Project not available as AVM modules yet
  3. ⚠️ Two Deployment Approaches: Individual AVMs for resources + custom resources for Hub/Project

Comparison to Pattern Module

| Aspect            | Pattern Module                   | Our Approach                          |
|-------------------|----------------------------------|---------------------------------------|
| Resource Reuse    | ❌ Creates new Key Vault/Storage | ✅ Reuses existing from security stage |
| Control           | ⚠️ Limited config options        | ✅ Full control per resource           |
| Staged Deployment | ❌ All-or-nothing                | ✅ Works with staged approach          |
| Portal Access     | ✅ Included                      | ✅ Included (Hub + Project)            |
| Model Deployment  | ⚠️ Configured in Bicep           | ✅ Dynamic via portal                  |
| Endpoints         | ⚠️ Limited outputs               | ✅ Full endpoint details               |
| Zero Trust        | ✅ Supported                     | ✅ Fully implemented                   |

Implementation

Module Structure

File: infrastructure/bicep/modules/ai-services.bicep

// Individual AVM modules
module logAnalytics 'br/public:avm/res/operational-insights/workspace:0.12.0'
module appInsights 'br/public:avm/res/insights/component:0.6.0'
module aiServices 'br/public:avm/res/cognitive-services/account:0.13.2'
module privateDnsZone 'br/public:avm/res/network/private-dns-zone:0.7.0'

// Optional: Azure AI Search for RAG (disabled by default)
module aiSearch 'br/public:avm/res/search/search-service:0.8.0' = if (deployAISearch)

// Custom resources (no AVM available yet)
resource aiFoundryHub 'Microsoft.MachineLearningServices/workspaces@2024-04-01'
resource aiFoundryProject 'Microsoft.MachineLearningServices/workspaces@2024-04-01'

// Model deployments (optional - can also deploy via portal)
resource modelDeployment 'Microsoft.CognitiveServices/accounts/deployments@2023-05-01' = [for deployment in modelDeployments: {...}]

Configuration Parameters:

  • modelDeployments: Array of models to deploy via Bicep (empty for portal-only deployment)
  • deployAISearch: Boolean to enable Azure AI Search for future RAG capabilities (default: false)

Deployment Sequence

Stage 1: Foundation (networking)
Stage 2: Security (Key Vault, Storage, Identity)
Stage 3: AI Services (AI Services, Hub, Project, private endpoints, model deployments)
Stage 4: Apps (Container Apps)

AI Stage deploys:

  1. Log Analytics Workspace
  2. Application Insights
  3. Private DNS Zone for Cognitive Services
  4. Azure AI Services with private endpoint
  5. AI Foundry Hub (links to App Insights)
  6. AI Foundry Project (links to Hub)
  7. Model deployments (optional - configurable via parameters)
  8. Azure AI Search (optional - disabled by default)

Default Configuration:

Dev Environment (environments/dev.parameters.json):

  • GPT-4.1 (April 2025 version) deployed automatically
  • Model version: 2025-04-14
  • API version: 2024-12-01-preview
  • 10K tokens-per-minute capacity
  • Azure AI Search: Disabled (can be enabled when needed)

Prod Environment (environments/prod.parameters.json):

  • Empty model deployments array (administrator configures based on architectural requirements)
  • Azure AI Search: Disabled

Validation

  • ✅ Bicep Compilation: Template compiles successfully with no errors
  • ✅ Private Endpoints: AI Services configured with a private endpoint
  • ✅ RBAC: Managed Identity has the Cognitive Services roles
  • ✅ DNS Resolution: Private DNS zone linked to the VNet
  • ✅ Portal Integration: Hub and Project resources created
  • ✅ Secure Endpoints: Public network access disabled
  • ✅ Model Deployments: GPT-4.1 deployed automatically in the dev environment
  • ✅ Azure AI Search: Module added but disabled by default (ready for RAG)

Decision Log

  1. Evaluated AI Foundry pattern module - Found incompatible with staged deployment
  2. Discovered Hub + Project approach - Lightweight resources provide portal access
  3. Tested individual AVMs + custom resources - Successful compilation and validation
  4. Accepted hybrid approach - Individual AVMs where available, custom resources where needed

Notes

  • AI Foundry Hub and Project are NOT pattern modules - they're simple resource wrappers
  • Models can be deployed via Bicep (automated) or portal (dynamic) - hybrid approach recommended
  • Dev environment: GPT-4.1 deployed automatically for immediate use
  • Prod environment: Empty by default - administrator configures based on architecture
  • API version 2024-12-01-preview required for GPT-4.1 deployments
  • Models deployed via portal appear immediately in AI Services (no infrastructure changes)
  • Portal URL https://ai.azure.com works with ANY Azure AI Services (not just pattern module deployments)
  • Azure AI Search included but disabled by default (ready for future RAG implementation)
  • Future AVM modules may become available for Hub and Project resources