
ADR-038: Service Principal Least Privilege for Docker Development

Status: Accepted
Date: 2025-10-14
Deciders: Development Team
Tags: security, docker, azure, authentication

Context

We do not use Azure CLI authentication (az login) inside Docker containers because it:

  1. Requires mounting the .azure folder (platform-specific, unreliable)
  2. Requires installing the Azure CLI binary in containers (+500MB per image)
  3. Uses the developer's personal credentials (broad Owner role on the subscription)
  4. Follows a different authentication pattern from production (which uses Managed Identity)

We needed a secure, production-aligned authentication method for Docker development that follows the principle of least privilege.

Research Findings

Endjin Blog Approach (https://endjin.com/blog/2022/09/using-azcli-authentication-within-local-containers):

  • Addresses a Windows-specific token encryption issue in Azure CLI v2.33+
  • Solution: Mount the .azure folder into containers
  • Why it doesn't work for us:
    • We're on macOS (no encryption issue)
    • Requires the az binary in the container (we don't install it)
    • AzureCliCredential needs the binary to refresh tokens
    • Adds 500MB+ to each container image
    • Uses the developer's broad permissions (Owner role)

Service Principal vs Azure CLI Comparison:

| Aspect | Azure CLI Mount | Service Principal (Our Choice) |
|---|---|---|
| Container Size | +500MB (requires Azure CLI binary) | +0MB |
| User Permissions | Owner (entire subscription) | Scoped to one resource |
| Least Privilege | ❌ Broad access | ✅ Inference only |
| Production Pattern | Different (uses Managed Identity) | Same (EnvironmentCredential) |
| Platform Support | Platform-specific mounts | Works everywhere |
| Token Management | Expires hourly | Auto-refreshes |

Decision

Use Service Principal with "Cognitive Services OpenAI User" role for Docker container authentication.

Implementation

  1. Create Service Principal: ./scripts/create-service-principal.sh
    • Automates service principal creation
    • Assigns least-privilege role
    • Adds credentials to .env automatically

  2. Least Privilege Role: "Cognitive Services OpenAI User"
    • CAN:
      • Call AI inference APIs (GPT-4 completions)
      • Read model deployments and configurations
    • CANNOT:
      • List or read API keys (listkeys/action permission denied)
      • Create/modify/delete deployments
      • Access other Azure resources
    • Scoped: Single AI Foundry resource only (not entire subscription)

  3. Token Rotation: 90-day minimum
    • Service principal secrets shown only once
    • Default: No expiration (rotate manually)
    • Production: Use Azure Key Vault for automatic rotation
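The ADR does not show the contents of create-service-principal.sh, so the following is only a sketch of how a role assignment can be scoped to a single resource rather than the subscription. It assumes the AI Foundry resource is a Microsoft.CognitiveServices/accounts resource; the helper names and the placeholder IDs are hypothetical.

```python
import shlex

# Role chosen in this ADR.
ROLE_NAME = "Cognitive Services OpenAI User"


def resource_scope(subscription_id: str, resource_group: str, account_name: str) -> str:
    """Build the resource ID used as the scope of the role assignment.

    Assumption: the target is a Microsoft.CognitiveServices/accounts resource.
    """
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.CognitiveServices/accounts/{account_name}"
    )


def role_assignment_command(app_id: str, scope: str) -> list[str]:
    """Return the az CLI arguments that grant the role on that single resource."""
    return [
        "az", "role", "assignment", "create",
        "--assignee", app_id,
        "--role", ROLE_NAME,
        "--scope", scope,
    ]


if __name__ == "__main__":
    # Placeholder values; substitute real IDs before running the printed command.
    scope = resource_scope("<subscription-id>", "<resource-group>", "<ai-foundry-resource>")
    print(shlex.join(role_assignment_command("<app-id>", scope)))
```

Passing a resource-level scope instead of a subscription ID is what keeps the service principal from reaching other Azure resources.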

Role Comparison

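| Role | List Keys? | Modify Deployments? | Inference? | Use Case |
|---|---|---|---|---|
| Cognitive Services OpenAI User | ❌ No | ❌ No | ✅ Yes | Our choice - Least privilege |
| Cognitive Services User | ✅ Yes | ❌ No | ✅ Yes | Too much (can list keys) |
| Cognitive Services Contributor | ✅ Yes | ✅ Yes | | Too much (can modify) |
| Owner (Azure CLI user) | ✅ Yes | ✅ Yes | | WAY too much (entire subscription) |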

Authentication Chain

The application uses DefaultAzureCredential, which tries a sequence of credentials in order. The three that matter for this project are:

  1. EnvironmentCredential (Docker containers) - Service Principal via:
    • AZURE_TENANT_ID
    • AZURE_CLIENT_ID
    • AZURE_CLIENT_SECRET

  2. ManagedIdentityCredential (Azure production) - Automatic, no configuration

  3. AzureCliCredential (local development) - Uses az login, no configuration

This ensures the same authentication code works across all environments.
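A minimal sketch of how application code might rely on this chain, assuming the Python azure-identity package (the one the DefaultAzureCredential reference points to). The helper names missing_sp_vars and build_credential are hypothetical, not taken from the project; the environment check is only a convenience for producing a clearer message inside Docker.

```python
import os
from typing import Mapping

# Variables read by EnvironmentCredential for a service principal with a secret.
SP_ENV_VARS = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")


def missing_sp_vars(env: Mapping[str, str]) -> list[str]:
    """Return the service principal variables that are unset or empty."""
    return [name for name in SP_ENV_VARS if not env.get(name)]


def build_credential():
    """Create the credential chain; the same call is used in every environment."""
    # Imported lazily so the check above can run without azure-identity installed.
    from azure.identity import DefaultAzureCredential

    return DefaultAzureCredential()


if __name__ == "__main__":
    missing = missing_sp_vars(os.environ)
    if missing:
        print("EnvironmentCredential is not configured; missing:", ", ".join(missing))
    else:
        print("EnvironmentCredential is fully configured.")
```

When any of the three variables is missing, the chain falls through to the later credentials, which is why the same code also works with Managed Identity in production and az login on a developer machine.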

Consequences

Positive

  1. Least Privilege: Service principal can only call inference APIs and read deployment metadata
    • Cannot list API keys
    • Cannot modify deployments
    • Cannot access other Azure resources

  2. Production-Aligned: Same credential chain as production
    • EnvironmentCredential in Docker
    • ManagedIdentityCredential in Azure Container Apps
    • Smooth transition from dev to prod

  3. Smaller Containers: No Azure CLI binary needed (saves roughly 500MB per image)

  4. Cross-Platform: Works on macOS, Linux, Windows (no platform-specific mounts)

  5. Automatic Token Refresh: Service principal tokens auto-refresh (unlike Azure CLI hourly expiration)

  6. Scoped Access: Service principal limited to single AI Foundry resource
    • Not entire subscription (like Azure CLI user's Owner role)

Negative

  1. Manual Setup Required: Developers must run ./scripts/create-service-principal.sh once
    • Azure CLI user permissions needed: Owner, Application Administrator, or User Access Administrator
    • More initial setup than az login alone

  2. Credential Management: Service principal secrets must be protected
    • Stored in .env file (git-ignored)
    • Shown only once, cannot be retrieved later
    • Must be rotated every 90 days

  3. Permission Requirements: Not all developers can create service principals
    • Requires Azure AD permissions
    • May need Azure admin assistance in enterprise environments

Mitigations

  1. Automated Script: ./scripts/create-service-principal.sh handles all complexity
    • Prompts for resource name
    • Creates service principal with correct role
    • Updates .env file automatically
    • Shows clear security warnings

  2. Clear Documentation: docs/getting-started/docker-development.md
    • Prerequisites section lists required Azure roles
    • Step-by-step troubleshooting for permission errors
    • Security best practices section

  3. Token Rotation Guide: Documentation includes rotation instructions
    • Option 1: Delete and recreate (simpler)
    • Option 2: Create new secret (keeps existing service principal)
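The script's handling of .env is not shown in this ADR. As an illustration of what "updates .env automatically" could involve, here is a sketch of an idempotent update that replaces existing keys in place and appends new ones. The function name upsert_env and the behavior are assumptions, not the script's actual logic.

```python
def upsert_env(text: str, updates: dict[str, str]) -> str:
    """Return .env content with the given keys replaced in place or appended.

    Comment lines and unrelated keys are kept unchanged.
    """
    remaining = dict(updates)
    out = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_assignment = "=" in line and not line.lstrip().startswith("#")
        if is_assignment and key in remaining:
            # Replace the old value so re-running does not create duplicates.
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)
    # Keys that were not already present are appended at the end.
    out.extend(f"{key}={value}" for key, value in remaining.items())
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    current = "APP_ENV=dev\nAZURE_CLIENT_ID=old-id\n"
    print(upsert_env(current, {"AZURE_CLIENT_ID": "new-id", "AZURE_TENANT_ID": "tenant"}), end="")
```

Replacing in place matters for rotation: after a credential reset, the old secret should not linger in the file next to the new one.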

User Permission Requirements

To create service principals and assign roles, developers need one of the following:

  • Owner role on the Azure subscription (full access)
  • Application Administrator role in Microsoft Entra ID (can create service principals)
  • User Access Administrator + Contributor roles on the subscription (combined)

Check your role:

az role assignment list --assignee $(az account show --query user.name -o tsv) --query '[].roleDefinitionName'

If you don't have these permissions, contact your Azure admin.
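The command above prints a JSON array of role names by default. A small sketch of interpreting that output follows; the function name has_required_subscription_roles is hypothetical. Note that the command lists Azure RBAC assignments only, so the Application Administrator directory role would not appear in it and is not checked here.

```python
import json


def has_required_subscription_roles(roles_json: str) -> bool:
    """Check the role names from `az role assignment list` against the prerequisites.

    True when the user is Owner, or holds both User Access Administrator
    and Contributor.
    """
    roles = set(json.loads(roles_json))
    if "Owner" in roles:
        return True
    return {"User Access Administrator", "Contributor"} <= roles


if __name__ == "__main__":
    # Example input in the shape the az command prints; the values are illustrative.
    sample = '["Contributor", "User Access Administrator"]'
    print(has_required_subscription_roles(sample))
```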

Security Best Practices

1. Least Privilege (✅ Implemented)

Our service principal uses the minimum permissions needed:

  • Role: "Cognitive Services OpenAI User" (inference only)
  • Scope: Single AI Foundry resource (not entire subscription)
  • Cannot list keys, modify deployments, or access other resources

2. Token Rotation

Service principal secrets should be rotated regularly:

  • Development: Every 90 days minimum
  • Production: Use Azure Key Vault for automatic rotation

Rotate credentials:

# Option 1: Delete the service principal and recreate it (simpler)
az ad sp delete --id $(az ad sp list --display-name "loan-defenders-docker-dev" --query '[0].appId' -o tsv)
./scripts/create-service-principal.sh

# Option 2: Create a new secret (keeps the existing service principal)
# Without --append, the reset replaces the old secret, so it stops working.
az ad sp credential reset --id $(az ad sp list --display-name "loan-defenders-docker-dev" --query '[0].appId' -o tsv)
# Update AZURE_CLIENT_SECRET in .env with the new secret from the command output
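To make the 90-day policy easy to check, a sketch of the date arithmetic is shown here. It assumes the developer records the date the secret was created; the ADR does not say where that date is kept, and the function names are hypothetical.

```python
from datetime import date, timedelta

# Rotation interval stated in this ADR for development secrets.
ROTATION_INTERVAL = timedelta(days=90)


def next_rotation(created: date) -> date:
    """Return the date by which the secret should be rotated."""
    return created + ROTATION_INTERVAL


def rotation_due(created: date, today: date) -> bool:
    """Return True once the secret is at least 90 days old."""
    return today - created >= ROTATION_INTERVAL


if __name__ == "__main__":
    created = date(2025, 10, 14)  # illustrative creation date
    print("Rotate by:", next_rotation(created).isoformat())
    print("Due today:", rotation_due(created, date.today()))
```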

3. Never Commit Credentials

Credential files are excluded from version control via .gitignore:

# .gitignore
# Credentials are stored in .env (comments must be on their own line)
.env
.env.local

If accidentally committed:

# IMMEDIATELY rotate credentials
az ad sp credential reset --id <app-id>

# Remove from git history (ask admin for help)
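As an extra safeguard before committing, a file's text can be scanned for a populated client secret. This is a sketch only: the function name find_secret_assignments is hypothetical, it looks for the AZURE_CLIENT_SECRET variable named in this ADR, and it will also flag placeholder values.

```python
import re

# Matches lines such as "AZURE_CLIENT_SECRET=abc" or "export AZURE_CLIENT_SECRET = abc".
# Commented-out lines and empty values do not match.
SECRET_LINE = re.compile(r"^\s*(?:export\s+)?AZURE_CLIENT_SECRET\s*=\s*\S+")


def find_secret_assignments(text: str) -> list[int]:
    """Return 1-based line numbers that assign a non-empty AZURE_CLIENT_SECRET."""
    return [
        lineno
        for lineno, line in enumerate(text.splitlines(), start=1)
        if SECRET_LINE.match(line)
    ]


if __name__ == "__main__":
    sample = "AZURE_TENANT_ID=tenant\nAZURE_CLIENT_SECRET=not-a-real-secret\n"
    for lineno in find_secret_assignments(sample):
        print(f"line {lineno}: AZURE_CLIENT_SECRET has a value; do not commit this file")
```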

4. Production: Use Managed Identity

For Azure Container Apps (production):

  • Don't use service principals
  • Use Managed Identity instead (no credentials!)
  • Automatically provided by Azure
  • More secure, no secrets to manage

Alternatives Considered

Alternative 1: Mount Azure CLI Credentials

Rejected - Requires Azure CLI binary in containers (+500MB), uses developer's broad Owner permissions, platform-specific

Alternative 2: API Key Authentication

Rejected - API keys are static shared secrets with no automatic rotation, grant full access to the resource rather than a scoped role, and Microsoft recommends Microsoft Entra ID authentication over keys

Alternative 3: Use "Cognitive Services User" Role

Rejected - Grants listkeys/action permission (can list API keys), violates least privilege principle

Alternative 4: Use "Cognitive Services Contributor" Role

Rejected - Can modify deployments, far more permissions than needed for inference

References

  • Azure RBAC Documentation: https://learn.microsoft.com/azure/role-based-access-control/
  • Cognitive Services Roles: https://learn.microsoft.com/azure/ai-services/openai/how-to/role-based-access-control
  • DefaultAzureCredential: https://learn.microsoft.com/python/api/azure-identity/azure.identity.defaultazurecredential
  • Endjin Blog (alternative approach): https://endjin.com/blog/2022/09/using-azcli-authentication-within-local-containers

Implementation

Created Files

  • scripts/create-service-principal.sh - Automated service principal creation script

Updated Files

  • docs/getting-started/docker-development.md - Docker setup documentation with least-privilege guidance
  • .env.example - Added comprehensive security comments and token rotation guidance

Deleted Files

  • docs/getting-started/docker-in-devcontainer.md - Removed Docker-in-Docker approach (not supported)

Related ADRs

  • ADR-037: Environment-Based Configuration - How environment variables are loaded
  • ADR-XXX: Azure Container Apps Deployment - Production Managed Identity usage

Lessons Learned

  1. Research before implementing: The Endjin blog approach looked promising but didn't fit our requirements
  2. Least privilege is achievable: Azure provides granular roles for specific use cases
  3. Document user requirements: Not all developers can create service principals - document prerequisites
  4. Automate security: Script handles complexity and enforces best practices
  5. Production alignment matters: Using the same DefaultAzureCredential chain in dev (EnvironmentCredential) and prod (ManagedIdentityCredential) reduces surprises