ADR-061: Enhanced Audit Trail Logging with Automatic Application ID Injection

Status: Accepted Date: 2025-11-28 Deciders: Engineering Team Supersedes: Extends ADR-057 (Unified Observability)

Context

The multi-agent loan processing system needed enhanced audit trail capabilities for: - Regulatory compliance (all loan decisions must be traceable) - Debugging distributed agent workflows - Correlating logs across 5+ components (API, 3 MCP servers, agents) - PII protection while maintaining audit completeness

Problems with initial ADR-057 implementation: - No automatic correlation of logs by loan application - Difficult to filter all logs for a specific loan application - MCP servers ran as separate processes with no context sharing - Financial data (income, loan amounts) logged in plain text

Goal: Enable one-click filtering of all logs by loan.application_id across all system components.

Decision

1. Automatic Application ID Injection via ContextVars

Use Python's contextvars to automatically inject loan.application_id into ALL log records:

# Module-level context variables
loan_application_id_var: ContextVar[str | None] = ContextVar("loan_application_id", default=None)

class JsonExtraFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        # Auto-inject into every log record
        loan_app_id = loan_application_id_var.get()
        if loan_app_id:
            record.__dict__["loan.application_id"] = loan_app_id
        return super().format(record)

2. Context Manager for Safe Cleanup

Ensure context cleanup even on exceptions or async cancellation:

@contextmanager
def loan_application_context(application_id: str):
    """Context manager for safe loan application ID cleanup."""
    token = loan_application_id_var.set(application_id)
    try:
        yield
    finally:
        loan_application_id_var.reset(token)

3. OTEL Span Events for Audit Trail

Use OpenTelemetry span events instead of custom AuditLogger:

span = trace.get_current_span()
span.add_event("workflow_started", {
    "loan.application_id": application.application_id,
    "loan_amount_range": Observability.mask_loan_amount(application.loan_amount),
})

4. MCP Server Correlation via Parameter

Pass application_id to all MCP tool calls for cross-process correlation:

# In agent personas
Pass `application_id` for log correlation.

# In MCP server tools
@mcp.tool()
def validate_basic_parameters(application_data: dict, application_id: str = None):
    if application_id:
        loan_application_id_var.set(application_id)

5. Financial Data Masking

New masking methods for financial audit data:

Method	Example Input	Example Output
`mask_income(125000)`	125000	"$100k-$150k"
`mask_loan_amount(350000)`	350000	"$300k-$400k"
`mask_credit_score(720)`	720	"700-749 (good)"
`mask_dti_ratio(0.35)`	0.35	"30-40% (moderate)"

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    Loan Application Request                      │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│  API (FastAPI)                                                   │
│  ┌─────────────────────────────────────────────────────────────┐ │
│  │  loan_application_context(application_id)                   │ │
│  │  - Sets ContextVar                                          │ │
│  │  - All subsequent logs auto-tagged                          │ │
│  └─────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
                              │
        ┌─────────────────────┼─────────────────────┐
        ▼                     ▼                     ▼
┌───────────────┐     ┌───────────────┐     ┌───────────────┐
│ MCP Server 1  │     │ MCP Server 2  │     │ MCP Server 3  │
│ (8010)        │     │ (8011)        │     │ (8012)        │
│               │     │               │     │               │
│ application_id│     │ application_id│     │ application_id│
│ param injected│     │ param injected│     │ param injected│
└───────────────┘     └───────────────┘     └───────────────┘
        │                     │                     │
        └─────────────────────┼─────────────────────┘
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│  Seq / Azure Monitor                                             │
│  Filter: loan.application_id = "LN0123456789"                   │
│  Shows ALL logs from API + all 3 MCP servers                    │
└─────────────────────────────────────────────────────────────────┘

Implementation

Files Created/Modified

New Files: - loan_defenders_utils/src/loan_defenders_utils/workflow_observability.py - WorkflowLogger class

Modified Files: - loan_defenders_utils/src/loan_defenders_utils/observability.py - Added ContextVars for loan_application_id - Added JsonExtraFormatter with auto-injection - Added financial masking methods - Added context manager - apps/api/loan_defenders/orchestrators/sequential_pipeline.py - Use loan_application_context() context manager - Add OTEL span events for audit trail - apps/mcp_servers/*/server.py (all 3) - Add application_id parameter to all tools - Set ContextVar on tool invocation - apps/api/loan_defenders/agents/agent-persona/*.md (all 4) - Document application_id parameter in tool signatures

Key Code Changes

# sequential_pipeline.py
async def process_application(self, application: LoanApplication):
    with Observability.loan_application_context(application.application_id):
        span = trace.get_current_span()
        span.add_event("workflow_started", {
            "loan.application_id": application.application_id,
            "loan_amount_range": Observability.mask_loan_amount(application.loan_amount),
        })
        # ... process agents ...

Consequences

Positive

One-click filtering: Query loan.application_id = "LN..." shows all logs
Cross-process correlation: MCP servers logs linked to loan applications
PII protection: Financial data logged as ranges, not exact values
OTEL integration: Audit events are span events, auto-exported to Azure Monitor
Safe cleanup: Context manager prevents ID leakage between requests

Negative

Parameter overhead: MCP tools require application_id parameter
Agent persona updates: Each persona must document the parameter
Slight complexity: ContextVars require understanding for debugging

Risks & Mitigations

Risk	Mitigation
Context leakage	Context manager with try/finally
MCP server forgets to log	Parameter is required in tool signature
Performance impact	ContextVar lookup is O(1), negligible

Rejected Alternatives

Custom AuditLogger class: Initially implemented, then replaced with OTEL span events
OTEL events auto-export to Azure Monitor
Less custom code to maintain
Follows industry standards
Request-scoped logging: Would require significant refactoring
ContextVars work with async/await natively
No framework changes needed
Database-only audit: Not queryable in real-time
Logs + OTEL provide immediate visibility
Database is the source of truth for compliance

Usage Examples

Filtering in Seq

loan.application_id = "LN0123456789"

Shows all logs from: - API endpoint receiving request - Sequential pipeline orchestration - Each agent's processing - All MCP server tool calls - Final decision logging

Filtering in Azure Monitor (KQL)

traces
| where customDimensions["loan.application_id"] == "LN0123456789"
| order by timestamp asc

ADR-057: Unified Observability and Logging Strategy
ADR-060: Agent Persona Token Optimization
PR #197: Implementation
Issue #196: Enhance logging with business-level audit trail

Status: Accepted and Implemented Implementation Date: 2025-11-28 Production Deployment: Ready Next Review: 2026-02-28