ADR-061: Enhanced Audit Trail Logging with Automatic Application ID Injection
- Status: Accepted
- Date: 2025-11-28
- Deciders: Engineering Team
- Supersedes: Extends ADR-057 (Unified Observability)
Context
The multi-agent loan processing system needed enhanced audit trail capabilities for:
- Regulatory compliance (all loan decisions must be traceable)
- Debugging distributed agent workflows
- Correlating logs across 5+ components (API, 3 MCP servers, agents)
- PII protection while maintaining audit completeness

Problems with the initial ADR-057 implementation:
- No automatic correlation of logs by loan application
- Difficult to filter all logs for a specific loan application
- MCP servers ran as separate processes with no context sharing
- Financial data (income, loan amounts) logged in plain text
Goal: Enable one-click filtering of all logs by loan.application_id across all system components.
Decision
1. Automatic Application ID Injection via ContextVars
Use Python's contextvars to automatically inject loan.application_id into ALL log records:
    import logging
    from contextvars import ContextVar

    # Module-level context variable holding the current loan application ID
    loan_application_id_var: ContextVar[str | None] = ContextVar("loan_application_id", default=None)

    class JsonExtraFormatter(logging.Formatter):
        def format(self, record: logging.LogRecord) -> str:
            # Auto-inject into every log record
            loan_app_id = loan_application_id_var.get()
            if loan_app_id:
                record.__dict__["loan.application_id"] = loan_app_id
            return super().format(record)
2. Context Manager for Safe Cleanup
Ensure context cleanup even on exceptions or async cancellation:
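    from contextlib import contextmanager

    @contextmanager
    def loan_application_context(application_id: str):
        """Context manager for safe loan application ID cleanup."""
        # set() returns a token that records the previous value
        token = loan_application_id_var.set(application_id)
        try:
            yield
        finally:
            # Runs on normal exit, exceptions, and cancellation
            loan_application_id_var.reset(token)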
3. OTEL Span Events for Audit Trail
Use OpenTelemetry span events instead of custom AuditLogger:
    from opentelemetry import trace

    # Attach the audit event to whichever span is currently active
    span = trace.get_current_span()
    span.add_event("workflow_started", {
        "loan.application_id": application.application_id,
        "loan_amount_range": Observability.mask_loan_amount(application.loan_amount),
    })
4. MCP Server Correlation via Parameter
Pass application_id to all MCP tool calls for cross-process correlation:
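    # In agent personas (Markdown instruction, not Python):
    #   Pass `application_id` for log correlation.

    # In MCP server tools
    @mcp.tool()
    def validate_basic_parameters(application_data: dict, application_id: str | None = None):
        # Tag all logs emitted by this tool call with the caller's application ID
        if application_id:
            loan_application_id_var.set(application_id)
        # ... tool logic ...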
5. Financial Data Masking
New masking methods for financial audit data:
| Method | Example Input | Example Output |
|---|---|---|
| `mask_income(125000)` | 125000 | "$100k-$150k" |
| `mask_loan_amount(350000)` | 350000 | "$300k-$400k" |
| `mask_credit_score(720)` | 720 | "700-749 (good)" |
| `mask_dti_ratio(0.35)` | 0.35 | "30-40% (moderate)" |
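The table gives one example per method but not the bucketing rules. As a rough sketch, fixed-width buckets reproduce the two dollar-amount examples. The bucket widths ($50k for income, $100k for loan amounts), the helper `_mask_to_range`, and the handling of negative input are assumptions inferred from those single examples, not the actual `Observability` implementation. The credit score and DTI methods are omitted because their labels for other ranges are not documented here.

```python
def _mask_to_range(value: float, bucket_width: int) -> str:
    """Return the fixed-width bucket containing value, e.g. "$100k-$150k"."""
    if value < 0:
        raise ValueError("amount must be non-negative")
    # Floor to the start of the bucket; the upper bound is exclusive.
    lower = int(value // bucket_width) * bucket_width
    upper = lower + bucket_width
    return f"${lower // 1000}k-${upper // 1000}k"


def mask_income(income: float) -> str:
    # Assumption: $50k buckets, inferred from the single example in the table.
    return _mask_to_range(income, 50_000)


def mask_loan_amount(amount: float) -> str:
    # Assumption: $100k buckets, inferred from the single example in the table.
    return _mask_to_range(amount, 100_000)


income_range = mask_income(125000)
loan_range = mask_loan_amount(350000)
```

Under these assumptions a value on a boundary falls into the higher bucket, so `mask_income(150000)` would be reported as "$150k-$200k".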
Architecture
    ┌─────────────────────────────────────────────────────────────────┐
    │                    Loan Application Request                     │
    └─────────────────────────────────────────────────────────────────┘
                                     │
                                     ▼
    ┌─────────────────────────────────────────────────────────────────┐
    │                          API (FastAPI)                          │
    │ ┌─────────────────────────────────────────────────────────────┐ │
    │ │ loan_application_context(application_id)                    │ │
    │ │   - Sets ContextVar                                         │ │
    │ │   - All subsequent logs auto-tagged                         │ │
    │ └─────────────────────────────────────────────────────────────┘ │
    └─────────────────────────────────────────────────────────────────┘
                                     │
               ┌─────────────────────┼─────────────────────┐
               ▼                     ▼                     ▼
       ┌───────────────┐     ┌───────────────┐     ┌───────────────┐
       │ MCP Server 1  │     │ MCP Server 2  │     │ MCP Server 3  │
       │    (8010)     │     │    (8011)     │     │    (8012)     │
       │               │     │               │     │               │
       │ application_id│     │ application_id│     │ application_id│
       │ param injected│     │ param injected│     │ param injected│
       └───────────────┘     └───────────────┘     └───────────────┘
               │                     │                     │
               └─────────────────────┼─────────────────────┘
                                     ▼
    ┌─────────────────────────────────────────────────────────────────┐
    │                       Seq / Azure Monitor                       │
    │          Filter: loan.application_id = "LN0123456789"           │
    │           Shows ALL logs from API + all 3 MCP servers           │
    └─────────────────────────────────────────────────────────────────┘
Implementation
Files Created/Modified
New Files:
- `loan_defenders_utils/src/loan_defenders_utils/workflow_observability.py` - WorkflowLogger class

Modified Files:
- `loan_defenders_utils/src/loan_defenders_utils/observability.py`
  - Added ContextVars for loan_application_id
  - Added JsonExtraFormatter with auto-injection
  - Added financial masking methods
  - Added context manager
- `apps/api/loan_defenders/orchestrators/sequential_pipeline.py`
  - Use loan_application_context() context manager
  - Add OTEL span events for audit trail
- `apps/mcp_servers/*/server.py` (all 3)
  - Add application_id parameter to all tools
  - Set ContextVar on tool invocation
- `apps/api/loan_defenders/agents/agent-persona/*.md` (all 4)
  - Document application_id parameter in tool signatures
Key Code Changes
    # sequential_pipeline.py
    async def process_application(self, application: LoanApplication):
        with Observability.loan_application_context(application.application_id):
            span = trace.get_current_span()
            span.add_event("workflow_started", {
                "loan.application_id": application.application_id,
                "loan_amount_range": Observability.mask_loan_amount(application.loan_amount),
            })
            # ... process agents ...
Consequences
Positive
- One-click filtering: Query `loan.application_id = "LN..."` shows all logs
- Cross-process correlation: MCP server logs are linked to loan applications
- PII protection: Financial data logged as ranges, not exact values
- OTEL integration: Audit events are span events, auto-exported to Azure Monitor
- Safe cleanup: Context manager prevents ID leakage between requests
Negative
- Parameter overhead: MCP tools require application_id parameter
- Agent persona updates: Each persona must document the parameter
- Slight complexity: Debugging requires understanding how ContextVars are set, reset, and propagated
Risks & Mitigations
| Risk | Mitigation |
|---|---|
| Context leakage | Context manager with try/finally |
| MCP server logs lack the application ID | `application_id` is part of every tool signature and documented in each agent persona |
| Performance impact | ContextVar lookup is O(1), negligible |
Rejected Alternatives
- Custom AuditLogger class: Initially implemented, then replaced with OTEL span events
  - OTEL events auto-export to Azure Monitor
  - Less custom code to maintain
  - Follows industry standards
- Request-scoped logging: Would require significant refactoring
  - ContextVars work with async/await natively
  - No framework changes needed
- Database-only audit: Not queryable in real-time
  - Logs + OTEL provide immediate visibility
  - Database is the source of truth for compliance
Usage Examples
Filtering in Seq
Filtering by loan.application_id shows all logs from:
- API endpoint receiving request
- Sequential pipeline orchestration
- Each agent's processing
- All MCP server tool calls
- Final decision logging
Filtering in Azure Monitor (KQL)
Related Documentation
- ADR-057: Unified Observability and Logging Strategy
- ADR-060: Agent Persona Token Optimization
- PR #197: Implementation
- Issue #196: Enhance logging with business-level audit trail
- Status: Accepted and Implemented
- Implementation Date: 2025-11-28
- Production Deployment: Ready
- Next Review: 2026-02-28