Lending AI Governance: Adverse Action, Fairness, and Replayable Credit Decisions | AI Governance
This post is part of the Regulated AI Implementation pillar.
Credit decisions are among the most governed AI outputs in existence. Consumer protection law, fair lending regulation, and model risk management guidance all impose specific requirements that go beyond standard AI performance monitoring.
The challenge is not that the requirements are unclear. It is that they are frequently misunderstood as monitoring requirements when they are actually design requirements. Adverse action documentation, fairness analysis, and replayability must be built into the system — they cannot be retrofitted after deployment.
This case study applies the Architecture of Proof framework to a production lending AI system.
What makes lending AI governance different
Three requirements distinguish lending AI from most other high-stakes AI applications.
Individual adverse action documentation. A credit denial must be explained to the specific individual who received it, using the specific factors that drove that decision. An aggregate model explanation ("the model weights income at 0.3") is not an adverse action notice. The notice must name the applicant's specific factors — and they must be factors that actually contributed to this specific decision, not features that generally matter to the model.
This requirement forces adverse action reason code generation to be a first-class system function, not a post-hoc report. The decision trace must capture which factors contributed most to the outcome at the time of decision.
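The capture step above can be sketched as a small function that runs inside the scoring pipeline. This is a hypothetical illustration: the factor codes and contribution values stand in for whatever attribution method the pipeline actually uses (for example, SHAP values), where negative contributions push the score down.

```python
# Hypothetical sketch: rank adverse factors at scoring time so they can be
# written into the decision trace, rather than reconstructed later.
def top_adverse_factors(contributions: dict, n: int = 4) -> list:
    """Return the n factors that most reduced the score, most impactful first."""
    negative = [(code, value) for code, value in contributions.items() if value < 0]
    negative.sort(key=lambda item: item[1])  # most negative contribution first
    return [{"factor_code": code, "impact": "negative"} for code, _ in negative[:n]]

# Illustrative per-decision contributions (assumed values, not real output).
contributions = {
    "AF-018": -0.12,  # time in employment
    "AF-042": -0.09,  # debt-to-income ratio
    "AF-007": -0.05,  # credit history length
    "AF-029": -0.04,  # recent delinquency
    "AF-003": 0.06,   # positive factor: excluded from adverse list
}
trace_factors = top_adverse_factors(contributions)
```

The ranked list is stored in the trace at decision time; a positive factor like `AF-003` never appears in the adverse list even though it influenced the score.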
Protected characteristic compliance. Lending models are legally prohibited from using protected characteristics (race, color, religion, sex, national origin, marital status, age) as inputs. But the prohibition extends to proxies — inputs that are strongly correlated with protected characteristics and serve as substitutes for them. Neighborhood, zip code, and surname patterns can introduce protected characteristic discrimination even when the protected attribute itself is excluded.
Fairness monitoring must run at the segment level — separately for protected groups — not just on overall accuracy. And it must run in production, not just on validation data.
Replayability for the full retention period. A credit decision made today may be challenged five years from now. The system must be able to reconstruct that specific decision — what inputs were used, which rules fired, which model version produced the score, and what reason codes were generated — using records created at decision time, not by re-running a current model on historical data.
Composite system design
Rules layer:

- Eligibility rules: hard cutoffs that determine whether the model is called at all (e.g., minimum time in business, maximum debt-to-income ratio, geographic eligibility)
- Regulatory rules: conditions that must always be enforced regardless of model score (e.g., sanctions screening, prohibited purpose checking)
- Decision rules: the thresholds that translate model scores into actions (approve, decline, counter-offer at a different term)
Model layer:

- Credit risk scoring model: predicts probability of default or serious delinquency over a defined horizon
- Income verification model: estimates income stability from behavioral and transaction data (where permitted)
- Feature attribution: generates the ranked list of contributing factors for each decision (integrated into the scoring pipeline, not computed separately)
Human layer:

- Internal credit review: handles appeals, exceptions, and cases in defined gray-zone score bands
- Model risk committee: conducts quarterly model reviews against performance contracts
- Compliance function: monitors fair lending metrics and initiates investigation when thresholds breach
Control tier assignment
| Decision Type | Tier | Conditions |
|---|---|---|
| Auto-approve | Tier 2 | Score above approval threshold + all eligibility rules pass + no regulatory flags |
| Auto-decline | Tier 2 | Score below decline threshold + no eligible counter-offer |
| Counter-offer routing | Tier 1 | Score in counter-offer band; system generates offer terms, human reviews before send |
| Manual review | Tier 0/3 | Score in gray zone, exception request, regulatory flag, or applicant appeal |
| Fraud/sanctions flag | Tier 3 | Always human: no automated credit action on flagged applications |
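The routing in the table can be expressed as ordered checks, with the non-negotiable conditions (fraud/sanctions, regulatory flags, appeals) evaluated before any score threshold. The thresholds and return labels below are illustrative assumptions, not the production values.

```python
# Illustrative tier-routing sketch for the control tier table.
# Thresholds (0.55 approve, 0.35 decline) are assumed example values.
def assign_tier(score: float, eligibility_pass: bool, regulatory_flag: bool,
                fraud_or_sanctions: bool, appeal: bool = False) -> str:
    if fraud_or_sanctions:
        return "tier3_manual"            # always human; no automated credit action
    if appeal or regulatory_flag or not eligibility_pass:
        return "tier0_manual_review"     # exceptions and flags route to humans
    if score >= 0.55:
        return "tier2_auto_approve"
    if score < 0.35:
        return "tier2_auto_decline"      # counter-offer eligibility check omitted here
    return "tier1_counter_offer"         # system drafts terms; human reviews before send
```

The ordering matters: a fraud flag must short-circuit everything else, so it is checked first regardless of score.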
Adverse action reason code architecture
This is the most frequently misdesigned element of lending AI systems.
Adverse action reason codes must be generated at the time of decision, using the actual inputs and model state that produced the outcome. They cannot be generated retroactively from a current model, because the model may have been retrained and will produce different feature attributions for the same input.
Implementation requirements:

1. Feature attribution is computed as part of the scoring pipeline — not as a separate post-hoc process
2. The top N adverse factors (typically 4) are captured in the decision trace along with their direction of impact
3. Factor codes are mapped to human-readable reason code language at decision time
4. The reason code set is reviewed by legal to ensure ECOA/Regulation B compliance
Worked example — adverse action trace:

```json
{
  "decision_id": "app-20240315-7782B",
  "applicant_id_hash": "sha256:a3f8b2...",
  "decision": "DECLINED",
  "model_score": 0.31,
  "score_threshold_for_approval": 0.55,
  "adverse_factors_ranked": [
    {"factor_code": "AF-018", "label": "Insufficient time in current employment", "impact": "negative"},
    {"factor_code": "AF-042", "label": "High ratio of debt to income", "impact": "negative"},
    {"factor_code": "AF-007", "label": "Limited credit history length", "impact": "negative"},
    {"factor_code": "AF-029", "label": "Recent delinquency on existing account", "impact": "negative"}
  ],
  "model_version": "risk-credit-v3.1.0",
  "reason_code_schema_version": "ecoa-v4",
  "payload_ref_id": "s3://lending-traces/2024/03/7782B.json.gz"
}
```
This record is sufficient to generate a compliant adverse action notice for this applicant for as long as the regulatory retention window requires.
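Generating the notice from the stored trace then becomes a pure rendering step. The template wording below is a minimal sketch; real notice language must come from the legally reviewed reason code set referenced above.

```python
# Minimal sketch: render notice text from the stored decision trace.
# The template wording is illustrative, not legally reviewed language.
def render_adverse_action_notice(trace: dict) -> str:
    lines = ["Your application was declined for the following reasons:"]
    for rank, factor in enumerate(trace["adverse_factors_ranked"], start=1):
        lines.append(f"{rank}. {factor['label']}")
    return "\n".join(lines)

trace = {
    "decision": "DECLINED",
    "adverse_factors_ranked": [
        {"factor_code": "AF-018", "label": "Insufficient time in current employment"},
        {"factor_code": "AF-042", "label": "High ratio of debt to income"},
    ],
}
notice = render_adverse_action_notice(trace)
```

Because the function reads only the trace, the same notice can be regenerated years later without touching the current model.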
Fairness monitoring design
Fair lending monitoring must run at the segment level, not just on aggregate metrics.
Minimum monitoring requirements:
- Approval rate disparity by protected class segments (sex, race/ethnicity proxy) — investigate any segment with an approval rate more than 20 percentage points below the most-favored group
- Average score disparity — detect if protected characteristic groups receive systematically lower scores on equivalent credit profiles
- Reason code disparity — detect if protected groups receive systematically different reason codes that are not explained by credit risk differences
- Counter-offer rate disparity — detect if protected groups are more frequently steered toward higher-rate offers
Monitoring frequency: monthly at minimum, with quarterly deep-dive analysis.
Trigger for investigation: any segment showing a disparity of 20 or more percentage points in approval rate or pricing versus the reference group, with no credit-risk explanation.
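The approval-rate check above can be sketched as a simple computation over segment counts. Segment names and counts here are made up; the 20-point trigger matches the threshold stated in the text.

```python
# Sketch of the approval-rate disparity trigger. Input maps segment
# name -> (approved_count, total_applications); values are illustrative.
def approval_rate_gaps(segments: dict) -> dict:
    """Gap in percentage points between each segment and the most-favored segment."""
    rates = {name: approved / total for name, (approved, total) in segments.items()}
    best = max(rates.values())
    return {name: round((best - rate) * 100, 1) for name, rate in rates.items()}

def flagged_segments(segments: dict, threshold_points: float = 20.0) -> list:
    return [name for name, gap in approval_rate_gaps(segments).items()
            if gap >= threshold_points]

segments = {"segment_a": (700, 1000), "segment_b": (450, 1000)}
# segment_b gap: 70% - 45% = 25 points, above the 20-point trigger
flagged = flagged_segments(segments)
```

A flagged segment triggers investigation, not an automatic conclusion of discrimination: the analysis must then control for legitimate credit-risk differences.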
Replayability for regulatory retention
The full decision record — including model version, inputs, rules that fired, and reason codes — must be retained for the regulatory minimum (typically 25 months for ECOA; longer for enterprise risk purposes).
Key design requirement: the model version must be tagged in every decision record and the model artifact must be preserved — not just the weights, but the full scoring pipeline including feature engineering. If the pipeline cannot be restored to the state it was in at decision time, you cannot replay the decision.
This is why a version-controlled model registry is not optional in regulated lending — it is a regulatory compliance requirement.
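A replay, in this design, loads the pipeline artifact pinned to the version recorded in the trace, never the current model. The in-memory registry below is a placeholder for whatever registry the team actually uses (e.g., MLflow); the threshold value is an assumed example.

```python
# Hedged sketch: replay a decision from pinned artifacts. The registry
# dict stands in for a real model registry; replay must fail loudly if
# the recorded version's artifact is no longer restorable.
REGISTRY = {
    "risk-credit-v3.1.0": {"approval_threshold": 0.55},  # assumed example artifact
}

def replay_decision(record: dict) -> dict:
    pipeline = REGISTRY[record["model_version"]]  # KeyError = artifact lost, replay impossible
    decision = ("APPROVED" if record["model_score"] >= pipeline["approval_threshold"]
                else "DECLINED")
    return {"decision": decision,
            "matches_original": decision == record["decision"]}

result = replay_decision({"model_version": "risk-credit-v3.1.0",
                          "model_score": 0.31,
                          "decision": "DECLINED"})
```

A replay that does not reproduce the original outcome is itself a finding worth investigating, since it indicates the trace or the preserved artifact is incomplete.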
Related reading
- Regulated AI Implementation: The full pillar covering lending, fraud, healthcare, and claims governance.
- AI Audit Trails: Building the replayable decision trace that adverse action compliance requires.
- Composite Accountability: Designing the component contracts for a lending credit decision system.
- The Cost of Proof: The ROI case for building defensible AI systems in regulated industries.
Download the Architecture of Proof Checklist
Ready to implement? Get the definitive checklist for building verifiable AI systems.