Fraud Detection AI Governance: A Case Study in High-Stakes Autonomy
This post is part of the Regulated AI Implementation pillar.
Fraud detection is one of the clearest governance challenges in production AI. The volume is high, the stakes are real on both sides of the decision, and the consequences of error are asymmetric in a way that makes standard accuracy metrics insufficient.
Miss a fraudulent transaction: financial loss. Block a legitimate transaction: a damaged customer relationship, a potential dispute, and in some contexts, a regulatory event.
This case study walks through how to apply the Architecture of Proof framework to a production fraud detection system — from control tier assignment through circuit breaker design, decision trace architecture, and dispute resolution.
The governance challenge this system must solve
A production fraud detection system must answer five questions correctly at every decision:
- Is this transaction authorized or fraudulent?
- If uncertain, what is the right response — block, challenge, or approve?
- If we block, can we explain why to the customer and to a regulator?
- If our blocking rate on legitimate transactions rises, do we detect it before it escalates?
- If a customer disputes a block, can we reconstruct the decision without re-running the model?
Standard AI deployment answers question one. The Architecture of Proof framework answers all five.
Composite intelligence design
A fraud detection system is a composite system — rules, models, and humans each play a defined role.
Rules layer:
- Velocity rules: block transactions exceeding defined rate limits (e.g., 5 transactions in 60 seconds on a new card)
- List-based rules: block transactions from IP addresses on active sanctions or known fraud lists
- Structural rules: reject any transaction with missing or structurally invalid fields before scoring
- Policy rules: always require step-up authentication for transactions above a defined dollar threshold
Model layer:
- Real-time fraud scoring model: produces a fraud probability score (0–1) for each transaction
- Device fingerprinting model: produces a device risk score
- Behavioral model: detects anomalies in session behavior vs. the account's historical pattern
Human layer:
- Fraud analyst team: reviews high-value escalated blocks, investigates organized fraud patterns, manages dispute resolution
- Model risk function: performs monthly model review against performance contracts
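
As an illustration of the rules layer, the velocity rule can be sketched as a sliding-window counter per card. This is a minimal sketch, not the production rule: the class name `VelocityRule`, the `fires` method, and the choice to key on a card ID are hypothetical, timestamps are assumed to arrive in non-decreasing order, "exceeding" is read as "the sixth transaction inside the window fires", and the "new card" condition from the example is left out.

```python
from collections import defaultdict, deque


class VelocityRule:
    """Sliding-window rate limit per card (hypothetical helper)."""

    def __init__(self, max_txns: int = 5, window_s: float = 60.0):
        self.max_txns = max_txns
        self.window_s = window_s
        self._seen = defaultdict(deque)  # card_id -> recent timestamps

    def fires(self, card_id: str, ts: float) -> bool:
        """Record a transaction at time `ts` (seconds) and report a violation."""
        recent = self._seen[card_id]
        # Drop timestamps that have aged out of the window.
        while recent and ts - recent[0] >= self.window_s:
            recent.popleft()
        recent.append(ts)
        # Assumption: the rule fires once the limit is exceeded, not when it is met.
        return len(recent) > self.max_txns
```

Because the rule is deterministic, it can be tested exhaustively against its contract, which is harder to do for the model layer.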
Control tier assignment
Not all fraud responses operate at the same autonomy level.
| Decision Type | Control Tier | Rationale |
|---|---|---|
| Block — high-confidence fraud (score > 0.92 + velocity rule) | Tier 2 | Clear signal, reversible via dispute, circuit-breaker protected |
| Challenge — medium-confidence (score 0.65–0.92) | Tier 1 | Step-up auth; human completes the authorization |
| Approve with monitoring (score < 0.65, no rule violations) | Tier 1 | Logged for behavioral monitoring; no human required |
| Escalate — high-value + medium-confidence (> $10K + score 0.5–0.8) | Tier 0/3 | Analyst review required before any action |
The tier boundaries are set at deployment, documented, and changed only with evidence and sign-off — not adjusted ad hoc in response to individual cases.
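
The table can be read as a small routing function. The sketch below is one possible encoding under stated assumptions: the names `route` and `Routing` are hypothetical, escalation is checked first so that it wins where its score band overlaps the others, and cases the table does not cover (a score above 0.92 without a velocity rule, or a velocity rule with a low score) fall back to a challenge.

```python
from dataclasses import dataclass

BLOCK_SCORE = 0.92            # above this, plus a velocity rule: hard block
CHALLENGE_SCORE = 0.65        # from here up to BLOCK_SCORE: step-up auth
ESCALATE_AMOUNT = 10_000      # dollars
ESCALATE_SCORES = (0.5, 0.8)  # medium-confidence band for high-value escalation


@dataclass(frozen=True)
class Routing:
    action: str  # "escalate", "block", "challenge", or "approve"
    tier: str    # tier label as written in the table


def route(score: float, amount: float, velocity_rule_fired: bool) -> Routing:
    low, high = ESCALATE_SCORES
    # Assumption: escalation takes precedence over the overlapping score bands.
    if amount > ESCALATE_AMOUNT and low <= score <= high:
        return Routing("escalate", "0/3")
    if score > BLOCK_SCORE and velocity_rule_fired:
        return Routing("block", "2")
    # Assumption: cases the table leaves open are challenged, not blocked.
    if score >= CHALLENGE_SCORE or velocity_rule_fired:
        return Routing("challenge", "1")
    return Routing("approve", "1")
```

Keeping the thresholds as named constants in one place makes a boundary change visible in version control, which supports the sign-off requirement above.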
Component contracts
Rules contracts:
"The velocity rule must fire for 100% of transactions that meet the defined rate limit criteria. Zero exceptions. Bypass via any channel (web, API, partner integration) constitutes a rule violation."
"Sanctions list matching must run before model scoring on every transaction. Any failure to access the list within 200ms must trigger a fallback to Tier 3 (human review) for the affected transactions."
Model contracts:
"The fraud model must maintain a false positive rate below 1.2% on non-fraudulent transactions in any 7-day rolling window. If FPR exceeds 1.8%, hard-block behavior must automatically downgrade to challenge-only pending review."
"The model must not be used for transaction types outside the defined training population (currently: card-present and card-not-present retail; excluded: ACH, wire, crypto). Routing violations must be logged."
Human contracts:
"Escalated blocks must be reviewed within 2 hours. The analyst must select a disposition code (confirm-block, reverse-block, escalate-further). Cases with no disposition within 4 hours are automatically reversed and logged as an SLA violation."
Circuit breaker design
Circuit breakers are the mechanism that prevents a model performance issue from becoming a customer impact event before it is detected.
Primary circuit breaker:
- Trigger: 7-day rolling false positive rate exceeds 1.8%
- Action: auto-downgrade all Tier 2 hard blocks to Tier 1 challenges
- Notification: immediate alert to fraud operations lead and model risk
- Resolution: model risk review completes within 48 hours, before Tier 2 is restored
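
A minimal sketch of the primary breaker's trigger and action follows. It assumes the false positive rate is computed as blocked non-fraud transactions divided by all non-fraud transactions in the window, and that each outcome already carries a fraud label. The names `Outcome`, `rolling_fpr`, and `effective_action` are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable, NamedTuple, Optional

FPR_TRIP = 0.018           # 1.8% trigger from the primary breaker
WINDOW = timedelta(days=7)


class Outcome(NamedTuple):
    decided_at: datetime
    blocked: bool
    confirmed_fraud: bool


def rolling_fpr(outcomes: Iterable[Outcome], now: datetime) -> Optional[float]:
    """Share of non-fraud transactions in the window that were blocked."""
    cutoff = now - WINDOW
    legit = [o for o in outcomes
             if not o.confirmed_fraud and cutoff < o.decided_at <= now]
    if not legit:
        return None  # no labeled data in the window; the caller decides
    return sum(o.blocked for o in legit) / len(legit)


def effective_action(proposed: str, fpr: Optional[float]) -> str:
    """Downgrade hard blocks to challenges while the breaker is tripped."""
    tripped = fpr is not None and fpr > FPR_TRIP
    return "challenge" if tripped and proposed == "block" else proposed
```

This sketch is stateless. A production breaker would also latch, staying tripped until the model risk review restores Tier 2, and would send the notification described above.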
Secondary circuit breaker:
- Trigger: transaction block volume increases more than 40% week-over-week while the fraud confirmation rate stays flat
- Action: block score threshold raised by 0.05 so that fewer transactions are hard-blocked; alert to model risk
- Resolution: investigation within 24 hours
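
The secondary breaker compares two weeks of block statistics. The sketch below rests on two labeled assumptions: "fraud confirmation rate" is taken to mean the share of blocks later confirmed as fraud, and "flat" is taken to mean a change of at most one percentage point. `WeekStats` and `secondary_breaker` are hypothetical names, and the function is assumed to run once per week so the threshold is not raised repeatedly for the same signal.

```python
from dataclasses import dataclass
from typing import Tuple

VOLUME_GROWTH_TRIP = 0.40  # more than 40% week-over-week
THRESHOLD_STEP = 0.05      # how far the block threshold is raised
FLAT_TOLERANCE = 0.01      # assumption: "flat" = within one percentage point


@dataclass(frozen=True)
class WeekStats:
    blocks: int
    confirmed_fraud_blocks: int

    @property
    def confirmation_rate(self) -> float:
        return self.confirmed_fraud_blocks / self.blocks if self.blocks else 0.0


def secondary_breaker(prev: WeekStats, curr: WeekStats,
                      threshold: float) -> Tuple[float, bool]:
    """Return (new_threshold, tripped) for the current week."""
    if prev.blocks == 0:
        return threshold, False  # no baseline to compare against
    growth = (curr.blocks - prev.blocks) / prev.blocks
    flat = abs(curr.confirmation_rate - prev.confirmation_rate) <= FLAT_TOLERANCE
    if growth > VOLUME_GROWTH_TRIP and flat:
        # More blocks without more confirmed fraud: require a higher score.
        return round(min(1.0, threshold + THRESHOLD_STEP), 4), True
    return threshold, False
```

With a starting threshold of 0.92, a trip moves the hard-block threshold to 0.97.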
Tertiary circuit breaker:
- Trigger: any sanctions list access failure exceeding 500ms
- Action: transactions that cannot be checked auto-route to Tier 3 (human review)
- Resolution: engineering incident response
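
The tertiary breaker is a per-request timeout with a fail-safe route. A minimal sketch using the standard library's `concurrent.futures` is shown below. The function name `check_sanctions`, the `lookup` callable, and the return labels are hypothetical. The timeout is a parameter because the rules contract above states a 200ms budget while this breaker states 500ms; the sketch defaults to the breaker's value.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable

SANCTIONS_TIMEOUT_S = 0.5  # 500ms trigger from the tertiary breaker

_pool = ThreadPoolExecutor(max_workers=4)


def check_sanctions(ip: str, lookup: Callable[[str], bool],
                    timeout_s: float = SANCTIONS_TIMEOUT_S) -> str:
    """Return "block", "clear", or "human_review" (the Tier 3 fallback)."""
    future = _pool.submit(lookup, ip)
    try:
        listed = future.result(timeout=timeout_s)
    except FutureTimeout:
        # The list did not answer in time: never approve unchecked.
        return "human_review"
    except Exception:
        # Any lookup error is treated the same way as a timeout.
        return "human_review"
    return "block" if listed else "clear"
```

A timed-out lookup keeps running in its worker thread; only the caller stops waiting. The important property is that an unreachable list can never result in a silent approval.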
Decision trace design
Every blocking decision must produce a trace that answers a customer dispute or regulatory inquiry within hours — not days.
Minimum required trace fields:
    {
      "decision_id": "txn-20240315-4892A",
      "timestamp": "2024-03-15T09:42:17Z",
      "transaction_amount": 847.50,
      "merchant_category": "electronics",
      "control_tier_at_decision": 2,
      "rules_fired": [
        {"rule_id": "VEL-003", "result": "pass"},
        {"rule_id": "SANC-001", "result": "pass"}
      ],
      "model_scores": {
        "fraud_model_id": "fraud-v2.4.1",
        "fraud_score": 0.94,
        "device_risk_score": 0.81
      },
      "decision": "BLOCKED",
      "block_reason_category": "high_fraud_score",
      "payload_sha256": "a3f8b2...",
      "payload_ref_id": "s3://fraud-traces/2024/03/4892A.json.gz"
    }
This trace is sufficient to answer: which rules ran, which model version scored, what the score was, and what control tier authorized the block — for any transaction within the retention window.
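
One way to produce and check the `payload_sha256` field is sketched below. It assumes the hash covers the full decision payload stored at `payload_ref_id`, which the trace format suggests but does not state, and that the payload is serialized as canonical JSON before hashing. The helper names are hypothetical.

```python
import hashlib
import json


def canonical_bytes(payload: dict) -> bytes:
    """Serialize with sorted keys so equal payloads always hash the same."""
    text = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return text.encode("utf-8")


def payload_digest(payload: dict) -> str:
    return hashlib.sha256(canonical_bytes(payload)).hexdigest()


def build_trace(decision_id: str, decision: str, tier: int,
                payload: dict, payload_ref_id: str) -> dict:
    """Small trace record that points at, and fingerprints, the payload."""
    return {
        "decision_id": decision_id,
        "control_tier_at_decision": tier,
        "decision": decision,
        "payload_sha256": payload_digest(payload),
        "payload_ref_id": payload_ref_id,
    }


def payload_matches(trace: dict, payload: dict) -> bool:
    """True if the stored payload is the one the decision was made on."""
    return trace["payload_sha256"] == payload_digest(payload)
```

During a dispute, a reviewer who fetches the stored payload can call `payload_matches` to confirm it has not changed since the decision was logged.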
Dispute resolution process
A customer dispute on a fraud block has a defined SLA and a defined evidence process:
- Customer initiates dispute via support channel
- Support agent queries decision trace using transaction ID (response within 30 seconds)
- Agent reviews: was the block rule-triggered or model-triggered? What was the score?
- If rule-triggered: agent confirms rule basis with customer and documents outcome
- If model-triggered: agent escalates to fraud analyst with pre-populated trace context
- Analyst reviews full trace, makes disposition decision within 4 hours
- Outcome logged, used in monthly false positive analysis
No step in this process requires re-running the model on current data. The trace contains all the state needed for the review.
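
The triage step in this process can be sketched as a pure function over the stored trace. The sketch assumes that any `rules_fired` result other than "pass" means the rule caused the block; the trace format above does not define the other values, so treat that as a placeholder. `route_dispute` and the route labels are hypothetical names.

```python
from datetime import datetime, timedelta

ANALYST_SLA = timedelta(hours=4)  # disposition deadline from the process above


def route_dispute(trace: dict, opened_at: datetime) -> dict:
    """Decide who handles a dispute, using only the stored trace."""
    if trace.get("decision") != "BLOCKED":
        return {"route": "no_block_on_record"}
    # Assumption: a result other than "pass" means the rule triggered the block.
    triggered = [r["rule_id"] for r in trace.get("rules_fired", [])
                 if r["result"] != "pass"]
    if triggered:
        return {"route": "agent_confirms_rule_basis", "rules": triggered}
    scores = trace.get("model_scores", {})
    return {
        "route": "escalate_to_analyst",
        "analyst_due_by": opened_at + ANALYST_SLA,
        # Pre-populated context so the analyst does not have to re-query.
        "context": {
            "decision_id": trace.get("decision_id"),
            "fraud_model_id": scores.get("fraud_model_id"),
            "fraud_score": scores.get("fraud_score"),
            "block_reason_category": trace.get("block_reason_category"),
        },
    }
```

The function never calls the model, which mirrors the requirement that a review depends only on state captured at decision time.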
What good looks like in production
A well-governed fraud detection system has:
- FPR < 1.5% with circuit breakers that activate before it reaches 2%
- Every blocking decision traceable in under 60 seconds from the decision ID
- Tier boundaries documented, signed off, and version-controlled
- Monthly model review with published findings
- Dispute resolution SLA met at > 95% without requiring engineering involvement
Related reading
- Regulated AI Implementation: The full pillar covering governance requirements across all regulated domains.
- Control Tiers for AI-enabled Processes: Designing the tier boundaries that this case study applies.
- AI Audit Trails: Building the decision trace architecture referenced in this case study.
- AI Escalation Protocols: The escalation and circuit breaker mechanics in detail.