Fraud Detection AI Governance: A Case Study in High-Stakes Autonomy

This post is part of the Regulated AI Implementation pillar.

Fraud detection is one of the clearest governance challenges in production AI. The volume is high, the stakes are real on both sides of the decision, and the consequences of error are asymmetric in a way that makes standard accuracy metrics insufficient.

Miss a fraudulent transaction: financial loss. Block a legitimate transaction: a damaged customer relationship, a potential dispute, and in some contexts, a regulatory event.

This case study walks through how to apply the Architecture of Proof framework to a production fraud detection system — from control tier assignment through circuit breaker design, decision trace architecture, and dispute resolution.

The governance challenge this system must solve

A production fraud detection system must answer five questions correctly at every decision:

  1. Is this transaction authorized or fraudulent?
  2. If uncertain, what is the right response — block, challenge, or approve?
  3. If we block, can we explain why to the customer and to a regulator?
  4. If our blocking rate on legitimate transactions rises, do we detect it before it escalates?
  5. If a customer disputes a block, can we reconstruct the decision without re-running the model?

Standard AI deployment answers question one. The Architecture of Proof framework answers all five.

Composite intelligence design

A fraud detection system is a composite system — rules, models, and humans each play a defined role.

Rules layer:

  - Velocity rules: block transactions exceeding defined rate limits (e.g., 5 transactions in 60 seconds on a new card)
  - List-based rules: block transactions from IP addresses on active sanctions or known fraud lists
  - Structural rules: reject any transaction with missing or structurally invalid fields before scoring
  - Policy rules: always require step-up authentication for transactions above a defined dollar threshold

Model layer:

  - Real-time fraud scoring model: produces a fraud probability score (0–1) for each transaction
  - Device fingerprinting model: produces a device risk score
  - Behavioral model: detects anomalies in session behavior vs. the account's historical pattern

Human layer:

  - Fraud analyst team: reviews high-value escalated blocks, investigates organized fraud patterns, manages dispute resolution
  - Model risk function: performs monthly model review against performance contracts
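As an illustration of the rules layer, the velocity rule from the example above (5 transactions in 60 seconds) could be sketched as a per-card sliding window. This is a minimal sketch, not the production rule: the class name `VelocityRule`, the `check` method, and the choice to block on the sixth transaction inside the window are all assumptions made for the example.

```python
from collections import defaultdict, deque


class VelocityRule:
    """Per-card sliding-window rate limit (hypothetical helper)."""

    def __init__(self, max_txns=5, window_seconds=60):
        self.max_txns = max_txns
        self.window_seconds = window_seconds
        self._seen = defaultdict(deque)  # card_id -> recent timestamps (seconds)

    def check(self, card_id, ts):
        """Record a transaction at time ts and return 'block' or 'pass'."""
        window = self._seen[card_id]
        # Drop timestamps that have aged out of the window.
        while window and ts - window[0] >= self.window_seconds:
            window.popleft()
        window.append(ts)
        # Assumption: "exceeding" the limit means more than max_txns in the window.
        return "block" if len(window) > self.max_txns else "pass"


rule = VelocityRule()
results = [rule.check("card-1", t) for t in range(6)]
```

Because the rule is deterministic and keeps no model state, its contract ("fires for 100% of matching transactions") can be verified with ordinary unit tests.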

Control tier assignment

Not all fraud responses operate at the same autonomy level.

| Decision type | Control tier | Rationale |
|---|---|---|
| Block: high-confidence fraud (score > 0.92 + velocity rule) | Tier 2 | Clear signal, reversible via dispute, circuit-breaker protected |
| Challenge: medium-confidence (score 0.65–0.92) | Tier 1 | Step-up auth; human completes the authorization |
| Approve with monitoring (score < 0.65, no rule violations) | Tier 1 | Logged for behavioral monitoring; no human required |
| Escalate: high-value + medium-confidence (> $10K + score 0.5–0.8) | Tier 0/3 | Analyst review required before any action |

The tier boundaries are set at deployment, documented, and changed only with evidence and sign-off — not adjusted ad hoc in response to individual cases.
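The tier table can be read as a routing function. The sketch below is one possible reading, under assumptions the table does not state: the high-value escalation band is checked first because it overlaps the other score ranges, and any combination the table does not cover (for example a score above 0.92 without a velocity hit) falls back to a challenge. The function name `route_decision` and the returned labels are hypothetical.

```python
def route_decision(score, amount, velocity_fired):
    """Map a scored transaction to (action, control_tier).

    Thresholds come from the tier table; the precedence order and the
    challenge fallback are assumptions made for this sketch.
    """
    # Assumption: escalation wins when the bands overlap.
    if amount > 10_000 and 0.5 <= score <= 0.8:
        return ("escalate", "0/3")
    if score > 0.92 and velocity_fired:
        return ("block", "2")
    if score < 0.65 and not velocity_fired:
        return ("approve_with_monitoring", "1")
    # Assumption: everything else gets step-up authentication.
    return ("challenge", "1")
```

Keeping the boundaries in one small, version-controlled function is one way to make changes to them visible in review rather than scattered across services.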

Component contracts

Rules contracts:

"The velocity rule must fire for 100% of transactions that meet the defined rate limit criteria. Zero exceptions. Bypass via any channel (web, API, partner integration) constitutes a rule violation."

"Sanctions list matching must run before model scoring on every transaction. Any failure to access the list within 200ms must trigger a fallback to Tier 3 (human review) for the affected transactions."

Model contracts:

"The fraud model must maintain a false positive rate below 1.2% on non-fraudulent transactions in any 7-day rolling window. If FPR exceeds 1.8%, hard-block behavior must automatically downgrade to challenge-only pending review."

"The model must not be used for transaction types outside the defined training population (currently: card-present and card-not-present retail; excluded: ACH, wire, crypto). Routing violations must be logged."

Human contracts:

"Escalated blocks must be reviewed within 2 hours. The analyst must select a disposition code (confirm-block, reverse-block, escalate-further). Cases with no disposition within 4 hours are automatically reversed and logged as an SLA violation."

Circuit breaker design

Circuit breakers are the mechanism that prevents a model performance issue from becoming a customer impact event before it is detected.

Primary circuit breaker:

  - Trigger: 7-day rolling false positive rate exceeds 1.8%
  - Action: auto-downgrade all Tier 2 hard blocks to Tier 1 challenges
  - Notification: immediate alert to fraud operations lead and model risk
  - Resolution: model risk review completes within 48 hours before Tier 2 is restored
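A minimal sketch of the primary breaker follows. It assumes confirmed fraud labels are available for past decisions (for example from disputes and chargebacks), and that the breaker latches: once tripped, it stays tripped until a review explicitly restores Tier 2, even if the measured rate falls back. The names `rolling_fpr` and `PrimaryBreaker` are hypothetical.

```python
from datetime import datetime, timedelta


def rolling_fpr(outcomes, now, window=timedelta(days=7)):
    """False positive rate on non-fraudulent transactions in the window.

    outcomes: iterable of (timestamp, was_blocked, was_fraud) tuples.
    """
    cutoff = now - window
    legit = [blocked for ts, blocked, fraud in outcomes
             if ts > cutoff and not fraud]
    if not legit:
        return 0.0
    return sum(legit) / len(legit)  # blocked legitimate / all legitimate


class PrimaryBreaker:
    """Latching breaker: trips above trip_fpr, restored only after review."""

    def __init__(self, trip_fpr=0.018):
        self.trip_fpr = trip_fpr
        self.tripped = False

    def observe(self, fpr_7d):
        """Return the list of parties to alert if this observation trips the breaker."""
        if fpr_7d > self.trip_fpr and not self.tripped:
            self.tripped = True
            return ["fraud_ops_lead", "model_risk"]
        return []

    def restore(self, review_signed_off):
        """Re-enable Tier 2 hard blocks only after a signed-off review."""
        if review_signed_off:
            self.tripped = False

    def apply(self, action):
        """Downgrade hard blocks to challenges while the breaker is tripped."""
        return "challenge" if self.tripped and action == "block" else action
```

The latch is the important design choice here: a rate that dips below 1.8% on its own does not count as evidence that the underlying problem has been fixed.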

Secondary circuit breaker:

  - Trigger: transaction block volume increases more than 40% week-over-week with flat fraud confirmation rate
  - Action: block threshold raised by 0.05, so fewer transactions are hard-blocked; alert to model risk
  - Resolution: investigation within 24 hours
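The secondary trigger can be sketched as a week-over-week comparison. The text does not define "flat", so this example assumes it means the weekly count of confirmed fraud cases changed by no more than a tolerance (5% here); both that interpretation and the name `secondary_breaker` are assumptions.

```python
def secondary_breaker(blocks_prev, blocks_curr, confirmed_prev, confirmed_curr,
                      threshold, flat_tolerance=0.05):
    """Return (new_block_threshold, tripped) for one weekly comparison."""
    if blocks_prev <= 0 or confirmed_prev <= 0:
        return threshold, False  # not enough history to compare
    block_growth = (blocks_curr - blocks_prev) / blocks_prev
    confirmed_change = abs(confirmed_curr - confirmed_prev) / confirmed_prev
    # Blocks rose sharply while confirmed fraud stayed roughly level:
    # the extra blocks are likely false positives.
    if block_growth > 0.40 and confirmed_change <= flat_tolerance:
        return min(1.0, round(threshold + 0.05, 4)), True
    return threshold, False
```

For example, 1,000 blocks growing to 1,500 while confirmed fraud moves from 300 to 305 would raise a 0.92 block threshold to 0.97 under these assumptions.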

Tertiary circuit breaker:

  - Trigger: any sanctions list lookup that fails or takes longer than 500ms
  - Action: transactions that cannot be checked auto-route to Tier 3 (human review)
  - Resolution: engineering incident response
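One way to enforce the tertiary breaker is to run the sanctions lookup under a deadline and treat a timeout or error as "cannot be checked". The sketch below uses Python's standard `concurrent.futures`; the `lookup` callable, the returned labels, and the function name `sanctions_check` are hypothetical. The deadline is a parameter because the rules contract earlier in this post states a 200ms budget while this breaker uses 500ms.

```python
import time
from concurrent.futures import ThreadPoolExecutor

_POOL = ThreadPoolExecutor(max_workers=4)


def sanctions_check(lookup, ip_address, timeout_s=0.5):
    """Run lookup(ip_address) under a deadline.

    Returns 'block' if listed, 'continue_to_scoring' if clear, and
    'tier_3_human_review' if the list could not be checked in time.
    """
    future = _POOL.submit(lookup, ip_address)
    try:
        listed = future.result(timeout=timeout_s)
    except Exception:
        # Timeout or lookup error: the transaction cannot be checked.
        # The worker thread may still be running; its result is ignored.
        return "tier_3_human_review"
    return "block" if listed else "continue_to_scoring"


def slow_lookup(ip_address):
    """Simulates an unresponsive list service (for demonstration only)."""
    time.sleep(0.3)
    return False
```

The point of the sketch is that an unavailable list never results in a silent pass: the only outcomes are a completed check or a human review.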

Decision trace design

Every blocking decision must produce a trace that answers a customer dispute or regulatory inquiry within hours — not days.

Minimum required trace fields:

{
  "decision_id": "txn-20240315-4892A",
  "timestamp": "2024-03-15T09:42:17Z",
  "transaction_amount": 847.50,
  "merchant_category": "electronics",
  "control_tier_at_decision": 2,
  "rules_fired": [
    {"rule_id": "VEL-003", "result": "pass"},
    {"rule_id": "SANC-001", "result": "pass"}
  ],
  "model_scores": {
    "fraud_model_id": "fraud-v2.4.1",
    "fraud_score": 0.94,
    "device_risk_score": 0.81
  },
  "decision": "BLOCKED",
  "block_reason_category": "high_fraud_score",
  "payload_sha256": "a3f8b2...",
  "payload_ref_id": "s3://fraud-traces/2024/03/4892A.json.gz"
}

This trace is sufficient to answer: which rules ran, which model version scored, what the score was, and what control tier authorized the block — for any transaction within the retention window.
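The `payload_sha256` field suggests the summary trace is bound to a larger stored payload. The text does not say what the hash covers, so the sketch below assumes it is a SHA-256 digest of the full payload serialized as canonical JSON (sorted keys, compact separators). The helper names `seal_trace` and `verify_trace` are hypothetical.

```python
import hashlib
import json


def canonical_bytes(payload):
    """Serialize deterministically so the same payload always hashes the same."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def seal_trace(summary, payload):
    """Return a copy of the summary trace with the payload digest attached."""
    sealed = dict(summary)
    sealed["payload_sha256"] = hashlib.sha256(canonical_bytes(payload)).hexdigest()
    return sealed


def verify_trace(trace, payload):
    """Check that a retrieved payload is the one the trace was sealed against."""
    digest = hashlib.sha256(canonical_bytes(payload)).hexdigest()
    return trace["payload_sha256"] == digest
```

With a check like this, a reviewer who retrieves the payload months later can confirm it has not been altered since the decision was made.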

Dispute resolution process

A customer dispute on a fraud block has a defined SLA and a defined evidence process:

  1. Customer initiates dispute via support channel
  2. Support agent queries decision trace using transaction ID (response within 30 seconds)
  3. Agent reviews: was the block rule-triggered or model-triggered? What was the score?
  4. If rule-triggered: agent confirms rule basis with customer and documents outcome
  5. If model-triggered: agent escalates to fraud analyst with pre-populated trace context
  6. Analyst reviews full trace, makes disposition decision within 4 hours
  7. Outcome logged, used in monthly false positive analysis
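The time limits in this process, together with the human contract's 2-hour review and 4-hour auto-reversal rule, can be checked mechanically. This is a minimal sketch under the assumption that each escalated case records when it was escalated and, once reviewed, a disposition code; the function name `sla_status` and the status labels other than the three disposition codes are hypothetical.

```python
from datetime import datetime, timedelta

VALID_DISPOSITIONS = {"confirm-block", "reverse-block", "escalate-further"}


def sla_status(escalated_at, disposition, now,
               review_sla=timedelta(hours=2),
               reversal_deadline=timedelta(hours=4)):
    """Return the case status implied by the human contract."""
    if disposition is not None:
        if disposition not in VALID_DISPOSITIONS:
            raise ValueError(f"unknown disposition code: {disposition}")
        return disposition
    age = now - escalated_at
    if age > reversal_deadline:
        # No disposition within 4 hours: reverse the block and log the violation.
        return "auto-reversed-sla-violation"
    if age > review_sla:
        return "review-overdue"
    return "pending"
```

Running a check like this on a schedule means an unreviewed block expires on its own rather than waiting for someone to notice it.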

No step in this process requires re-running the model on current data. The trace contains all the state needed for the review.
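To illustrate, the rule-versus-model triage in steps 3 to 5 can be done from the stored trace alone. The sketch assumes that a rule result other than "pass" means the rule triggered the block; the trace example only shows "pass" values, so that convention, the function name `triage_dispute`, and the returned labels are assumptions.

```python
def triage_dispute(trace):
    """Decide the dispute path using only fields stored in the decision trace."""
    fired = [r["rule_id"] for r in trace["rules_fired"] if r["result"] != "pass"]
    if fired:
        # Rule-triggered: the support agent can confirm the basis directly.
        return {"path": "agent_confirms_rule_basis", "rules": fired}
    # Model-triggered: hand the analyst a pre-populated context.
    return {
        "path": "escalate_to_fraud_analyst",
        "context": {
            "decision_id": trace["decision_id"],
            "fraud_model_id": trace["model_scores"]["fraud_model_id"],
            "fraud_score": trace["model_scores"]["fraud_score"],
            "control_tier_at_decision": trace["control_tier_at_decision"],
        },
    }


sample_trace = {
    "decision_id": "txn-20240315-4892A",
    "control_tier_at_decision": 2,
    "rules_fired": [
        {"rule_id": "VEL-003", "result": "pass"},
        {"rule_id": "SANC-001", "result": "pass"},
    ],
    "model_scores": {"fraud_model_id": "fraud-v2.4.1", "fraud_score": 0.94},
}
outcome = triage_dispute(sample_trace)
```

Nothing in the function calls the model, which is the property the process depends on: the model in production may have changed since the decision was made.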

What good looks like in production

A well-governed fraud detection system has:

  - FPR < 1.5% with circuit breakers that activate before it reaches 2%
  - Every blocking decision traceable in under 60 seconds from the decision ID
  - Tier boundaries documented, signed off, and version-controlled
  - Monthly model review with published findings
  - Dispute resolution SLA met at > 95% without requiring engineering involvement

