Workflow Automation · Finance

Alert Review

AML Alert Investigation & Narrative Generation Pipeline

Financial crime alerts require hours of manual analyst review → this pipeline automatically extracts, analyzes, routes to specialized deep-dive agents, and generates a QC-validated investigation narrative → analysts receive a structured close/escalate recommendation with supporting evidence.

01 · The Problem

Anti-money laundering (AML) alert review is one of the most labor-intensive processes in financial compliance. Analysts must manually read raw alert packages, normalize transaction data, identify behavioral patterns, cross-reference KYC profiles, conduct OSINT research, and write structured narratives — all before making a disposition decision. Without automation, this process is slow, inconsistent across analysts, and difficult to scale during high-volume periods. The quality of the final narrative — which must be defensible for regulatory documentation — depends heavily on individual analyst skill and available time.

02 · What the AI Does

The pipeline performs the following tasks, in sequence and in parallel:

- Extracts structured data (KYC profile, transaction records, alert metadata, narratives) from raw PDF alert documents using GPT-5 Mini with vision
- Normalizes extracted data into ISO-formatted, analysis-ready JSON (dates, amounts, counterparties, summary statistics)
- Classifies behavioral patterns across 11 dimensions: transaction clusters, recurring income, large transactions, ordinary spend presence, conduit indicators, cross-border activity, and routing recommendations
- Routes to 1–4 specialized deep-dive analysis agents (from a library of 14) based on detected patterns
- Executes deep-dive analyses in parallel (up to 4 concurrent) covering: wire pairing, cash structuring, source of funds, rapid movement/layering, internal transfers, counterparty relationships, cross-border risk, business purpose, payment fraud, funnel/mule indicators, profile-behavior mismatch, one-time event validation, KYC gaps, and vulnerable customer exploitation
- Enriches with OSINT via Perplexity Sonar Pro (occupation validation, counterparty legitimacy, adverse media)
- Evaluates red flags and mitigants to produce a scored risk assessment with a binary CLOSE/ESCALATE recommendation
- Generates a disposition-specific narrative (close or escalate) using Claude Sonnet
- Quality-controls the narrative against source data, recalculating totals and verifying every claim
- Rewrites the final output into two deliverables, a concise summary brief and a full investigation narrative, using Claude Opus

Models used: GPT-5 Mini (extraction, normalization), Claude Sonnet 4.5 (pattern analysis, deep-dive agents, risk evaluation, narrative generation, QC), Claude Opus 4.5 (final rewrite), Perplexity Sonar Pro (OSINT), with GPT-5 Mini as a fallback if Claude pattern analysis fails.
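The normalization step can be pictured with a small sketch. The field names below are illustrative assumptions, not the pipeline's actual schema; the point is the shape of the analysis-ready JSON and the summary statistics recomputed from it.

```python
# Illustrative shape of the normalized output (field names are assumptions,
# not the pipeline's actual schema). Dates are ISO strings, amounts numeric.
normalized = {
    "alert_id": "A-1001",
    "kyc_profile": {"occupation": "consultant", "expected_monthly_volume": 20000.0},
    "transactions": [
        {"date": "2024-01-03", "amount": 9500.0, "counterparty": "Acme LLC", "direction": "credit"},
        {"date": "2024-01-04", "amount": 9400.0, "counterparty": "Acme LLC", "direction": "credit"},
        {"date": "2024-01-05", "amount": 18500.0, "counterparty": "Wire Out Co", "direction": "debit"},
    ],
}

def summary_stats(txns):
    """Recompute the summary statistics attached during normalization."""
    credits = [t["amount"] for t in txns if t["direction"] == "credit"]
    debits = [t["amount"] for t in txns if t["direction"] == "debit"]
    return {
        "count": len(txns),
        "total_credits": round(sum(credits), 2),
        "total_debits": round(sum(debits), 2),
    }
```

Because downstream QC recalculates these same totals, keeping them as plain derived values (rather than model-asserted figures) makes verification cheap.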

03 · Design Decisions

01 · Choice

Rather than running all analyses on every alert, a routing agent selects 1–4 analyses based on detected patterns, designating exactly one as "primary" (full analysis) and the rest as "secondary" (corroborative checks).

Why

Running all 14 agents on every alert would be expensive, slow, and produce noise. The routing layer mirrors how experienced analysts triage — they don't apply every framework to every case. [Creator: add rationale on whether this mirrors a specific internal methodology]

Constraint

The routing agent is explicitly prohibited from making risk or SAR decisions — it only routes. This enforces separation of concerns between pattern detection and risk judgment.
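The routing contract can be sketched in plain Python. The real routing is done by an LLM; the pattern keys and agent names below are assumptions used only to show the output shape: one to four selected analyses, exactly one marked primary.

```python
# Illustrative routing table (pattern keys and agent names are assumptions).
PATTERN_TO_AGENT = {
    "conduit_indicators": "rapid_movement_layering",
    "large_transactions": "source_of_funds",
    "cross_border_activity": "cross_border_risk",
    "recurring_income": "profile_behavior_mismatch",
}

def route(patterns: dict) -> list[dict]:
    """Select 1-4 deep-dive agents; the first is primary, the rest secondary.
    Note: no risk scoring happens here -- routing only."""
    hits = [agent for key, agent in PATTERN_TO_AGENT.items() if patterns.get(key)]
    if not hits:
        hits = ["one_time_event_validation"]  # default corroborative check
    selected = hits[:4]  # cap at four concurrent deep dives
    return [{"agent": a, "role": "primary" if i == 0 else "secondary"}
            for i, a in enumerate(selected)]

routed = route({"conduit_indicators": True, "cross_border_activity": True})
```

The "exactly one primary" invariant is what lets the narrative generator later weight the full analysis over the corroborative checks.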

02 · Choice

Every deep-dive agent prompt explicitly prohibits risk ratings, SAR language, intent conclusions, and subjective characterizations. Agents must use neutral language: "observed," "identified," "calculated."

Why

AML narratives must be legally defensible. Premature conclusions embedded in intermediate analysis steps could contaminate the final narrative or create regulatory exposure. [Creator: add rationale on whether this reflects a specific compliance standard or legal guidance]

Constraint

The neutral-language requirement is enforced uniformly across all deep-dive agent prompts.
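A prompt-level prohibition can also be checked mechanically after the fact. This sketch is not part of the pipeline as described; the banned-term list is illustrative, not an actual compliance rule set.

```python
import re

# Illustrative banned-language scan for deep-dive agent output.
# The term list is an assumption for demonstration only.
BANNED_PATTERNS = [
    r"\bsuspicious\b",
    r"\bintent\b",
    r"\bSAR\b",
    r"\bmoney laundering\b",
    r"\bhigh[- ]risk\b",
]

def neutrality_violations(text: str) -> list[str]:
    """Return the banned patterns found in an agent's output, if any."""
    return [p for p in BANNED_PATTERNS
            if re.search(p, text, flags=re.IGNORECASE)]
```

An empty result means the output stayed within the "observed / identified / calculated" register; a non-empty result could trigger a retry or a flag for review.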

03 · Choice

If Claude Sonnet fails on the pattern analysis step (caught via TryNode), the workflow falls back to GPT-5 Mini with an identical prompt and schema.

Why

Pattern analysis is the critical routing dependency — if it fails, the entire downstream pipeline stalls. The fallback ensures resilience without requiring human intervention.

Constraint

Both models use identical JSON schemas so downstream nodes are model-agnostic.
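The control flow of the fallback is simple to sketch in plain Python. TryNode is the workflow framework's construct; only the behavior it provides is reproduced here.

```python
# Plain-Python sketch of the TryNode fallback: try the primary model,
# and on any failure rerun the identical prompt and schema on the fallback.
def call_with_fallback(run_primary, run_fallback, prompt, schema):
    try:
        return run_primary(prompt, schema)   # Claude Sonnet
    except Exception:
        return run_fallback(prompt, schema)  # GPT-5 Mini, same prompt + schema

# Demo stand-ins (not real model calls):
def failing_primary(prompt, schema):
    raise RuntimeError("transient model error")

def fallback_model(prompt, schema):
    return {"patterns": [], "model": "fallback"}

result = call_with_fallback(failing_primary, fallback_model, "classify patterns", {})
```

Because both models are held to the same JSON schema, the caller never needs to know which one produced the result.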

04 · Choice

The final rewrite produces two distinct outputs: a short summary brief (red flags, narrative, conclusion — no markdown, strict format) and a full investigation narrative (markdown allowed, 8-section structure).

Why

Different consumers have different needs — investigators triaging a queue need the summary; regulatory documentation and case file retention require the full narrative. [Creator: add rationale on whether these map to specific internal workflow steps or handoff points]

Constraint

The summary brief has explicit formatting rules (no bullets, no headings, no extra lines) enforced in the system prompt to ensure it's machine-parseable by downstream systems.
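The machine-parseability requirement can be expressed as a small validator. The rules below are the ones stated in the text (no bullets, no headings, no extra blank lines); the function itself is an illustrative sketch, not the pipeline's actual check.

```python
def is_machine_parseable(brief: str) -> bool:
    """Check a summary brief against the stated formatting rules:
    no markdown headings, no bullet markers, no blank lines."""
    lines = brief.splitlines()
    return all(
        bool(line.strip())                                  # no extra blank lines
        and not line.lstrip().startswith(("#", "-", "*"))   # no headings or bullets
        for line in lines
    )
```

Enforcing the rules in the system prompt and validating them on output gives a cheap guard before the brief is handed to downstream systems.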

05 · Choice

A dedicated QC node recalculates totals from source data, verifies transaction counts, and checks every narrative claim before the final rewrite.

Why

LLM-generated narratives frequently introduce calculation errors or unsupported assertions. In a compliance context, an incorrect total or fabricated detail is not just a quality issue — it's a liability. [Creator: add rationale on whether this was driven by observed failure modes in testing]

Constraint

QC findings are passed as explicit inputs to the rewrite node, which is instructed to apply ALL corrections.
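The arithmetic half of QC can be sketched directly: recompute figures from source transactions and compare them with what the narrative claims. Claim extraction from free text is simplified away here; the claimed figures are passed in as parameters.

```python
# Illustrative QC check: recompute totals and counts from source data and
# compare against the figures the narrative asserts.
def qc_totals(transactions, claimed_total, claimed_count, tolerance=0.01):
    """Return a list of findings; an empty list means the claims check out."""
    actual_total = round(sum(t["amount"] for t in transactions), 2)
    findings = []
    if abs(actual_total - claimed_total) > tolerance:
        findings.append(
            f"total mismatch: narrative says {claimed_total}, source shows {actual_total}"
        )
    if len(transactions) != claimed_count:
        findings.append(
            f"count mismatch: narrative says {claimed_count}, source shows {len(transactions)}"
        )
    return findings
```

Findings shaped like this can be passed verbatim into the rewrite node's prompt, matching the "apply ALL corrections" instruction.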

06 · Choice

The ExtractAlertDocument node is mocked in the sandbox scenario, bypassing live PDF extraction during testing.

Why

PDF extraction is slow and expensive; mocking it allows rapid iteration on downstream logic without re-running the full pipeline. The mock contains realistic transaction data (135 transactions, real-world patterns) to ensure downstream nodes are tested against representative inputs.

Constraint

The mock is scoped to the extraction node only — all downstream analysis runs live.
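Node-scoped mocking can be pictured as a registry keyed by node name. The sandbox framework's actual API is assumed; this plain-Python stand-in only shows why the scoping keeps every downstream node live.

```python
# Illustrative node-scoped mock registry (the framework's real API is assumed).
SAMPLE_EXTRACTION = {"alert_id": "A-1001", "transaction_count": 135}

MOCKS = {"ExtractAlertDocument": lambda pdf_bytes: SAMPLE_EXTRACTION}

def run_node(name, live_fn, payload, mocks=MOCKS):
    """Use the mock when one is registered for this node name;
    every other node runs its live function unchanged."""
    return mocks[name](payload) if name in mocks else live_fn(payload)

extracted = run_node("ExtractAlertDocument", lambda b: None, b"%PDF-")
```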

07 · Choice

Every deep-dive InlinePromptNode is wrapped with @RetryNode.wrap(max_attempts=3, delay=2).

Why

Deep-dive agents use structured JSON output with strict schemas. Transient model errors or schema validation failures are more likely with complex output requirements. Retries absorb these without surfacing errors to the user.

Constraint

Delay of 2 seconds between retries prevents thundering-herd behavior when multiple agents run in parallel.
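The wrapper's behavior is equivalent to a standard retry decorator. This is a plain-Python sketch of what @RetryNode.wrap(max_attempts=3, delay=2) provides, not the framework's implementation.

```python
import functools
import time

def retry(max_attempts=3, delay=2):
    """Retry a function on any exception, sleeping `delay` seconds between
    attempts; re-raise after the final attempt fails."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts:
                        raise
                    time.sleep(delay)
        return wrapper
    return decorator
```

Schema-validation failures raised as exceptions are absorbed the same way as transient API errors, which is what makes this a good fit for strict structured-output prompts.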

08 · Choice

OSINT enrichment, routing/deep-dive dispatch, and pattern analysis run in parallel branches. The risk evaluation node uses MergeBehavior.AWAIT_ALL to wait for all branches.

Why

These analyses are independent and can run concurrently, reducing total wall-clock time. AWAIT_ALL ensures the risk evaluator has complete information before scoring.

Constraint

Max concurrency on the map node is set to 4, balancing speed against API rate limits.
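The fan-out semantics can be sketched with asyncio standing in for the framework's map node: at most four agents in flight, and AWAIT_ALL behavior (here, asyncio.gather) before risk evaluation runs.

```python
import asyncio

# Sketch of the map node: fan out deep-dive agents with a concurrency cap
# of 4, then wait for every branch (AWAIT_ALL) before scoring.
async def run_deep_dives(agents, limit=4):
    sem = asyncio.Semaphore(limit)

    async def bounded(agent):
        async with sem:
            return await agent()

    # gather returns results in input order once ALL branches complete
    return await asyncio.gather(*(bounded(a) for a in agents))

async def demo_agent(name):
    """Stand-in for a deep-dive agent call."""
    await asyncio.sleep(0)
    return {"agent": name, "findings": []}

results = asyncio.run(run_deep_dives(
    [lambda n=n: demo_agent(n) for n in ("wire_pairing", "cash_structuring")]
))
```

The semaphore is what turns the rate-limit constraint into a structural property of the pipeline rather than a tuning knob on each agent.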

05 · Key Insight

When AI handles high-stakes analytical work, the most important design decisions are not about what the AI does — they're about what it's explicitly prohibited from doing and where human judgment is preserved.