Value Lab
Interview-to-Consulting Report Generation Pipeline
Raw client discovery transcripts → multi-section AI consulting report with problem analysis, solution cards, and executive summary → delivered without manual synthesis or writing.
01 — The Problem
After a client discovery session, consulting teams face a significant synthesis burden: transcripts must be analyzed, problems extracted, solutions ideated, and a polished multi-section report written, all before the next client touchpoint. This work is high-effort, time-sensitive, and requires consistent quality across engagements. Without automation, the quality and speed of report delivery depend heavily on individual consultant bandwidth and writing skill, creating engagement-to-engagement variability and compressing the time available for higher-value advisory work.
02 — What the AI Does
Given a client email address and an interview transcript, the pipeline:
1. Extracts stakeholder profiles from the transcript (roles, objectives, pain points)
2. Identifies outcome-driven problem statements using the ODI (Outcome-Driven Innovation) framework
3. Researches the client organization via Perplexity (org snapshot + deep strategic research)
4. Generates a job map and scores outcome statements by importance/urgency for each problem
5. Ideates AI and process solution concepts per problem statement (mapped via a Map Node for parallel processing)
6. Filters and deduplicates ~50–70 raw solution ideas into a ranked, relevance-mapped portfolio
7. Formats solution cards in a structured executive format
8. Drafts all report sections: Executive Summary, Core Themes, Background & Context, Areas of Discussion, Quick Win Solutions, Solutions Introduction, Implementation Roadmap, Appendix (Impact vs. Effort Matrix), and Cover/TOC
9. Assembles all sections into a single final report output
Models used: Claude Opus 4.5, Claude Sonnet 4.5, Claude Haiku 4.5, GPT-5 Mini, Perplexity Sonar. Each node uses the model calibrated to its task complexity and cost profile.
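A minimal sketch of this top-level flow, with hypothetical stub functions standing in for the real nodes so the chain runs end to end; all names are illustrative, and the actual system is a node graph with parallel branches, not a linear script:

```python
# Hypothetical stubs stand in for the real workflow nodes.
def extract_stakeholders(transcript: str) -> list[str]:
    return ["COO (participant)"]

def identify_problems(transcript: str, stakeholders: list[str]) -> list[str]:
    return ["When the COO tries to forecast demand, "
            "they are not adequately achieving forecast accuracy."]

def research_org(client_email: str) -> str:
    return "org snapshot + deep strategic research"

def ideate_solutions(problems: list[str]) -> list[str]:
    return [f"solution concept for: {p}" for p in problems]

def dedupe_and_rank(ideas: list[str]) -> list[str]:
    return sorted(set(ideas))

def draft_sections(problems, research, cards) -> dict[str, str]:
    return {"Executive Summary": "...", "Implementation Roadmap": "..."}

def run_pipeline(client_email: str, transcript: str) -> str:
    stakeholders = extract_stakeholders(transcript)
    problems = identify_problems(transcript, stakeholders)
    research = research_org(client_email)
    cards = dedupe_and_rank(ideate_solutions(problems))
    sections = draft_sections(problems, research, cards)
    return "\n\n".join(f"{name}\n{body}" for name, body in sections.items())

print(run_pipeline("client@example.com", "…transcript text…"))
```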
03 — Design Decisions
Each problem statement runs its own Job Map → Outcome Analysis → Solution Ideation subworkflow in parallel (up to 5 concurrent)
Solution quality degrades when a single prompt handles all problems at once — parallel isolation forces the model to reason deeply about each job-to-be-done independently
Max concurrency capped at 5 to balance speed against API rate limits
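A minimal sketch of this fan-out pattern using asyncio with a semaphore capping concurrency at 5; solve_problem is a hypothetical stand-in for the Job Map → Outcome Analysis → Solution Ideation chain:

```python
import asyncio

MAX_CONCURRENCY = 5  # the pipeline's cap: speed vs. API rate limits

async def solve_problem(statement: str) -> dict:
    # Stand-in for the three sequential LLM calls per problem statement.
    await asyncio.sleep(0.1)
    return {"problem": statement, "solutions": ["..."]}

async def fan_out(problem_statements: list[str]) -> list[dict]:
    sem = asyncio.Semaphore(MAX_CONCURRENCY)

    async def bounded(statement: str) -> dict:
        async with sem:  # at most 5 subworkflows in flight at once
            return await solve_problem(statement)

    return await asyncio.gather(*(bounded(s) for s in problem_statements))

results = asyncio.run(fan_out([f"problem {i}" for i in range(12)]))
print(len(results))
```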
A dedicated node (SEC07A) deduplicates ~50–70 raw solution cards before formatting, using semantic clustering then relevance scoring against problem statements
Without deduplication, downstream formatting and quick-win selection receive redundant, overlapping cards that dilute the portfolio's credibility
Output card count must match problem statement count (1:1 assignment); no silent eliminations — every card must appear in an audit trail
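A sketch of the dedup-with-audit-trail idea, using simple word-overlap similarity as a stand-in for the node's semantic clustering; function names and the threshold are illustrative:

```python
# Stand-in similarity: the real node clusters on embeddings, not word overlap.
def similarity(a: str, b: str) -> float:
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

def dedupe(cards: list[dict], threshold: float = 0.6) -> tuple[list[dict], list[dict]]:
    kept: list[dict] = []
    audit: list[dict] = []  # every card lands here: no silent eliminations
    for card in sorted(cards, key=lambda c: c["relevance"], reverse=True):
        dup_of = next((k for k in kept
                       if similarity(card["text"], k["text"]) >= threshold), None)
        if dup_of is None:
            kept.append(card)
            audit.append({"card": card["text"], "action": "kept"})
        else:
            audit.append({"card": card["text"], "action": "merged",
                          "into": dup_of["text"]})
    return kept, audit
```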
Two Perplexity nodes handle org snapshot and deep research, with explicit recency and attribution requirements baked into the prompts
General LLMs hallucinate or produce stale org facts; Perplexity's web-grounded retrieval enforces recency and source attribution
Prompts explicitly prohibit fabrication and require source + recency notation per finding
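A hedged sketch of one research call: Perplexity exposes an OpenAI-compatible chat endpoint, but the model name and the system prompt below are an illustrative reconstruction of the recency/attribution requirements, not the pipeline's actual prompt:

```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["PERPLEXITY_API_KEY"],
                base_url="https://api.perplexity.ai")

SYSTEM = (
    "Research the organization below using current web sources. "
    "Do not fabricate. For every finding, cite the source and note how "
    "recent it is. If a fact cannot be verified, say so explicitly."
)

resp = client.chat.completions.create(
    model="sonar",
    messages=[{"role": "system", "content": SYSTEM},
              {"role": "user", "content": "Org snapshot for: acme.com"}],
)
print(resp.choices[0].message.content)
```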
IDProblemStatements uses a structured ODI template ("When [executor] tries to [job], they are not adequately achieving [outcomes]...") with a validation checklist per statement
Freeform problem extraction produces vague, solution-contaminated statements; ODI syntax forces job-executor clarity and outcome specificity that downstream solution ideation depends on
Each statement must pass a 6-point validation checklist; stakeholder assignment is required for every problem
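A sketch of validating statements against the template; the real checklist has six points, and the three checks below (template match, stakeholder grounding, solution contamination) plus the wordlist are illustrative:

```python
import re

ODI_PATTERN = re.compile(
    r"^When (?P<executor>.+?) tries to (?P<job>.+?), "
    r"they are not adequately achieving (?P<outcomes>.+)$"
)

def validate_statement(statement: str, stakeholders: set[str]) -> list[str]:
    failures = []
    m = ODI_PATTERN.match(statement)
    if not m:
        return ["does not follow the ODI template"]
    if m["executor"].strip() not in stakeholders:
        failures.append("executor is not an identified stakeholder")
    if any(w in statement.lower() for w in ("should use", "needs a tool", "platform")):
        failures.append("solution-contaminated: names a fix instead of an outcome gap")
    return failures

print(validate_statement(
    "When the COO tries to forecast demand, "
    "they are not adequately achieving forecast accuracy.",
    {"the COO"},
))  # -> [] (passes)
```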
StakeholderSNAPSHOT runs before IDProblemStatements and feeds into it
Problem statements without stakeholder grounding produce generic outputs; the stakeholder list anchors problem ownership and prevents role conflation
Explicit rule to distinguish interview participants from referenced stakeholders — prevents the common error of treating mentioned roles as present speakers
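One way to encode the participant-vs-referenced distinction, as a hypothetical data model:

```python
from dataclasses import dataclass, field

@dataclass
class Stakeholder:
    name: str
    role: str
    objectives: list[str] = field(default_factory=list)
    pain_points: list[str] = field(default_factory=list)
    is_participant: bool = False  # spoke in the interview vs. merely mentioned

def participants(stakeholders: list[Stakeholder]) -> list[Stakeholder]:
    # Only actual speakers can be quoted or assigned firsthand pain points.
    return [s for s in stakeholders if s.is_participant]

team = [Stakeholder("Ana", "COO", is_participant=True),
        Stakeholder("Ben", "Head of Ops")]  # referenced, not present
print([s.name for s in participants(team)])  # -> ['Ana']
```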
Each report section is a separate node with its own system prompt, length constraints, and explicit prohibition lists
A single "write the whole report" prompt produces inconsistent section lengths, tone drift, and cross-contamination between sections; isolation enforces section-specific quality bars
Many sections have hard word/paragraph limits (e.g., Areas of Discussion: 120–150 words, one paragraph only) and explicit "do not include" lists
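A sketch of a per-section spec carrying its limits and prohibition list; the SectionSpec shape and check function are hypothetical, with the Areas of Discussion constraints taken from above:

```python
from dataclasses import dataclass, field

@dataclass
class SectionSpec:
    name: str
    system_prompt: str
    min_words: int
    max_words: int
    prohibited: list[str] = field(default_factory=list)

AREAS = SectionSpec(
    name="Areas of Discussion",
    system_prompt="Write exactly one paragraph summarizing discussion areas.",
    min_words=120, max_words=150,
    prohibited=["solution", "recommendation"],  # no bleed-through from later sections
)

def check(spec: SectionSpec, draft: str) -> list[str]:
    issues = []
    n = len(draft.split())
    if not (spec.min_words <= n <= spec.max_words):
        issues.append(f"{spec.name}: {n} words, expected {spec.min_words}-{spec.max_words}")
    issues += [f"{spec.name}: prohibited term '{t}'"
               for t in spec.prohibited if t in draft.lower()]
    return issues

print(check(AREAS, "A short draft with a recommendation."))
```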
Some nodes reference upstream outputs via LazyReference rather than direct imports to avoid circular dependency issues in the graph
The workflow graph has complex convergence patterns; LazyReference allows referencing nodes that would otherwise create import cycles
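One plausible shape for a LazyReference, assuming a shared results store keyed by node id; the class and resolution mechanics are an illustration of the pattern, not the framework's actual implementation:

```python
class LazyReference:
    def __init__(self, node_id: str):
        self.node_id = node_id

    def resolve(self, results: dict) -> object:
        # Looked up only at run time, so declaring the edge never imports
        # the upstream node's module and cannot create an import cycle.
        return results[self.node_id]

# A downstream section node depends on SEC07A's output without importing it:
cards_ref = LazyReference("SEC07A")
results = {"SEC07A": ["ranked card 1", "ranked card 2"]}
print(cards_ref.resolve(results))
```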
Haiku for cover/TOC and solutions intro (low complexity, high volume); Sonnet for core themes and areas of discussion (balanced); Opus for background/context synthesis and stakeholder extraction (high nuance); GPT-5 Mini for solution ideation (instruction-following at scale); Perplexity Sonar for research
Cost and latency optimization — using Opus everywhere would be prohibitively slow and expensive for a pipeline this long
Claude models configured with max_tokens=64000; temperature set to 0 or near-0 for deterministic analytical sections, slightly higher for narrative sections
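A sketch of the resulting routing table; node names, model id strings, and the exact temperatures are illustrative, with max_tokens applied to the Claude nodes as noted above:

```python
MODEL_CONFIG = {
    "CoverTOC":            {"model": "claude-haiku-4-5",  "temperature": 0.0},
    "SolutionsIntro":      {"model": "claude-haiku-4-5",  "temperature": 0.4},
    "CoreThemes":          {"model": "claude-sonnet-4-5", "temperature": 0.3},
    "AreasOfDiscussion":   {"model": "claude-sonnet-4-5", "temperature": 0.0},
    "BackgroundContext":   {"model": "claude-opus-4-5",   "temperature": 0.3},
    "StakeholderSNAPSHOT": {"model": "claude-opus-4-5",   "temperature": 0.0},
    "SolutionIdeation":    {"model": "gpt-5-mini",        "temperature": 0.0},
    "OrgResearch":         {"model": "sonar",             "temperature": 0.0},
}
MAX_TOKENS = 64000  # applied to the Claude nodes, per the config above

def config_for(node_id: str) -> dict:
    cfg = dict(MODEL_CONFIG[node_id])
    if cfg["model"].startswith("claude"):
        cfg["max_tokens"] = MAX_TOKENS
    return cfg

print(config_for("CoreThemes"))
```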
04 — Key Insight
The highest-leverage design decision in a long-chain AI pipeline is not model selection — it's enforcing structural constraints at each node that prevent upstream errors from compounding into downstream garbage.