JTBD Mapper
JTBD Outcome Statement Generator by Role
Structured JTBD research for organizational roles is slow and expensive to produce manually → AI maps functional areas, sub-roles, and ODI-compliant outcome statements at scale → practitioners get a complete Jobs-to-be-Done research artifact without manual interview synthesis.
01 — The Problem
Conducting Jobs-to-be-Done research using Outcome-Driven Innovation methodology is labor-intensive: it requires identifying functional areas for a role, mapping every sub-role within those areas, constructing a Universal Job Map, and then generating hundreds of outcome statements that conform to strict ODI syntax. Without automation, this work is typically done by trained consultants over days or weeks, and the quality of output depends heavily on the practitioner's familiarity with ODI methodology. The process is also difficult to scale — running it for one role is expensive; running it for a dozen roles across industries and company sizes is prohibitive.
02 — What the AI Does
The system performs five sequential AI tasks, each feeding the next:

1. Identifies functional areas — Given a role, industry, and company size, generates a structured JSON list of distinct operational domains that role oversees.
2. Maps sub-roles — Enumerates every role and sub-role within those functional areas, with reporting relationships and responsibilities, tailored to the specific industry and company size context.
3. Extracts a flat role array — Parses the role-mapping JSON into a list of role titles for parallel processing.
4. Generates Job Maps (per role, in parallel) — For each role, produces a Universal Job Map with all 8 ODI steps (Define → Locate → Prepare → Confirm → Execute → Monitor → Modify → Conclude), each with 5 sub-steps.
5. Generates outcome statements (per role, in parallel) — For each sub-step in each Job Map, produces 10 ODI-compliant outcome statements following strict syntax: "Minimize the time it takes to + [object of control] + [contextual clarifier]" or "Minimize the likelihood of + [object of control] + [contextual clarifier]."

Models used:
- gpt-5-nano-responses — the high-volume parallel generation tasks (Job Map structure, outcome statements), where speed and cost matter at scale.
- gpt-5-responses (functional area identification) and gpt-5.1 (role mapping) — the earlier, lower-volume steps where richer reasoning is warranted. [Creator: confirm exact model selection rationale — was this cost optimization, capability matching, or both?]

Infrastructure:
- Built on the Vellum Workflows SDK, with a Map Node enabling up to 10 concurrent subworkflow executions per run.
- Final output is written to Azure Cosmos DB via a Code Execution Node using the azure-cosmos and azure-identity packages.
- All outputs are structured JSON throughout; json_mode: true is enforced at every prompt node.
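The ODI syntax in stage 5 is rigid enough to express as a string template. A minimal sketch (function and template names are hypothetical, not from the actual workflow):

```python
# Hypothetical templates mirroring the two allowed ODI statement shapes.
TEMPLATES = {
    "time": "Minimize the time it takes to {obj} {clarifier}",
    "likelihood": "Minimize the likelihood of {obj} {clarifier}",
}

def outcome_statement(metric: str, object_of_control: str, clarifier: str) -> str:
    """Compose one ODI-compliant outcome statement from its three parts."""
    return TEMPLATES[metric].format(obj=object_of_control, clarifier=clarifier)

# e.g. outcome_statement("time", "reconcile intercompany balances", "at month-end close")
# → "Minimize the time it takes to reconcile intercompany balances at month-end close"
```

In the real system the model generates these parts, but the fixed prefixes are what the prompt constraints enforce.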
03 — Design Decisions
Every prompt node that generates outcome statements is persona-prompted as Tony Ulwick and enforces exact ODI syntax rules, not just "write outcome statements."
ODI outcome statements have a precise structure that generic LLM output violates. Without hard syntax constraints, models produce aspirational or feature-oriented statements that are methodologically invalid. [Creator: add rationale — was this based on observed model failures in testing?]
Outputs that don't follow the "Minimize the time it takes to + object + clarifier" or "Minimize the likelihood of + object + clarifier" structure are structurally non-compliant with ODI and unusable in downstream research.
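Compliance with the two allowed shapes is mechanically checkable. A sketch of such a validator (this is an illustration, not a check the workflow is documented as running):

```python
import re

# Matches only the two ODI statement prefixes, followed by non-empty content.
ODI_PATTERN = re.compile(r"^Minimize the (time it takes to|likelihood of) \S.+$")

def is_odi_compliant(statement: str) -> bool:
    """True if the statement follows one of the two strict ODI shapes."""
    return bool(ODI_PATTERN.match(statement.strip()))
```

A guard like this could gate regeneration when a model drifts into aspirational or feature-oriented phrasing.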
Rather than processing roles sequentially, the workflow extracts a flat array of roles and fans out into parallel subworkflow executions (max concurrency: 10).
A single role mapping for a CFO in financial services may produce 20–40 sub-roles. Sequential processing would make the workflow prohibitively slow. Parallelism makes the tool practical at the scale it's designed for.
Each subworkflow is self-contained; it receives only item (the role title), items, and index — plus parent workflow inputs for industry and company size — preventing cross-contamination between role outputs.
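A rough analogue of the Map Node fan-out, assuming a per-role subworkflow function (names are hypothetical; the real fan-out is Vellum's Map Node, not a thread pool):

```python
from concurrent.futures import ThreadPoolExecutor

def run_subworkflow(item: str, items: list[str], index: int,
                    industry: str, company_size: str) -> dict:
    """Self-contained per-role subworkflow: it sees only its own role title
    plus the shared parent inputs, so role outputs cannot cross-contaminate."""
    return {"role": item, "index": index,
            "industry": industry, "company_size": company_size}

def fan_out(roles: list[str], industry: str, company_size: str) -> list[dict]:
    # Mirrors the Map Node's cap of 10 concurrent subworkflow executions.
    with ThreadPoolExecutor(max_workers=10) as pool:
        futures = [pool.submit(run_subworkflow, role, roles, i,
                               industry, company_size)
                   for i, role in enumerate(roles)]
        return [f.result() for f in futures]  # preserves submission order
```

Collecting results in submission order keeps output stable even when roles finish out of order.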
Inside the Map Node, Job Map structure generation (JTBDMapStructure) runs before outcome statement generation (JTBDOutcomeStatements), with the structured JSON passed as input to the second stage.
Generating outcomes without a grounding structure produces generic, non-step-specific statements. By first producing a validated Job Map, the outcome generation prompt has precise sub-step context to work from. [Creator: add rationale — was this a quality finding from early testing?]
The outcome node receives the full Job Map JSON as input, forcing it to generate outcomes that are anchored to specific sub-steps rather than the role in the abstract.
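One way the sub-step anchoring could work: walk the Job Map JSON and build a generation context per sub-step. The JSON shape below is assumed for illustration; the actual schema may differ:

```python
# Assumed Job Map shape: {"steps": [{"name": ..., "sub_steps": [...]}, ...]}
def outcome_contexts(job_map: dict, role: str) -> list[dict]:
    """Expand a Job Map into one outcome-generation context per sub-step,
    so each batch of 10 statements is tied to a concrete sub-step."""
    contexts = []
    for step in job_map["steps"]:
        for sub_step in step["sub_steps"]:
            contexts.append({
                "role": role,
                "step": step["name"],
                "sub_step": sub_step,
                "n_statements": 10,  # 10 outcome statements per sub-step
            })
    return contexts
```

With a full 8-step, 5-sub-step map this yields 40 contexts per role, i.e. 400 outcome statements.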
Every prompt node uses json_mode: true; the final assembly uses a Jinja2 templating node to normalize and wrap outputs.
The output feeds directly into Cosmos DB and downstream consumption. Unstructured text would require post-processing and introduce failure points. [Creator: add rationale]
The templating node includes defensive logic to handle cases where the JTBD source is a mapping, sequence, or string — preventing silent failures when upstream JSON shape varies.
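The defensive shape-handling can be sketched in Python (the actual node applies equivalent logic in Jinja2):

```python
import json

def normalize_jtbd(source) -> list[dict]:
    """Coerce the upstream JTBD payload into a list of dicts regardless of
    whether it arrives as a mapping, a sequence, or a JSON string."""
    if isinstance(source, str):
        source = json.loads(source)          # string -> parsed JSON
    if isinstance(source, dict):
        return [source]                      # single mapping -> one-item list
    if isinstance(source, list):
        # wrap any non-dict members so downstream code sees a uniform shape
        return [x if isinstance(x, dict) else {"value": x} for x in source]
    raise TypeError(f"Unsupported JTBD payload type: {type(source).__name__}")
```

Raising on a truly unexpected type (rather than passing it through) is what turns a silent failure into a visible one.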
The workflow writes each JTBD item as a separate document to Azure Cosmos DB (with UUID, partition key, and PST timestamp), running in parallel with the final output node.
[Creator: add rationale — is this for downstream querying, audit trail, integration with another system?]
The write node runs in parallel with FinalOutput, meaning the workflow doesn't block on DB write success — a failure in Cosmos DB write does not prevent the workflow from returning its output.
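A sketch of the per-item write, split so document assembly is separable from the Cosmos client call. Database, container, and account names are placeholders; the real node runs inside a Vellum Code Execution Node and authenticates via azure-identity:

```python
import uuid
from datetime import datetime
from zoneinfo import ZoneInfo

def build_document(jtbd_item: dict, partition_key: str) -> dict:
    """Wrap one JTBD item as a Cosmos DB document with a UUID id,
    partition key, and PST timestamp."""
    return {
        "id": str(uuid.uuid4()),
        "partitionKey": partition_key,
        "createdAt": datetime.now(ZoneInfo("America/Los_Angeles")).isoformat(),
        **jtbd_item,
    }

def write_items(items: list[dict], partition_key: str) -> None:
    # Requires: pip install azure-cosmos azure-identity
    from azure.cosmos import CosmosClient
    from azure.identity import DefaultAzureCredential
    client = CosmosClient("https://<account>.documents.azure.com",
                          credential=DefaultAzureCredential())
    container = (client.get_database_client("jtbd")        # placeholder names
                       .get_container_client("outcomes"))
    for item in items:
        container.upsert_item(build_document(item, partition_key))
```

Keeping build_document pure makes the document shape testable without an Azure connection.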
04 — Test Scenarios

Five pre-configured scenarios (CFO, CISO, CIO, CRO across financial services, high tech, and PE-backed contexts) are defined in sandbox.py.
These represent the actual use cases the tool was built for, enabling repeatable testing across the most relevant configurations without re-entering inputs.
Scenario labels are stable, preserving mock data and test history across workflow iterations.
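A scenario registry of this kind might look roughly like the following; the labels, field names, and five role/context pairings shown are illustrative, not the actual contents of sandbox.py:

```python
# Illustrative sketch of sandbox.py's pre-configured scenarios. Stable labels
# keep mock data and test history attached across workflow iterations.
SCENARIOS = {
    "cfo-finserv":   {"role": "CFO",  "industry": "financial services", "company_size": "enterprise"},
    "cfo-pe":        {"role": "CFO",  "industry": "high tech",          "company_size": "PE-backed"},
    "ciso-hightech": {"role": "CISO", "industry": "high tech",          "company_size": "mid-market"},
    "cio-pe":        {"role": "CIO",  "industry": "high tech",          "company_size": "PE-backed"},
    "cro-finserv":   {"role": "CRO",  "industry": "financial services", "company_size": "enterprise"},
}

def scenario_inputs(label: str) -> dict:
    return dict(SCENARIOS[label])  # copy so callers can't mutate the registry
```

Returning a copy keeps test runs from accidentally altering the shared fixtures.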
05 — Key Insight
When a methodology has strict formal rules (like ODI outcome syntax), the highest-leverage AI design decision is embedding those rules as hard constraints in the prompt — not trusting the model to infer them from examples — because LLMs will produce plausible-sounding but methodologically invalid outputs unless the constraint is explicit and repeated at every generation step.