Dev Skills GSD

GSD + Agent Skills Hybrid Plugin for Claude Code

A Claude Code plugin combining GSD's execution orchestration (wave-based parallel agents, context monitoring, state management) with Agent Skills' quality enforcement (phase gates, anti-rationalization tables, 21 skills) into one unified development workflow.

01 — The Problem

AI coding agents have two independent failure modes that current frameworks don't solve together: quality shortcuts (agents skip specs, tests, security — solved by Agent Skills' phase gates) and context degradation over long sessions (solved by GSD's fresh context per task and file-based state). Neither framework alone is sufficient — Agent Skills has no execution engine, and GSD has no quality opinion.

02 — What the AI Does

This project is pre-code: it contains the full specification and architecture plan for building the plugin, but no implementation yet. The plan defines a six-phase workflow (DISCUSS → PLAN → EXECUTE → VERIFY → REVIEW → SHIP), 24 specialized agents (21 from GSD plus 3 Agent Skills specialists: code-reviewer, test-engineer, security-auditor), on-demand skill loading per phase, file-based state management via the `.planning/` directory, anti-rationalization tables at every gate, and Nyquist test-coverage validation. Targets: an install that takes under 5 minutes, and Node.js 22+. Built on: GSD (orchestration layer), Agent Skills (quality layer), and the Claude Code plugin architecture.
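Because no implementation exists yet, the following is only a minimal sketch of how the six-phase sequence could be modeled. The phase order comes from the plan; the names `PHASES`, `nextPhase`, and `advance` are hypothetical.

```typescript
// Phase order is taken from the plan; all identifiers are illustrative.
const PHASES = ["DISCUSS", "PLAN", "EXECUTE", "VERIFY", "REVIEW", "SHIP"] as const;
type Phase = (typeof PHASES)[number];

// Returns the phase that follows `current`, or null once SHIP is reached.
function nextPhase(current: Phase): Phase | null {
  const next = PHASES[PHASES.indexOf(current) + 1];
  return next ?? null;
}

// Moves forward only when the gate for the current phase has been approved.
function advance(current: Phase, gateApproved: boolean): Phase {
  if (!gateApproved) return current;
  return nextPhase(current) ?? current;
}
```

Keeping the order in a single constant means a gate check never has to hard-code which phase comes next.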

03 — Design Decisions

01 · Choice

Six-phase workflow with REVIEW as the key addition

Why

GSD goes straight from "does it work?" to "ship it." Agent Skills adds the critical "is it good?" phase between those two. REVIEW covers code quality, security hardening, performance, and simplification: the difference between shipping code that functions and code that lasts. Whether this phase sequence was tested against any real project before it was committed to the spec is not recorded.

02 · Choice

On-demand skill loading per phase, never all 21 at once

Why

Loading all 21 Agent Skills skills simultaneously would blow the context budget. The loader binds skills to phases — only the skills relevant to the current phase are injected into the agent context.
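One way to picture the phase binding is a lookup table keyed by phase. This is a sketch under assumptions: only the two skill names shown appear in the plan, their phase assignments are a guess based on what REVIEW and SHIP cover, and `SKILLS_BY_PHASE` and `skillsFor` are hypothetical names.

```typescript
type Phase = "DISCUSS" | "PLAN" | "EXECUTE" | "VERIFY" | "REVIEW" | "SHIP";

// Assumed binding table. The empty entries stand in for skills the plan
// does not name; the two placements below are illustrative, not specified.
const SKILLS_BY_PHASE: Record<Phase, readonly string[]> = {
  DISCUSS: [],
  PLAN: [],
  EXECUTE: [],
  VERIFY: [],
  REVIEW: ["security-and-hardening"],
  SHIP: ["shipping-and-launch"],
};

// Returns only the skills bound to the given phase; nothing else is injected.
function skillsFor(phase: Phase): readonly string[] {
  return SKILLS_BY_PHASE[phase];
}
```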

Constraint

Token cost of loading the larger skills (security-and-hardening, shipping-and-launch) is the highest-identified risk and needs measurement in Phase 1 before building the loader.
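The measurement could start with something as simple as the sketch below. It relies on a rough rule of thumb of about four characters per token, which is an assumption and not the model's real tokenizer; an actual measurement should use a proper token counter. `estimateTokens` and `fitsBudget` are hypothetical helpers.

```typescript
// Rough heuristic only: about 4 characters per token (an assumption).
const CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

interface SkillCost {
  skill: string;
  tokens: number;
}

// Estimates the cost of each skill body and checks the sum against a budget.
function fitsBudget(
  skillBodies: Record<string, string>,
  budget: number
): { costs: SkillCost[]; total: number; fits: boolean } {
  const costs = Object.keys(skillBodies).map((skill) => ({
    skill,
    tokens: estimateTokens(skillBodies[skill] ?? ""),
  }));
  const total = costs.reduce((sum, c) => sum + c.tokens, 0);
  return { costs, total, fits: total <= budget };
}
```

Running this over the larger skill files for each phase would show early whether any phase's skill set exceeds the context share reserved for it.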

03 · Choice

File-based state via GSD's `.planning/` directory pattern

Why

Git-backed state persistence means the planning context survives session boundaries. Agents read/write state files rather than relying on in-memory context. This is the core context durability mechanism.
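A minimal sketch of the read/write pattern, assuming a JSON file inside `.planning/`. The plan does not define the file name or schema, so `state.json`, `PlanningState`, `saveState`, and `loadState` are all hypothetical.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical state shape; the plan does not specify a schema.
interface PlanningState {
  phase: string;
  completedTasks: string[];
}

const STATE_FILE = "state.json"; // assumed file name

// Writes state under <root>/.planning/, creating the directory if needed.
function saveState(root: string, state: PlanningState): void {
  const dir = path.join(root, ".planning");
  fs.mkdirSync(dir, { recursive: true });
  fs.writeFileSync(path.join(dir, STATE_FILE), JSON.stringify(state, null, 2), "utf8");
}

// Reads state back, or returns null when nothing has been written yet.
function loadState(root: string): PlanningState | null {
  const file = path.join(root, ".planning", STATE_FILE);
  if (!fs.existsSync(file)) return null;
  return JSON.parse(fs.readFileSync(file, "utf8")) as PlanningState;
}
```

Because the state lives in a plain file, a fresh agent session can call `loadState` and pick up where the previous one stopped, and committing the file to Git preserves the history.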

Constraint

The state management code doesn't exist yet — it's a WBS item (1.2) that must be built before phase commands (2.0) can be implemented.

04 · Choice

Integration over reinvention

Why

Both source frameworks are MIT licensed and actively maintained. Building on top of them means the plugin benefits from upstream improvements automatically, and the implementation surface is smaller (wiring existing pieces together, not rewriting them).

Constraint

The plugin is constrained to integration — not reinvention of either framework. Any feature that requires rewriting GSD or Agent Skills is out of scope.

05 · Choice

Self-bootstrapping development strategy

Why

Once the DISCUSS and PLAN commands are functional, the plugin uses itself to plan the remaining phases. This validates whether the architecture holds under actual use — if the workflow can't build itself, it's probably too complex.

Constraint

This only works if DISCUSS and PLAN are built first and actually functional. The bootstrapping strategy depends on those two commands being correct, creating pressure to get them right early.

06 · Choice

Critical path runs through Phase Commands (WBS 2.0)

Why

The commands are where the two frameworks' philosophies collide — that's where integration design gets proven or broken. Building DISCUSS first validates the full pattern: command → loads skills → spawns agents → gates approval.
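The four-step pattern can be sketched as a function that receives each step as a dependency. The plan names the steps but not an API, so `CommandDeps`, `CommandResult`, and `runPhaseCommand` are hypothetical.

```typescript
// Each step is injected so the pattern can be exercised before real
// loaders and agents exist. All names are illustrative.
interface CommandDeps {
  loadSkills: (phase: string) => string[];
  spawnAgents: (phase: string, skills: string[]) => string[]; // agent outputs
  approveGate: (phase: string, outputs: string[]) => boolean;
}

interface CommandResult {
  phase: string;
  skills: string[];
  outputs: string[];
  approved: boolean;
}

function runPhaseCommand(phase: string, deps: CommandDeps): CommandResult {
  const skills = deps.loadSkills(phase);             // 1. load only this phase's skills
  const outputs = deps.spawnAgents(phase, skills);   // 2. spawn agents with those skills
  const approved = deps.approveGate(phase, outputs); // 3. gate: approve or block
  return { phase, skills, outputs, approved };
}
```

Building DISCUSS against stubbed dependencies like these would exercise the whole chain before the agents and skills work begins.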

Constraint

If DISCUSS fails, WBS items 3.0 (agents) and 4.0 (skills), which run in parallel once 1.0 is complete, will also need redesign.
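The ordering described for the work breakdown can be written down as a small dependency table. This is a simplified sketch: it treats 2.0 as waiting on all of 1.0 (the plan only names item 1.2 as the prerequisite), and `DEPENDS_ON` and `readyItems` are hypothetical names.

```typescript
// Simplified dependencies: 2.0, 3.0 and 4.0 all wait on 1.0.
const DEPENDS_ON: Record<string, string[]> = {
  "1.0": [],
  "2.0": ["1.0"],
  "3.0": ["1.0"],
  "4.0": ["1.0"],
};

// Returns the items that are not done yet and whose dependencies are all done.
function readyItems(done: string[]): string[] {
  return Object.keys(DEPENDS_ON).filter(
    (item) =>
      done.indexOf(item) === -1 &&
      (DEPENDS_ON[item] ?? []).every((dep) => done.indexOf(dep) !== -1)
  );
}
```

With nothing done, only 1.0 is ready; once 1.0 is done, 2.0, 3.0, and 4.0 all become ready together, which is the parallelism the constraint refers to.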

07 · Choice

Multi-runtime compatibility deferred to v1.1

Why

Getting the architecture right for Claude Code first is already complex enough. The Cursor/Codex adapter layer is explicitly out of scope for v1.

Constraint

Users on other runtimes must wait. If Claude Code changes its plugin API, the adapter layer design may need revision.

04 — Tradeoffs & Limits

- **Project is pre-code.** The spec, architecture diagrams, work breakdown structure, and risk register exist. No implementation code exists. This is a planning artifact, not a working tool.
- **Token budget is the highest structural risk.** The spec identifies that loading Agent Skills' richer skill files could blow the context budget. If they can't be loaded without truncation, the entire on-demand loading architecture needs redesign.
- **24 agents create coordination complexity.** Even with GSD's wave-based execution pattern, managing 24 agents with scoped tool permissions adds complexity. The spec acknowledges this as a medium-high risk.
- **Self-bootstrapping creates a chicken-and-egg problem.** The plugin can't use itself to build Phase 2+ features until Phase 1 (DISCUSS/PLAN) is complete. Early development must proceed manually.
- **No dogfood validation yet.** The integration has never been run through an actual 6-phase build cycle. Dogfooding is WBS item 5.3; it's the real test of whether the architecture works.
- **REVIEW phase design is underspecified.** The spec asks "what's the right granularity: one command with 4 sub-steps or 4 separate commands?" This is an open design question that needs a decision before building 2.6.

05 — Key Insight

The hardest integration challenge in combining GSD and Agent Skills isn't technical — it's philosophical. GSD's philosophy is "ship fast and iterate." Agent Skills' philosophy is "verify before advancing." The phase gate between VERIFY and REVIEW is where these philosophies most directly conflict: how do you enforce a quality gate without turning every verification failure into an infinite loop? Resolving that tension defines the architecture.
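One possible way to keep a gate strict without looping forever is to cap the number of fix-and-retry rounds and hand off to a human once the cap is hit. The sketch below illustrates that idea only; it is not the resolution chosen by the spec, and `runGate`, `verify`, `fix`, and `maxAttempts` are hypothetical.

```typescript
type GateOutcome = "passed" | "escalated";

// Runs `verify` up to `maxAttempts` times, calling `fix` between failed
// attempts. Escalates instead of retrying indefinitely.
function runGate(
  verify: () => boolean,
  fix: () => void,
  maxAttempts: number
): { outcome: GateOutcome; attempts: number } {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (verify()) return { outcome: "passed", attempts: attempt };
    if (attempt < maxAttempts) fix();
  }
  return { outcome: "escalated", attempts: maxAttempts };
}
```

The cap is where the two philosophies are traded against each other: a low value favors shipping speed, a high value favors verification.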