Channel Coach

YouTube Coaching Knowledge Engine

Scrapes and synthesizes long-form video content from selected YouTube channels into actionable coaching playbooks, so coaches stay current without watching every episode.

01 — The Problem

Coaches and advisors need to stay current on the ideas, frameworks, and tactics promoted by the experts they follow — but watching every podcast, interview, and YouTube video is time-prohibitive. The knowledge lives in hours of video, scattered, unorganized, and hard to act on. Coaches either fall behind or spend more time consuming than coaching.

02 — What the AI Does

Monitors YouTube channels (Dan Martell, Joe Hudson, and others) on a 12-hour cron cycle. For each new episode of at least 2 minutes, it pulls the captions via yt-dlp, embeds the transcript text for semantic search, generates a structured analysis (key insights, TL;DR, topics, venture relevance, action items), and writes the result to a knowledge base. It then synthesizes the individual episode analyses into cumulative channel playbooks and sends Slack notifications to designated channels when new episodes are processed. Built on: yt-dlp (caption retrieval), Python with SQLite (episode tracking), embeddings API (semantic search), Slack API (notifications).
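
The episode-tracking step can be illustrated with a minimal sketch. It assumes a single SQLite table keyed by video ID; the table name, column names, and the `open_tracker` and `record_if_new` helpers are hypothetical and not taken from the actual codebase.

```python
import sqlite3


def open_tracker(path=":memory:"):
    """Open (or create) a hypothetical episode-tracking database."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS episodes (
            video_id   TEXT PRIMARY KEY,
            channel    TEXT NOT NULL,
            duration_s INTEGER NOT NULL,
            status     TEXT NOT NULL DEFAULT 'captured'
        )
        """
    )
    return conn


def record_if_new(conn, video_id, channel, duration_s):
    """Insert the episode; return True only the first time it is seen."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO episodes (video_id, channel, duration_s) "
        "VALUES (?, ?, ?)",
        (video_id, channel, duration_s),
    )
    conn.commit()
    # rowcount is 1 when a row was inserted and 0 when the insert was ignored.
    return cur.rowcount == 1
```

Keying on the video ID means a repeated poll that sees the same episode is a no-op, so downstream analysis and notification steps run once per episode.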

03 — Design Decisions

01 · Choice

120-second minimum duration filter (Shorts excluded)

Why

Shorts do not represent a channel's positioning well and would flood the database with low-value content.

Constraint

Only episodes of 2 minutes or longer are processed. This is enforced in monitor.py.
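
A filter of this kind might look like the sketch below. The function names and the episode dictionary shape are assumptions for illustration, not the contents of monitor.py, and the sketch treats an episode of exactly 120 seconds as eligible.

```python
MIN_DURATION_SECONDS = 120  # the 120-second floor from the design decision


def is_eligible(duration_seconds):
    """Return True for episodes at or above the minimum duration."""
    # Assumption: an unknown duration (None) is skipped rather than guessed at.
    if duration_seconds is None:
        return False
    return duration_seconds >= MIN_DURATION_SECONDS


def filter_episodes(episodes):
    """Keep only the episode dicts whose 'duration' passes the floor."""
    return [ep for ep in episodes if is_eligible(ep.get("duration"))]
```

For example, `filter_episodes([{"id": "a", "duration": 45}, {"id": "b", "duration": 600}])` keeps only the second entry.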

02 · Choice

12-hour poll interval with anchor offset

Why

Balances freshness against API rate limits. A 6-hour offset from the LinkedIn cron job spreads the machine's token budget across the day.

Constraint

At most one retry on HTTP 429 (rate limit) errors; no aggressive retry loops that could burn through the token budget.
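
The single-retry constraint can be sketched as follows. `RateLimited` is a hypothetical exception standing in for an HTTP 429 response, and the 30-second wait is an arbitrary placeholder; neither comes from the actual implementation.

```python
import time


class RateLimited(Exception):
    """Hypothetical exception standing in for an HTTP 429 response."""


def call_with_single_retry(request, wait_seconds=30.0, sleep=time.sleep):
    """Call request(); on a 429, wait once and try exactly one more time."""
    try:
        return request()
    except RateLimited:
        sleep(wait_seconds)
        # Second and final attempt: a repeated 429 propagates to the caller,
        # so the job gives up until the next scheduled poll.
        return request()
```

Because there is no loop, the worst case is two requests per call, which keeps the cost of a rate-limited cycle bounded.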

03 · Choice

Per-channel Slack routing

Why

Each YouTube channel is tied to its own Slack channel (#coach-dan-martell, #coach-joe-hudson) so the right coaches see relevant content.

Constraint

Slack notifications only fire if new episodes were actually captured and analyzed.
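
As a rough sketch of the routing and the "only when something new was captured" rule, the function below builds the messages without sending them. The two Slack channel names come from the text above; the `SLACK_ROUTES` mapping, the function name, and the message wording are assumptions, and the actual Slack API call is left out.

```python
SLACK_ROUTES = {
    "Dan Martell": "#coach-dan-martell",
    "Joe Hudson": "#coach-joe-hudson",
}


def build_notifications(new_episodes_by_channel):
    """Return (slack_channel, message) pairs, one per channel with new episodes."""
    notifications = []
    for channel, titles in new_episodes_by_channel.items():
        if not titles:
            continue  # nothing captured and analyzed, so stay silent
        slack_channel = SLACK_ROUTES.get(channel)
        if slack_channel is None:
            continue  # assumption: channels without a route are skipped
        lines = "\n".join(f"- {title}" for title in titles)
        message = f"{len(titles)} new episode(s) from {channel}:\n{lines}"
        notifications.append((slack_channel, message))
    return notifications
```

Separating message construction from delivery keeps the silence rule easy to check: an empty result means no Slack call is made at all.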

04 · Choice

Cumulative playbook synthesis

Why

Individual episode analysis is useful but insufficient — coaches want the evolved position of an expert, not a collection of hot takes. Synthesis builds a coherent picture over time.

Constraint

Synthesis runs as a separate step, only after episode analysis completes, so a playbook is never built without data.

05 · Choice

Notion as the primary structured database, filesystem as the knowledge base

Why

Notion provides a UI Brett can interact with directly; the filesystem provides durable markdown that survives any single tool.

Constraint

Notion stores the structured fields (URL, Status, Channel, Duration, Topics, Ventures, TL;DR, Tags); the filesystem stores the full analysis.
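
The split between the two stores could be sketched like this. The field names in the returned dictionary are the ones listed above; the `EpisodeAnalysis` class, the `"Analyzed"` status value, and the folder layout are hypothetical, and no Notion API call is shown.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class EpisodeAnalysis:
    url: str
    channel: str
    duration_s: int
    tldr: str
    topics: list = field(default_factory=list)
    ventures: list = field(default_factory=list)
    tags: list = field(default_factory=list)
    full_markdown: str = ""


def to_notion_fields(analysis, status="Analyzed"):
    """Structured fields only; the long-form analysis is deliberately left out."""
    return {
        "URL": analysis.url,
        "Status": status,
        "Channel": analysis.channel,
        "Duration": analysis.duration_s,
        "Topics": analysis.topics,
        "Ventures": analysis.ventures,
        "TL;DR": analysis.tldr,
        "Tags": analysis.tags,
    }


def write_markdown(analysis, root, slug):
    """Write the full analysis to root/<channel-folder>/<slug>.md; return the path."""
    folder = Path(root) / analysis.channel.lower().replace(" ", "-")
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{slug}.md"
    path.write_text(analysis.full_markdown, encoding="utf-8")
    return path
```

Under this arrangement the Notion row stays small enough to scan in a table view, while the markdown file remains readable if the Notion workspace is ever unavailable.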

04 — Tradeoffs & Limits

- **Transcript quality varies by video.** yt-dlp pulls auto-generated captions when manual captions aren't available. Auto-captions have meaningful error rates, especially on technical content with jargon or accents.
- **Synthesis is backward-looking.** The playbook represents what the coach has said up to now; it won't surface a new position until enough episodes have been analyzed to detect the shift.
- **Shorts are excluded, but so are very short full videos.** If a channel posts a 3-minute gem, it's captured, but if it's under 2 minutes it's not, and sometimes short formats contain sharp insights.
- **No cross-channel synthesis.** A coach who appears as a guest on another coach's channel doesn't have that episode captured under their channel. Cross-channel synthesis is not yet implemented.
- **No video understanding beyond transcript.** Visual slides, screen recordings, and diagrams aren't captured. If a coach says "look at this framework" and draws it on screen, that's lost.
- **Embedding-based search has no logical inference.** Semantic search finds "things that sound like X" but can't reason about whether X actually applies. The retrieval is probabilistic, not deductive.
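
The last limit in the list is easier to see in code. The sketch below assumes retrieval ranks stored embeddings by cosine similarity to a query embedding, which is a common approach but is not confirmed for this system; the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, indexed):
    """Return (score, name) pairs sorted from most to least similar."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in indexed.items()]
    return sorted(scored, reverse=True)
```

The ranking reflects only how close two vectors point; nothing in it checks whether the retrieved passage actually answers the question, which is why the results need a human or a later reasoning step to judge applicability.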

05 — Key Insight

The value of a coaching knowledge engine compounds over time — but only if the curator explicitly designs what gets excluded as aggressively as what gets included. Curation discipline (minimum duration, 12-hour cadence, single-topic channels) prevents the knowledge base from becoming a graveyard of shallow episode summaries.