OPC Operating System v1.0

The Zeus Playbook

Project-agnostic methodology for AI-native startups. Validated against Anthropic's Founder's Playbook (May 2026), threaded through our GBrain, FEASY Room, War Room, and stage gates. One system, any project.

How We Operate

🧠

Founder = Orchestrator

M~=Vision. Zeus=Hands+Devil's Advocate. AI agents execute. The founder directs systems, not tasks. Every stage has a human checkpoint.

🛡️

Validate Before Build

No line of code before the evidence justifies it. AI compresses build time, making it dangerously easy to skip validation. We don't.

⚔️

Adversarial by Default

AI amplifies confirmation bias. At every gate, we run adversarial checks: argue against the thesis, search for disconfirming evidence, and run the competitor neglect exercise.

📝

Context Prevents Drift

Agentic tech debt kills. Every session ends with updated context docs. 5 minutes of documentation = insurance against architectural drift that compounds silently.

🎯

Stage Gates, Not Vibes

We don't advance on gut feel. Each stage has explicit exit criteria: measurable, adversarially tested, with false-positive definitions built in.

🔄

Methodology > Project

Every framework, gate, and checklist is project-agnostic. What we learn on KNQX feeds the next project. The operating system compounds.

4-Tier Memory Architecture

Persistent knowledge that survives across sessions. The right tier for the right content: no duplication, no bloat.

  • L0 GBrain: Knowledge graph. Project facts, domain expertise, competitive intel. Queried on demand. Unlimited capacity.
  • L1 Core Memory: Behavioral directives only. Who we are, how we communicate. Always in-context. ~400 chars target.
  • L2 Project README: Active project context. Tech stack, decisions, NEXT_STEP. Read before every session.
  • L3 log.md: Detailed session history. What was tried, what failed, decisions. Read only when deep context is needed.
The Rule

If a fact lives in GBrain, it does NOT go in L1. L1 costs tokens every turn. GBrain costs zero until queried. The duplication test: search GBrain first. If found, don't duplicate.
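
As a rough illustration of the rule above, the duplication test and the L1 size target can be sketched in a few lines. This is a toy model, not how GBrain is actually built: the `MemoryStore` class, its method names, and the substring match (standing in for GBrain's semantic search) are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# Taken from the "~400 chars target" stated for L1 Core Memory.
L1_CHAR_TARGET = 400


@dataclass
class MemoryStore:
    """Hypothetical stand-in for L0 (GBrain) and L1 (Core Memory)."""

    gbrain: list[str] = field(default_factory=list)       # L0: queried on demand
    core_memory: list[str] = field(default_factory=list)  # L1: always in context

    def in_gbrain(self, text: str) -> bool:
        # Assumption: a case-insensitive substring match replaces semantic search.
        needle = text.lower()
        return any(needle in entry.lower() for entry in self.gbrain)

    def add_to_core(self, directive: str) -> bool:
        """Apply the duplication test, then the size budget. True if stored."""
        if self.in_gbrain(directive):
            return False  # already lives in L0, so it must not be copied into L1
        used = sum(len(entry) for entry in self.core_memory)
        if used + len(directive) > L1_CHAR_TARGET:
            return False  # would push L1 past its character target
        self.core_memory.append(directive)
        return True
```

Calling `add_to_core` with a fact that already lives in `gbrain` returns `False`, which mirrors "search GBrain first; if found, don't duplicate."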

Framework Arsenal

We don't just have opinions; we have instruments.

🎯

FEASY Room

8-lens rapid feasibility scoring. Market demand, size, competition, capital, cost, complexity, time-to-revenue, risk. Quantified + radar chart.

🎩

War Room

6 Thinking Hats. Full-spectrum perspective analysis: facts, emotion, caution, optimism, creativity, process. Run at every stage gate.

🔀

SCAMPER Room

7 mutation verbs. Substitute, Combine, Adapt, Modify, Put to Other Use, Eliminate, Reverse. For ideation, not analysis.

📋

Stage Gates

Explicit exit criteria for Idea→MVP→Launch→Scale. No advancing without passing. Adversarial check mandatory at every gate.

🧠

GBrain

Persistent knowledge graph. Domain expertise, competitive intel, project context. Semantic search + wikilinks. The compounding memory.

🎨

Brainstorming Gate

Before ANY creative work: explore intent, propose 2-3 approaches, get approval. Hard gate: no skipping to implementation.

⚡

gstack

Full dev lifecycle framework: plan reviews (/plan-ceo-review, /plan-eng-review, /plan-design-review), QA (/qa, /qa-only), code review (/review), ship (/ship), deploy verification (/canary), retros (/retro), context persistence (/context-save, /context-restore), and more.

Room | Best For | When to Run
FEASY Room | Scoring feasibility, comparing ideas | Idea stage entry, gate review
War Room | Full-spectrum perspective on a decision | Every stage gate
SCAMPER Room | Forcing mutations on a concept | Post-validation ideation
Stage Gates | Go/no-go decision before advancing | End of each stage
gstack | Dev lifecycle execution: plan reviews, QA, ship, canary, retros | MVP stage onward (build/ship cycle)
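
To make the FEASY Room's "quantified" scoring concrete, here is a minimal sketch of how eight lens scores could be rolled into one number. The playbook does not specify the scale or weighting, so the 1-5 scale, the equal weights, the convention that a higher score is more favorable (a 5 on `risk` means low risk), and the identifier names are all assumptions.

```python
# The eight lenses named in the FEASY Room description; identifiers are illustrative.
LENSES = (
    "market_demand", "market_size", "competition", "capital",
    "cost", "complexity", "time_to_revenue", "risk",
)


def feasy_score(scores: dict[str, int], scale_max: int = 5) -> float:
    """Return an equal-weight feasibility score as a percentage of the maximum."""
    missing = [lens for lens in LENSES if lens not in scores]
    if missing:
        raise ValueError(f"missing lenses: {missing}")
    for lens in LENSES:
        if not 1 <= scores[lens] <= scale_max:
            raise ValueError(f"{lens} must be between 1 and {scale_max}")
    total = sum(scores[lens] for lens in LENSES)
    return round(100 * total / (scale_max * len(LENSES)), 1)


def weakest_lens(scores: dict[str, int]) -> str:
    """Name the lowest-scoring lens, i.e. the first thing to pressure-test."""
    return min(LENSES, key=lambda lens: scores[lens])
```

The per-lens values are what a radar chart would plot; the single percentage is only useful for comparing ideas scored by the same people on the same scale.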

gstack: The Stage-by-Stage Toolkit

gstack is our execution layer: the CLI skills that turn the playbook's principles into action. Here's which gstack commands map to each stage:

Stage | gstack Command | Purpose
💡 Idea | /office-hours | YC-style forcing questions: biggest risk, simplest test, kill criteria
💡 Idea | /plan-ceo-review | Founder-mode plan review: rethink the problem, find the 10x path
💡 Idea | /design-consultation | Understand the product, research competition, propose design direction
🔧 MVP | /plan-eng-review | Lock in execution plan: architecture, scope, sprint sequence
🔧 MVP | /qa | Systematic QA testing: find bugs, fix them, capture evidence
🔧 MVP | /review | Pre-landing code review: security, quality, auto-fix
🔧 MVP | /ship | Detect base branch, merge, run tests, create PR
🔧 MVP | /context-save | Save session context: the 5-minute antidote to agentic tech debt
🚀 Launch | /canary | Post-deploy monitoring: watch live app, alert on regressions
🚀 Launch | /devex-review | Audit developer experience: time-to-hello-world, friction, docs
🚀 Launch | /design-review | Visual audit: spacing, consistency, responsive, accessibility
🚀 Launch | /benchmark | Performance regression detection
📈 Scale | /retro | Weekly retrospective: commit analysis, velocity, blockers
📈 Scale | /health | Code quality dashboard: test coverage, lint, type-check
📈 Scale | /document-generate | Auto-generate missing docs from code
📈 Scale | /document-release | Post-ship docs update: changelog, API docs, migration guides
All Stages | /investigate | Systematic debugging with root cause analysis
All Stages | /careful or /guard | Safety modes: destructive command warnings, full safety guardrails
Key Insight

The playbook says "agentic tech debt kills." gstack's /context-save and /context-restore are the operational implementation of that principle. Every session ends with /context-save. Every new session starts with /context-restore. No context drift.

The 4-Stage Flow

AI compresses quarters into weeks. These gates are the braking system that prevents you from building fast in the wrong direction.

  • 💡 Idea: "Is this worth building?"
  • 🔧 MVP: "What should we build first?"
  • 🚀 Launch: "Does the business deserve to grow?"
  • 📈 Scale: "Is it sustainable without me?"

💡 Stage 1

Validate Before You Build

The most important work happens before any code. AI makes it dangerously easy to skip this and jump to building. We don't.

What We Do

What We Do NOT Do

🚧 Exit Gate: Idea → MVP

Problem-Solution Fit

All 3 must be YES to advance:

  • Is the problem real and specific? Can you name exactly who experiences it, how often, how severely, and what they currently do about it?
  • Does your solution address the actual problem? Not the problem you assumed, but the one validation revealed.
  • Enough signal to justify building? Not certainty, but enough qualitative evidence that committing to MVP is reasoned, not faith.
⚔️ Adversarial Check (Mandatory Before Advancing)
  • Competitor neglect exercise: "Make the most compelling argument for why a competitor in this space would succeed while we do not." Accept the uncomfortable answer.
  • Confirmation bias check: Run War Room White Hat β€” what does the data actually say, stripped of narrative? If supporting evidence >> challenging evidence, ask: does that reflect reality, or what you hoped to find?
  • Disconfirming evidence search: Actively seek: failed competitors, negative market signals, structural obstacles, customer behavior patterns that contradict your thesis.
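
The gate logic above (all three criteria must be YES, and every adversarial check must have been run) can be sketched as a small function. This is an illustration under assumptions: `GateResult`, `evaluate_gate`, and the dictionary keys are hypothetical names, and the YES/NO answers still have to come from human judgment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    passed: bool
    blockers: tuple[str, ...]


def evaluate_gate(
    criteria: dict[str, bool],
    adversarial_checks: dict[str, bool],
) -> GateResult:
    """Pass only if every criterion is met AND every adversarial check was run."""
    blockers = [
        f"criterion not met: {name}" for name, met in criteria.items() if not met
    ]
    blockers += [
        f"adversarial check not run: {name}"
        for name, done in adversarial_checks.items()
        if not done
    ]
    return GateResult(passed=not blockers, blockers=tuple(blockers))
```

Returning the list of blockers, and not just a boolean, keeps the human checkpoint informed about exactly what is holding the stage back.
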
Anthropic Playbook Insight

"Ask AI to validate your startup idea and it will find supporting evidence; ask it to size your TAM and it will find the number that makes it look fundable. The antidote is the same tool, pointed in the opposite direction."

🔧 Stage 2

Build the Smallest Thing That Generates Real Evidence

The MVP stage is still an evidence-gathering exercise. We're gathering evidence about the solution, not the problem. Ship the smallest focused iteration that puts a real solution in front of real users.

What We Do

What We Do NOT Do

🚧 Exit Gate: MVP → Launch

Product-Market Fit

Genuine evidence, not launch energy. Pass at least ONE:

  • Sean Ellis Test: 40%+ of active users say they'd be "very disappointed" if the product disappeared.
  • Effort Test: Product pulls users in (post-PMF) instead of you pushing them (pre-PMF). You pass when outreach, incentives, and personal follow-up stop being necessary for retention.
  • Retention + Revenue: Users return without prompting. Day 7 and Day 30 retention hit pre-defined benchmarks. Revenue follows usage, not pushing.
⚔️ Adversarial Check (Mandatory Before Advancing)
  • False positive definition: Before looking at data, write down what a FALSE positive looks like for this product (signups without activation, revenue without retention, initial enthusiasm that fades by week 6).
  • Skeptic's audit: "What would a skeptic say about these numbers?" Run the numbers through Black Hat (War Room).
  • Scope audit: Has the product sprawled beyond its original scope definition? If every feature took an afternoon, how many "just one more thing" additions have accumulated?
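
The Sean Ellis test and the "pass at least ONE" rule above reduce to simple arithmetic. The sketch below assumes survey answers arrive as plain strings and that the effort and retention results are supplied as booleans by whoever ran those checks; the function names are hypothetical.

```python
from collections import Counter

# From the gate definition: 40%+ of active users answer "very disappointed".
SEAN_ELLIS_THRESHOLD = 0.40


def sean_ellis_share(responses: list[str]) -> float:
    """Fraction of surveyed active users who answered 'very disappointed'."""
    if not responses:
        raise ValueError("no survey responses")
    counts = Counter(answer.strip().lower() for answer in responses)
    return counts["very disappointed"] / len(responses)


def passes_sean_ellis(responses: list[str]) -> bool:
    return sean_ellis_share(responses) >= SEAN_ELLIS_THRESHOLD


def pmf_gate_passes(sean_ellis: bool, effort: bool, retention_revenue: bool) -> bool:
    """The MVP -> Launch gate requires at least one of the three signals."""
    return any((sean_ellis, effort, retention_revenue))
```

With 4 "very disappointed" answers out of 10, the share is 0.4 and the test passes; with 3 out of 10 it does not. Writing down the false-positive definition still comes first, because this calculation cannot tell enthusiasm from retention.
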
🚀 Stage 3

Prove the Business Deserves to Grow

MVP proved the product deserves to exist. Launch proves the business deserves to grow. Harden infrastructure, build operational systems, remove founder bottlenecks.

What We Do

What We Do NOT Do

🚧 Exit Gate: Launch → Scale

Repeatable, Hardened, Self-Running

All 3 must be TRUE:

  • Growth is repeatable & channel-driven. Unit economics are understood: CAC, LTV, payback period.
  • Product handles production workloads. Infrastructure hardened. Security & compliance in order. Reliability holds under real conditions.
  • Ops run without founder bottlenecks. Processes exist. Automation in place. The founder is not personally handling support, triage, sprint planning, or reporting.
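
For the unit-economics criterion above, one common simplified way to relate CAC, LTV, and payback period is shown below. The formulas assume a subscription-style business with constant monthly churn and a flat gross margin; they are a widely used approximation, not something the playbook prescribes, and the parameter names are illustrative.

```python
def ltv(arpu_monthly: float, gross_margin: float, monthly_churn: float) -> float:
    """Lifetime value: margin-adjusted monthly revenue over expected lifetime.

    Assumes constant churn, so expected lifetime in months is 1 / monthly_churn.
    """
    if not 0 < monthly_churn <= 1:
        raise ValueError("monthly_churn must be in (0, 1]")
    return arpu_monthly * gross_margin / monthly_churn


def payback_months(cac: float, arpu_monthly: float, gross_margin: float) -> float:
    """Months of margin-adjusted revenue needed to recover acquisition cost."""
    contribution = arpu_monthly * gross_margin
    if contribution <= 0:
        raise ValueError("monthly contribution must be positive")
    return cac / contribution


def ltv_to_cac(
    arpu_monthly: float, gross_margin: float, monthly_churn: float, cac: float
) -> float:
    if cac <= 0:
        raise ValueError("cac must be positive")
    return ltv(arpu_monthly, gross_margin, monthly_churn) / cac
```

For example, at $100 monthly revenue per user, 80% gross margin, 5% monthly churn, and $400 CAC, this gives an LTV of about $1,600, a payback period of about 5 months, and an LTV-to-CAC ratio of about 4. Computing these per acquisition channel is what makes the "channel-driven" part of the criterion checkable.
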
📈 Stage 4

Sustainable Without the Founder

The product is still central, but the founder's attention expands to the company itself. Build defensible moats, mature operations, and organizational infrastructure that withstands scrutiny.

What We Do

🎯 Exit Gate: Scale → Sustainable Business

Systematic, Auditable, Defensible

  • Systematic growth: Not founder-dependent. Auditable metrics.
  • Organizational maturity: Governance, compliance, financial controls withstand external scrutiny.
  • Defensible moat: "If a well-funded incumbent copied your product today, would your users stay?"

Typical exit forms: sustainable profitability, IPO-readiness, or acquisition.

How AI-Native Startups Fail

Failure Mode | Stage | What It Looks Like | Our Defense
Mistaking building for validating | Idea | Prototype becomes "proof" without user conversations | Stage gate: conversations are the evidence
Confirmation bias on steroids | Idea | AI finds evidence for whatever you ask it to validate | War Room adversarial check at every gate
Premature scaling | Idea/MVP | Building ahead of demand because it's easy | Scope definition before build; feature amendment criteria
Agentic tech debt | MVP | Codebase drifts: no coherent mental model, pieces never designed to fit together | CLAUDE.md / README updated every session; 5-min doc rule
False PMF | MVP→Launch | Launch energy confused with genuine retention | Sean Ellis 40% test, effort test, false positive definitions before data
Zero-friction scope creep | MVP | Every feature takes an afternoon, so why not? | Written scope: what it does NOT do + evidence threshold for additions
Founder bottleneck | Launch | Decisions pile up because only the founder can make them | Bottleneck audit: what stalls if you're gone a week?
Insecure by inexperience | MVP | Functional code ≠ secure code. No natural feedback loop for vulnerabilities. | Security review before any user touches it
Expansion before readiness | Launch | New markets kill PMF: too many variables, lose ability to read data | Nail original market first. Expand only with data.

Pivot Criteria

If evidence doesn't support the current direction after 3+ iteration cycles, run a diagnostic, not a hope cycle.

🔍 3-Iteration Diagnostic

Three Questions to Answer

  • Is there a segment responding differently? Often the right audience is already in your data, just underweighted.
  • Is the gap a positioning problem or a product problem? Adjust messaging vs. rebuild.
  • What would have to be true for PMF? Is that scenario realistic given what you're seeing?

Let the answers determine: adjust, pivot, or return to Idea stage.
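
The first diagnostic question, whether a segment is responding differently, can be checked directly against retention data. The sketch below assumes each user is recorded as a `(segment, retained)` pair; the `min_users` and `lift` cutoffs are arbitrary illustrative defaults, not values from the playbook.

```python
from collections import defaultdict


def retention_by_segment(users: list[tuple[str, bool]]) -> dict[str, tuple[int, float]]:
    """Map each segment to (user count, retention rate)."""
    totals: dict[str, int] = defaultdict(int)
    retained: dict[str, int] = defaultdict(int)
    for segment, is_retained in users:
        totals[segment] += 1
        retained[segment] += int(is_retained)
    return {seg: (totals[seg], retained[seg] / totals[seg]) for seg in totals}


def outperforming_segments(
    users: list[tuple[str, bool]],
    min_users: int = 5,   # assumption: ignore segments too small to read
    lift: float = 1.5,    # assumption: flag segments at 1.5x the overall rate
) -> list[str]:
    """Segments whose retention is at least `lift` times the overall rate."""
    if not users:
        return []
    overall = sum(int(is_retained) for _, is_retained in users) / len(users)
    if overall == 0:
        return []
    stats = retention_by_segment(users)
    return sorted(
        seg
        for seg, (count, rate) in stats.items()
        if count >= min_users and rate >= lift * overall
    )
```

A flagged segment is a prompt for user conversations, not a conclusion: with small samples a high retention rate can easily be noise, which is why the minimum segment size is part of the check.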

The Moat Playbook

At scale, the question is: "If a well-funded incumbent copied your product today, would your users stay?" Build moats deliberately from day one.

πŸ—οΈ

Accumulated Depth

Edge cases, domain logic, industry gotchas that competitors can't replicate. Every bug workaround, every 340B-style exception = a test case = a map of your moat.

🔗

Integration Depth

Workflows built on top of your product: automations, trained teams, connected data sources. Switching becomes an operational project, not a product decision.

📊

Data Flywheel

User behavioral signals compound improvement. Time-locked, context-specific, impossible to buy. A well-resourced competitor starting today can't replicate 2 years of usage data.

🧩

Workflow Lock-In

Prompts, automations, standardized outputs shaped around your product. The more integrations, the more surface area for building moat. APIs, webhooks, SDKs.

Sources & Integration

Source | What We Adopted | Where It Lives
Anthropic Founder's Playbook (May 2026) | Stage gates, adversarial validation protocol, agentic tech debt concept, measurement framework, moat playbook, scope definition discipline | GBrain: founders-playbook-2026
OPC Operating Model | M~=Vision, Zeus=MD+Hands+Devil's Advocate. Founder orchestrates, agents execute. | Core memory (L1)
4-Tier Memory Architecture | 4-layer knowledge persistence. GBrain → Core Memory → README → log.md. Duplication test. | Skill: 3-tier-memory
FEASY Room | 8-lens feasibility scoring + radar chart. Run at Idea gate. | Skill: opportunity-scorecard
War Room (6 Hats) | Full-spectrum perspective coverage. Run at every gate. | Skill: opc-war-room
SCAMPER Room | 7 mutation verbs for ideation. Post-validation, not pre-. | Skill: scamper-room
Brainstorming Hard Gate | No skipping to implementation. Always design before build. | Skill: brainstorming
Stage Gates | Explicit exit criteria per stage. Adversarial check mandatory. | Skill: startup-stage-gates
Verification Before Completion | No claiming done until evidence confirms. | Skill: verification-before-completion
The Compound Insight

Every project runs the same operating system. KNQX validates the gates, FEASY scores the ideas, War Room pressure-tests the decisions, GBrain remembers everything. The methodology gets sharper with every project, and that's the real moat for our OPC.