Ticket in. Shipped code out.
No PR queue. No diff review. The agent ships small, low-risk changes while you sleep — and surfaces the risky ones for you to look at.
You write the ticket
Sentence, paragraph, or just a screenshot. Drop it in GitHub Issues, ClickUp, or Specship.
Agent classifies risk + ships
Green/amber/red tier based on diff size, files touched, and your forbidden globs. Tests green + green tier = merged.
Plain-English changelog
No diff to read. No PR review queue. Just "Login flow shipped · Pricing page · Payment route needs your review."
The agent knows when to ask first.
Every diff gets graded before merge. Green ships itself. Amber asks for a glance. Red sits in your inbox until you sign off.
Green: Small, isolated, fully tested.
Card-generator graded green, no forbidden-glob matches, evaluator grade ≥ B, all CI checks green. Ships on those signals.
- Copy edit on the marketing site
- New input variant in the design system
- README + ENV.example update
Amber: Medium-impact, but contained.
Multi-file refactor or a new feature flag. You get a one-paragraph summary; merge it in a tap.
- New API endpoint behind a flag
- Extracted a hook into shared lib
- Bumped a non-breaking dependency
Red: Touches money, identity, or data.
Anything that could drop a row, charge a card, or change auth. Specship will not merge these — ever.
- Payment / billing routes
- Schema migrations
- Anything in your forbidden globs
Auto doesn’t mean unsupervised.
Founder Mode keeps you out of the easy stuff and firmly in the loop on the things that matter. The list below is a fixed minimum: you can add to it, but never remove from it.
- Payment, billing, or checkout routes
- Authentication, sessions, OAuth scopes, RBAC
- Database migrations / schema changes
- Anything matching your forbidden globs (.env, secrets/**, prod config)
- Anything the evaluator grades C or lower against the acceptance criteria
- Anything Gemini Code Assist or CodeRabbit flags as blocking
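For a feel of how the forbidden-glob item in the list above could work, here is a minimal sketch. It is not Specship's implementation: the function name `forbidden_matches` and the glob list are hypothetical, and the matching uses Python's standard `fnmatch` module, where `*` also matches `/`, so `secrets/**` covers nested paths.

```python
from fnmatch import fnmatchcase

# Hypothetical glob list, borrowed from the examples in the text above.
FORBIDDEN_GLOBS = [".env", "secrets/**"]

def forbidden_matches(changed_files, globs=FORBIDDEN_GLOBS):
    """Return the changed paths that match at least one forbidden glob."""
    return [
        path for path in changed_files
        if any(fnmatchcase(path, pattern) for pattern in globs)
    ]

hits = forbidden_matches(["README.md", "secrets/prod/api.key", ".env"])
print(hits)  # ['secrets/prod/api.key', '.env']
```

Any non-empty result would be enough to hold the change for your sign-off.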
feat(billing): add Stripe webhook handler for invoice.paid
Questions you’re probably going to ask.
What if the AI ships a bug?
Every merged change can be reverted with one click: Specship keeps the branch and the test diff for every PR. Founder Mode only ships diffs where tests went from red → green; if a regression slips past the test suite, it’s the same kind of bug a human review would have missed too. Roll back, comment, iterate. The agent treats your revert as a new ticket.
Can I turn this off per repo?
Yes. Founder Mode is per-repo and per-branch. You can leave it off for one repo, allow only the green tier on another, and run full auto on a third. Default for new repos is off — you have to explicitly opt in.
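One way to picture the per-repo, per-branch setting is a lookup that falls back to "off". This is only an illustration: the repo names, the mode strings, and the `founder_mode` function are made up for the example and are not Specship's actual config format.

```python
# Hypothetical settings keyed by (repo, branch); names and modes are illustrative.
POLICY = {
    ("acme/docs", "main"): "full",        # run full auto
    ("acme/web", "main"): "green-only",   # auto-merge the green tier only
}

def founder_mode(repo, branch, policy=POLICY):
    """Return the configured mode, defaulting to 'off' until someone opts in."""
    return policy.get((repo, branch), "off")

print(founder_mode("acme/docs", "main"))     # full
print(founder_mode("acme/new-repo", "main")) # off
```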
Does my code still get tested?
More than before. TDD is non-negotiable: the agent writes failing tests first, then implements. Every PR carries a coverage report — wire that into your CI gate to block merges below your threshold. Founder Mode does not skip steps; it just removes the human-review step on diffs that would have been a rubber-stamp anyway.
How does it decide what's low-risk?
Three signals: the card-generator’s risk tier (green / amber / red, set when the ticket is authored), your forbidden-glob list, and an LLM evaluator grade against the acceptance criteria. All three have to agree on "green" for auto-merge. Defaults are conservative: you can loosen them, but never past your own ceiling.
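Put together, the gate might look something like the sketch below. The function name, argument names, and the A-to-F letter scale are assumptions for illustration; the CI check is included because green-tier changes elsewhere on this page also require all CI checks to pass.

```python
# Assumed letter scale; "evaluator grade >= B" is read here as A or B.
PASSING_GRADES = {"A", "B"}

def can_auto_merge(card_tier, forbidden_hits, evaluator_grade, ci_green):
    """True only when every signal independently says the change is safe."""
    return (
        card_tier == "green"                    # risk tier set on the ticket
        and not forbidden_hits                  # no forbidden-glob matches
        and evaluator_grade in PASSING_GRADES   # graded against acceptance criteria
        and ci_green                            # all CI checks passed
    )

print(can_auto_merge("green", [], "A", True))        # True
print(can_auto_merge("green", [".env"], "A", True))  # False
print(can_auto_merge("amber", [], "A", True))        # False
```

A single dissenting signal is enough to stop the merge, so the check is a plain logical AND rather than a weighted score.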
What happens to tickets that depended on a merged one?
They unblock automatically. Auto-merged tickets fan out and re-queue every downstream ticket that was waiting on them — so a small queue of related work clears itself without you babysitting dependency order.
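The fan-out can be sketched as a small dependency check. The ticket IDs, the `waiting_on` mapping, and the `on_merge` function are hypothetical; the idea is only that a merge re-queues the downstream tickets whose dependencies are now all merged.

```python
def on_merge(ticket, waiting_on, merged):
    """Record a merge and return downstream tickets that are now unblocked.

    waiting_on maps a ticket id to the set of ticket ids it depends on.
    merged is the set of ticket ids merged so far (updated in place).
    """
    merged.add(ticket)
    return sorted(
        t for t, deps in waiting_on.items()
        if ticket in deps and deps <= merged and t not in merged
    )

waiting_on = {"T2": {"T1"}, "T3": {"T1", "T2"}, "T4": {"T9"}}
merged = set()

print(on_merge("T1", waiting_on, merged))  # ['T2']
print(on_merge("T2", waiting_on, merged))  # ['T3']
```

T3 stays queued after the first merge because it still depends on T2, and T4 never moves because T9 has not merged.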
Will Gemini Code Assist still review?
Yes. Founder Mode respects every external reviewer you wire up. If Gemini Code Assist, CodeRabbit, or a human teammate leaves a blocking comment, the merge pauses and the agent addresses each comment one by one, replying fixed or won’t-fix to each, before retrying.