Beta · Founder Mode is in private beta, invite-only for now.
Founder Mode

Stop reviewing PRs you can’t read.

Specship classifies every PR by risk. Low-risk + tests green → auto-merge. Anything risky → it pings you with a plain-English summary and waits.

app.specship.dev / changelog · this week
This week, in plain English
Specship · 4 merged · 1 paused for review · 1 in progress
Founder Mode
Login flow shipped
PR #214 merged · 3 tests added
2 min ago
Pricing page
PR #213 merged · 6 tests, 94% coverage
11 min ago
Payment route needs your review
PR #212 paused · touches /api/billing/charge
18 min ago
Avatar upload
PR #211 merged · 4 tests
1 h ago
Newsletter form (in progress)
SPEC-128 · drafting failing tests
now
Three steps

Ticket in. Shipped code out.

No PR queue. No diff review. The agent ships small, low-risk changes while you sleep — and surfaces the risky ones for you to look at.

01

You write the ticket

Sentence, paragraph, or just a screenshot. Drop it in GitHub Issues, ClickUp, or Specship.

02

Agent classifies risk + ships

Green/amber/red tier based on diff size, files touched, and your forbidden globs. Tests green + green tier = merged.

03

Plain-English changelog

No diff to read. No PR review queue. Just "Login flow shipped · Pricing page · Payment route needs your review."

Risk classification

The agent knows when to ask first.

Every diff gets graded before merge. Green ships itself. Amber asks for a glance. Red sits in your inbox until you sign off.

Green · auto-merge

Small, isolated, fully tested.

Card-generator graded green, no forbidden-glob matches, evaluator grade ≥ B, all CI checks green. Ships on those signals.

  • Copy edit on the marketing site
  • New input variant in the design system
  • README + ENV.example update
Amber · summary first

Medium-impact, but contained.

Multi-file refactor or a new feature flag. You get a one-paragraph summary; merge it in a tap.

  • New API endpoint behind a flag
  • Extracted a hook into shared lib
  • Bumped a non-breaking dependency
Red · always you

Touches money, identity, or data.

Anything that could drop a row, charge a card, or change auth. Specship will not merge these — ever.

  • Payment / billing routes
  • Schema migrations
  • Anything in your forbidden globs
When you’ll still get pinged

Auto doesn’t mean unsupervised.

Founder Mode keeps you out of the easy stuff. It keeps you firmly in the loop on the things that matter. The list below can’t be shortened: you can add to it, never take from it.

  • Payment, billing, or checkout routes
  • Authentication, sessions, OAuth scopes, RBAC
  • Database migrations / schema changes
  • Anything matching your forbidden globs (.env, secrets/**, prod config)
  • Anything the evaluator grades C or lower against the acceptance criteria
  • Anything Gemini Code Assist or CodeRabbit flags blocking
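
As a rough illustration of the "add to it, never take from it" rule for forbidden globs, here is a minimal Python sketch. The built-in patterns, the function names, and the use of Python's fnmatch matching are assumptions made for the example; Specship's actual glob syntax and internals are not described on this page.

```python
from fnmatch import fnmatchcase

# Hypothetical built-in patterns, modeled on the examples in the list above.
BUILT_IN_GLOBS = (".env", "secrets/**", "app/api/billing/**")

def effective_globs(user_globs=()):
    """Built-ins are always kept; user patterns can only extend the list."""
    return tuple(dict.fromkeys(BUILT_IN_GLOBS + tuple(user_globs)))

def files_needing_review(changed_files, user_globs=()):
    """Return the changed files that match any forbidden glob."""
    globs = effective_globs(user_globs)
    return [f for f in changed_files if any(fnmatchcase(f, g) for g in globs)]
```

With these assumed patterns, a diff touching app/api/billing/webhook.ts is returned for review even if everything else about the PR looks green.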
github.com / acme / app / pull / 212 · paused for review
PAUSED · NEEDS YOU · PR #212

feat(billing): add Stripe webhook handler for invoice.paid

Founder Mode classified this as red — touches app/api/billing/**.
Plain English: This PR adds a new webhook endpoint that updates invoice status when Stripe confirms payment. Tests pass, coverage is 96%, no schema change. But it’s on a billing path, so we paused. Merge or comment to iterate.
Honest answers

Questions you’re probably going to ask.

What if the AI ships a bug?

Every merged change can be reverted with one click: Specship keeps the branch and the test diff for every PR. Founder Mode only ships diffs where tests went from red → green; if a regression slips past the test suite, it’s the same kind of bug a human review would have missed too. Roll back, comment, iterate. The agent treats your revert as a new ticket.

Can I turn this off per repo?

Yes. Founder Mode is per-repo and per-branch. You can leave it off for one repo, allow only the green tier on another, and run full auto on a third. Default for new repos is off — you have to explicitly opt in.
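
A per-repo, per-branch setting with an off-by-default fallback could look something like the sketch below. The mode names ("off", "green-only", "full-auto"), the "*" wildcard, and the repo names are hypothetical, chosen only to mirror the three cases in the answer above.

```python
DEFAULT_MODE = "off"  # stated default: new repos stay off until you opt in

def mode_for(repo, branch, settings):
    """Most specific setting wins: branch, then repo-wide "*", then off."""
    per_repo = settings.get(repo, {})
    return per_repo.get(branch, per_repo.get("*", DEFAULT_MODE))

# Hypothetical settings for illustration only.
settings = {
    "acme/app": {"main": "green-only"},
    "acme/site": {"*": "full-auto"},
}
```

Here mode_for("acme/app", "dev", settings) falls through to "off", because only the main branch was opted in.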

Does my code still get tested?

More than before. TDD is non-negotiable: the agent writes failing tests first, then implements. Every PR carries a coverage report — wire that into your CI gate to block merges below your threshold. Founder Mode does not skip steps; it just removes the human-review step on diffs that would have been a rubber-stamp anyway.
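
Wiring coverage into a CI gate usually comes down to comparing one number against your threshold and failing the job when it falls short. A minimal sketch, assuming the coverage report exposes a single percentage (the report format is not specified here, and the function name is made up for the example):

```python
def coverage_gate(percent: float, threshold: float) -> int:
    """Return a process exit code: 0 lets the merge proceed, 1 blocks it."""
    return 0 if percent >= threshold else 1
```

A CI step would pass the result to sys.exit, so a PR at 94% coverage clears a 90% threshold and one at 89.9% fails the job.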

How does it decide what's low-risk?

Three signals: the card-generator’s risk tier (green / amber / red, set when the ticket is authored), your forbidden-glob list, and an LLM evaluator grade against the acceptance criteria. All three have to agree on "green" for auto-merge. Defaults are conservative; you can loosen them, but never past your own ceiling.
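
The "everything has to agree" rule can be pictured as a single boolean check. This is a sketch under assumptions, not Specship's implementation: the Signals type and field names are invented for the example, grades are assumed to be letters with A and B passing (per "evaluator grade ≥ B" in the green tier description), and CI status is included because the green tier also requires all checks green.

```python
from dataclasses import dataclass

PASSING_GRADES = {"A", "B"}  # assumed letter scale; C or lower always pings you

@dataclass(frozen=True)
class Signals:
    card_tier: str            # "green", "amber", or "red", set at ticket authoring
    forbidden_glob_hit: bool  # True if any changed file matches a forbidden glob
    evaluator_grade: str      # LLM evaluator grade against the acceptance criteria
    ci_green: bool            # all CI checks passed

def can_auto_merge(s: Signals) -> bool:
    """Auto-merge only when every signal independently says it is safe."""
    return (
        s.card_tier == "green"
        and not s.forbidden_glob_hit
        and s.evaluator_grade in PASSING_GRADES
        and s.ci_green
    )
```

Any single dissenting signal, such as an amber tier or a C grade, is enough to route the PR to you instead.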

What happens to tickets that depended on a merged one?

They unblock automatically. Auto-merged tickets fan out and re-queue every downstream ticket that was waiting on them — so a small queue of related work clears itself without you babysitting dependency order.
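
One way to picture the fan-out: each waiting ticket tracks the tickets it still depends on, and a merge removes itself from those sets and re-queues anything left with no blockers. The data shape, function name, and every ticket ID other than SPEC-128 are hypothetical.

```python
def unblock(merged_ticket, waiting_on):
    """Drop merged_ticket from each dependency set; return tickets now ready.

    waiting_on maps a ticket ID to the set of ticket IDs it still waits on.
    Tickets that become ready are removed from waiting_on.
    """
    ready = []
    for ticket, deps in waiting_on.items():
        if merged_ticket in deps:
            deps.discard(merged_ticket)
            if not deps:
                ready.append(ticket)
    for ticket in ready:
        del waiting_on[ticket]
    return ready

# Hypothetical queue: SPEC-130 needs both SPEC-128 and SPEC-129.
waiting = {"SPEC-129": {"SPEC-128"}, "SPEC-130": {"SPEC-128", "SPEC-129"}}
first = unblock("SPEC-128", waiting)   # SPEC-129 becomes ready
second = unblock("SPEC-129", waiting)  # SPEC-130 becomes ready
```

Merging SPEC-128 frees SPEC-129 right away, while SPEC-130 stays queued until SPEC-129 has merged too.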

Will Gemini Code Assist still review?

Yes — Founder Mode respects every external reviewer you wire up. If Gemini, CodeRabbit, or a human teammate leaves a blocking comment, the merge pauses and the agent addresses each comment one-by-one with per-comment fixed/won’t-fix replies before retrying.

Now in private beta

Stop writing tickets nobody picks up. Start shipping.

Join the waitlist — we’re onboarding a few teams a week. Builders only, no procurement decks.

No credit card · We’ll email you when you’re in · Unsubscribe any time