$ pnpm test
Running 142 test suites...
✓ Header component renders
✓ Auth flow passes
✓ Checkout validation
Status: PASS (142/142)
// Production 2 hours later:
Uncaught TypeError: Cannot read properties of undefined (reading 'price')
Your CI runs tests. Ours uses your app.
An agent uses your app like a real customer on every PR. You get a confidence score grounded in what actually broke.
We named it FirstCustomer because that's what it is — the first user to try every change, before any real one does.
Problem
Tests know assertions. They do not know the customer path.
Traditional CI can tell you that mocked fixtures pass. It cannot tell you that a pricing change broke a guest checkout path that real users hit every day.
> Running PR #4802 as persona: GuestUser
→ land / → add to cart → /checkout
△ Path failed at /checkout (step 3 of 4)
File: src/components/Checkout.tsx
Issue: 'price' undefined for guest session. Unit tests passed against a mocked data shape the live flow never produces.
Confidence Score: 94%
Status: BLOCK (revenue path broke)
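The gap between a mocked fixture and a live guest session can be sketched in a few lines of TypeScript. Everything named here is hypothetical (the `CartLine` shape, both `lineTotal` functions, and the sample data are illustrative assumptions, not code from the PR): the fixture always carries `product`, while the assumed guest-session line does not.

```typescript
// Hypothetical shapes for illustration only; not the real Checkout.tsx code.
interface Product { price: number }
interface CartLine { qty: number; product?: Product }

// Buggy version: assumes `product` is always present.
// The non-null assertion satisfies the compiler but does nothing at runtime.
function lineTotalUnsafe(line: CartLine): number {
  return line.product!.price * line.qty;
}

// Defensive version: reports missing data instead of throwing.
function lineTotalSafe(line: CartLine): number | null {
  return line.product ? line.product.price * line.qty : null;
}

// The shape a unit test might mock.
const mockedFixture: CartLine = { qty: 2, product: { price: 10 } };

// Assumed shape produced by a live guest session: no product attached.
const guestSessionLine: CartLine = { qty: 2 };

// Helper: true if calling `fn` throws a TypeError.
function throwsTypeError(fn: () => unknown): boolean {
  try {
    fn();
    return false;
  } catch (e) {
    return e instanceof TypeError;
  }
}
```

Against `mockedFixture` the unsafe function returns a total and a unit test stays green. Against `guestSessionLine` it throws a `TypeError` on reading `price`, which is the kind of failure only a run through the real flow would surface.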
How it works
The pipeline learns the app before it judges the PR.
Four steps, no new surface area: ingest the diff, project it onto a graph the agent built by using the app, run the affected flows as a real persona, and report the exact path that broke.
01 / Diff intake
Read the change as intent
FirstCustomer parses changed files, routes, components, API contracts, and test hints before CI turns the PR into a pass/fail binary.
02 / App graph
Map touched code to lived paths
An onboarding crawl drives the app like a real user — clicking, navigating, signing up — to build a persistent interaction graph: checkout flows, auth edges, billing detours, empty states, and integrations. The diff is then projected onto it.
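One way to picture the projection step, as a minimal sketch: assume the crawl records which source files each flow exercised. The `Flow` shape, the flow names, and every path except `src/components/Checkout.tsx` are hypothetical. Under that assumption, projecting a diff reduces to finding the flows that share at least one file with the change.

```typescript
// Hypothetical data model: each flow lists the files it exercised during the crawl.
interface Flow {
  name: string;
  files: string[];
}

// Return the names of flows that touch at least one changed file.
function affectedFlows(graph: Flow[], changedFiles: string[]): string[] {
  const changed = new Set(changedFiles);
  return graph
    .filter((flow) => flow.files.some((file) => changed.has(file)))
    .map((flow) => flow.name);
}

// Illustrative graph; a real one would be built by the onboarding crawl.
const sampleGraph: Flow[] = [
  { name: "guest-checkout", files: ["src/components/Checkout.tsx", "src/lib/price.ts"] },
  { name: "sign-in", files: ["src/components/Login.tsx"] },
];

const touched = affectedFlows(sampleGraph, ["src/components/Checkout.tsx"]);
console.log(touched);
```

A file-to-flow lookup like this is the simplest possible projection; a real graph would likely also track routes, API contracts, and external systems.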
03 / Persona run
Use the app as a real customer
A persona-driven agent boots the PR build in a real browser and walks the flows your diff touched — a confused first-time user hitting checkout, a returning customer with a stored card, an empty-cart edge case. The score is the output of execution, not static analysis.
04 / CI decision
Block only when it matters
Engineers get the exact path the agent took, where it broke, the persona that broke it, and the files that caused it. No new dashboard to babysit.
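A gate that blocks only on broken revenue paths could look roughly like the sketch below. The `PersonaResult` and `Decision` types and the `decide` function are assumptions made for illustration, not FirstCustomer's actual report format; the sample data mirrors the GuestUser run shown earlier.

```typescript
// Hypothetical result of one persona run.
interface PersonaResult {
  persona: string;
  steps: string[];
  failedStep: number | null; // 1-based index of the failing step, null if the run completed
  revenuePath: boolean;      // whether the flow is considered revenue-critical
  files: string[];           // files implicated in the failure
}

type Decision =
  | { status: "PASS" }
  | { status: "BLOCK"; reason: string; files: string[] };

// Block only when a revenue-critical flow failed; other failures do not gate the PR.
function decide(results: PersonaResult[]): Decision {
  const broken = results.find((r) => r.failedStep !== null && r.revenuePath);
  if (!broken || broken.failedStep === null) {
    return { status: "PASS" };
  }
  const step = broken.steps[broken.failedStep - 1];
  return {
    status: "BLOCK",
    reason: `${broken.persona} failed at ${step} (step ${broken.failedStep} of ${broken.steps.length})`,
    files: broken.files,
  };
}

const guestRun: PersonaResult = {
  persona: "GuestUser",
  steps: ["/", "add to cart", "/checkout", "enter card"],
  failedStep: 3,
  revenuePath: true,
  files: ["src/components/Checkout.tsx"],
};

const decision = decide([guestRun]);
console.log(decision);
```

The decision carries the persona, the failing step, and the implicated files together, so the CI comment alone is enough to start debugging.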
Personas
The same PR. Four customers. Different verdicts.
One pass/fail can't describe what a release does to real users. We run each PR as a cast — every persona has an intent, walks the flows your diff touched, and reports back what they actually experienced.
First-time guest
no account, mobile, in a hurry
Intent
browse → add to cart → guest checkout
- 01 lands on /
- 02 adds item
- 03 hits /checkout
- 04 enters card
Verdict
price undefined for guest session
Returning customer
logged in, stored card, desktop
Intent
reorder a previous item
- 01 signs in
- 02 reopens last order
- 03 one-click reorder
Verdict
completed in 2.4s, receipt sent
Coupon hunter
stacked promo codes, edge cases
Intent
apply expired + valid coupon together
- 01 adds item
- 02 applies CODE1
- 03 applies CODE2
- 04 checkout
Verdict
stacked coupons crash price resolver
Empty-cart wanderer
exploring, never commits
Intent
open checkout with nothing in cart
- 01 opens /cart
- 02 navigates /checkout directly
Verdict
untouched by this diff
The Graph
We map the reality of your codebase.
The graph links user flows to components, API boundaries, routes, and external systems. A diff that touches the checkout component is no longer a file change. It is a change to a revenue path.
App graph
One diff. One customer path. One gate.
Diagram labels: session contract, price resolver, PR #4802 touched, undefined price, revenue path, post-purchase
What it watches
Quiet infrastructure for teams that do not want another flaky gate.
Pricing
Join the early access.
We are onboarding a small number of high-velocity engineering teams that ship product through CI every day.