6 min read · DirtFleet team

The test architecture that fell out across 43 iterations

768 passing tests in two layers: contract tests that ship next to every route and mock Prisma + auth, plus DB-integration tests for the bugs unit tests structurally miss. What we deliberately don't have (browser E2E, snapshots, 100% coverage targets), and the one CI flake that taught us to dynamic-import route handlers inside test bodies.

43 iterations in, the DirtFleet test suite has 768 passing tests across 109 test files. The shape of the suite tells the story of the codebase: lots of tight contract tests that ship with the routes they protect, a smaller number of DB-integration tests that catch the bugs unit tests can't, and roughly zero end-to-end browser tests. Here's why.

The two layers we actually have

Contract tests (the bulk)

Most tests are the "here's a route, here's the shape it should return" flavor. They mock Prisma + the auth lib and assert that the route handler:

  • Returns the right status code for the right reason.
  • Passes organizationId into the query.
  • Forwards the right fields to the lib function it wraps.
  • Handles the documented error shapes (cross-tenant 404, validation 422 with fieldErrors, forgiving enums, etc.).

These tests live in tests/api/. Each route file usually gets one test file with 6-12 tests covering the documented contract. No DB is involved; vi.mock stubs everything below the route. They run in under a second per file.

The point is to catch the contract drifting, not to re-verify the Prisma layer. If GET /api/v1/assets used to return nextCursor and now doesn't, the contract test breaks. If the underlying Prisma query starts returning the wrong rows, that's the DB-integration test's job.

DB-integration tests (the safety net)

A smaller set of tests boot up a real Postgres against the test schema, seed fixtures, exercise the lib functions end-to-end, and assert against the actual DB state. These live in tests/lib/ and the few integration-style files under tests/db/.

These catch the bugs that mocked tests structurally can't: unique constraints firing the way you didn't expect, cascade deletes touching the wrong rows, Prisma quirks around connect / disconnect on nullable relations, findFirst with a compound where-clause picking the wrong row under concurrent writes. They take a few hundred ms each instead of a few ms, but the surface they cover is what actually runs in production.
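A sketch of the shape, assuming vitest and a real test Postgres behind @/lib/db; createRepairLog and the "Unknown asset" error are from the post, but the module path, the repairLog model name, and the fixture ids are illustrative:

```typescript
// Sketch only — assumes vitest, a test Postgres, and seeded fixtures
// for two orgs. The @/lib/repair-logs path and ids are made up.
import { describe, it, expect } from "vitest";
import { prisma } from "@/lib/db";
import { createRepairLog } from "@/lib/repair-logs";

describe("createRepairLog (DB integration)", () => {
  it("rejects a cross-tenant asset via a real FK lookup", async () => {
    await expect(
      createRepairLog({
        organizationId: "org_a",
        assetId: "asset-owned-by-org_b",
        notes: "swap hydraulic hose",
      }),
    ).rejects.toThrow("Unknown asset");

    // Assert against actual DB state, not a mock's call args.
    expect(await prisma.repairLog.count()).toBe(0);
  });
});
```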

What we deliberately don't have

Browser end-to-end tests

No Playwright, no Cypress, no automated "click through the signup flow." The trade-off is real: those would catch a bug or two that contract + integration miss. But they're expensive in three ways the others aren't — slow CI, flaky on edge cases that don't matter (hover timing, font load order), and high maintenance burden every time a UI element moves. For UI changes we test in a browser by hand before merging, and the contract tests guarantee the data layer behind the UI is sound.

This will probably change when we hit the first regression that browser tests would have caught and the team has the labor to maintain them. Until then: contract + integration + manual UI smoke.

Snapshot tests

No expect(thing).toMatchSnapshot(). They're a magnet for "update the snapshot because the output changed" PRs that ratchet through without anyone reading the diff. Explicit assertions force the test author to write down what they actually care about; the failure modes are clearer when a contract drifts.

100% coverage as a target

We don't track line coverage. We track contract coverage: does every documented response shape have an explicit test? Does every error code in /docs/api have a test that produces it? The coverage tool answers a question we don't care about; the documented-contract-asserted question is what we actually want.

What pays off

Mocking pattern that fell out by the third test file

Almost every API route test follows the same shape: mock @/lib/db (the Prisma client), mock @/lib/api-keys (auth), reset the mocks in beforeEach, and dynamic-import the route handler inside the test body (so the mocks apply before the route binds). By the third file we'd ironed out the boilerplate; by the tenth it was muscle memory. New routes get their tests scaffolded in 90 seconds.
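A sketch of that boilerplate; the mocked module shapes and the authenticate helper are assumptions, not DirtFleet's actual exports:

```typescript
// Sketch of the per-route boilerplate; mocked shapes are illustrative.
import { describe, it, expect, vi, beforeEach } from "vitest";

vi.mock("@/lib/db", () => ({
  prisma: { asset: { findMany: vi.fn().mockResolvedValue([]) } },
}));
vi.mock("@/lib/api-keys", () => ({
  authenticate: vi.fn().mockResolvedValue({ organizationId: "org_1" }),
}));

describe("GET /api/v1/assets", () => {
  beforeEach(() => vi.clearAllMocks());

  it("scopes the query to the caller's organization", async () => {
    // Dynamic import inside the test body, so the mocks apply
    // before the route binds its dependencies.
    const { GET } = await import("@/app/api/v1/assets/route");
    const res = await GET(new Request("http://test/api/v1/assets"));

    expect(res.status).toBe(200);
    const { prisma } = await import("@/lib/db");
    expect(prisma.asset.findMany).toHaveBeenCalledWith(
      expect.objectContaining({
        where: expect.objectContaining({ organizationId: "org_1" }),
      }),
    );
  });
});
```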

Errors that bubble

One test we write for every write endpoint: createRepairLog: 'unexpected error must bubble — not turned into a 422'. Lib functions throw Error("Unknown asset") on cross-tenant FK lookups; the route catches those specific messages and maps them to 422. Anything else has to bubble. Without the test, it's very easy to add a catch-all that swallows real errors and reports a misleading 422.
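The mapping rule can be sketched with a self-contained handler; handleWrite and the KNOWN_MESSAGES set are illustrative names, not DirtFleet's actual code:

```typescript
// Minimal sketch of the "map known lib errors to 422, let the rest
// bubble" rule. Names here are illustrative.
const KNOWN_MESSAGES = new Set(["Unknown asset"]);

type Result = { status: number; body: unknown };

function handleWrite(doWrite: () => void): Result {
  try {
    doWrite();
    return { status: 201, body: { ok: true } };
  } catch (err) {
    // Only the documented lib errors become a 422.
    if (err instanceof Error && KNOWN_MESSAGES.has(err.message)) {
      return { status: 422, body: { error: err.message } };
    }
    // Anything unexpected must bubble, so it surfaces as a 500
    // instead of a misleading validation error.
    throw err;
  }
}
```

The per-endpoint test then asserts both branches: a known message yields a 422, and an unrecognized error (say, a dropped DB connection) propagates instead of being swallowed.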

Test files near the code they test

Tests live next to the lib file (lib/foo.test.ts) or in a parallel tests/api/ tree mirroring app/api/. When you change a route, the test you'd break is one file over. Renaming or refactoring keeps the tests with the code by accident.

What we got wrong and fixed

Around iteration 12, a few of the API tests started importing the route handler at module load time (not inside the test body). That made the route's side-effect imports — Prisma, api-keys — run before the vi.mock calls registered. Symptom: tests pass locally, fail in CI on a clean module cache. Fix: every test file does const { GET } = await import("@/app/api/v1/foo/route") inside the test, after the mocks. Documented in the pattern.
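The broken and fixed versions differ only in where the import sits; a sketch of the pattern the post describes, with an illustrative route path:

```typescript
import { it, vi } from "vitest";

// Broken: a top-level import ran the route's side-effect imports
// (Prisma, api-keys) before the mocks were in place — passed on a
// warm local module cache, failed in CI on a clean one.
// import { GET } from "@/app/api/v1/foo/route";

vi.mock("@/lib/db");
vi.mock("@/lib/api-keys");

it("lists foos", async () => {
  // Fixed: import after the mocks are registered.
  const { GET } = await import("@/app/api/v1/foo/route");
  await GET(new Request("http://test/api/v1/foo"));
  // ...assert on the documented contract
});
```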

Tests are infrastructure. They aren't the product. The right amount of testing is the amount that makes you confident enough to ship without manually exercising every code path — and no more than that. 768 tests at 43 iterations means about 18 tests per iteration. That ratio is what we shoot for. Most come for free from the patterns; the rest are the ones we actually have to think about.

→ API reference · → API design retrospective · → Cross-tenant isolation