Agent Engineering · 26 min

What to build before you have a customer

Stage 1 is not the place for minimal architecture and not the place for future proof architecture. It is the place for additive architecture, six decisions that turn the Stage 2 transition into one migration rather than a rewrite.

It is a Friday in late spring. The room is small, the desk is a slab of birch, the lamp is the warm one, and the laptop screen is the only thing in the apartment loud enough to matter. The agent works. You have watched it run for the last three hours against a synthetic test set that you and your co founder built two weekends ago, and it has not embarrassed you. The retrieval is honest. The tool calls land. The product, such as it is, does the thing you said it would.

On the second monitor there is a Figma board with a schema you drew yourself last Tuesday with a stylus that ran out of battery halfway through. Five tables in pencil colored boxes. Foreign key arrows. A pgvector column on one of them, drawn slightly larger than the others, because you were proud of having added it. The env file is open in another window. OpenAI key, Postgres URL, and an Anthropic key your co founder added without telling you. Your coffee has been cold for an hour. You are shipping the agent next Wednesday. Your co founder is the only other person who will ever see the data. You have not named your first customer.

The question on the screen, and the only question that matters tonight, is small. What do you decide today that you will not have to undo when the second person who is not you logs into the system?

In the previous post I named four axes (tenant isolation, RBAC, agent secret separation, audit) and a five stage operational ladder. None of the four axes are load bearing at Stage 1. There is no second tenant for isolation to protect. There is no second role for RBAC to gate. There is no second human for secret separation to defend against. There is no second party to answer to for audit. So the data structure question collapses into a different problem, and the problem turns out to be the one sitting on the screen right now. Pick the smallest set of decisions you can make this Friday night that will make the Stage 2 transition cheap, and refuse the rest.

This is a piece about that small set. Six decisions, and a seventh if your stack carries a vector index. Each one is the kind of decision that, made the wrong way tonight, will show up nine months from now as a calendar entry titled "schema downtime" and a Slack message that begins with "so, about Tuesday."

The first decision, which is whether to add a column for a thing that does not yet exist

The cursor is on documents. Four columns. id, body, created_at, embedding. You move the cursor to the line after id and you stop.

The question is whether to add tenant_id. Right now there is one tenant. There is only ever going to be one tenant on Wednesday morning when the agent goes live. The column will hold the same value in every row for as long as you can see. A reasonable person would say the column is not earning its keep, and add it later when the migration is small enough to write in a sitting.

I think the reasonable person is wrong, and I think I can tell you exactly why.

Suppose Wednesday goes well. Suppose by August your second design partner has signed an NDA and asked, in a Zoom you should have been more prepared for, how their data is separated from your first design partner's data. The real answer you give is "let me get back to you on that," and the real cost is the next nine days. A tenant_id column added across every domain table. A backfill that assigns the existing rows to the founding tenant. A change to every query in the data access layer to filter on the new column. A migration window during which the agent has to be paused, because you cannot reason about partial rollouts of a column that the application code conditionally filters on. And then a second conversation with the design partner that begins with "so, about that question on Tuesday."

If the column were already there, populated with a sentinel value the application code already filters on, none of that nine days exists. The Stage 2 migration is one line. Stop accepting the sentinel as a valid value at write time. The schema is unchanged. The query layer is unchanged. The agent does not have to be paused. The conversation ends in a one paragraph email that says "yes, every row carries the tenant identifier, the policy enforcement turns on Friday, here is the migration diff."

So you type tenant_id uuid not null default '00000000-0000-0000-0000-000000000001'::uuid on the line after id. The cost tonight is one line in the migration file. The cost not paid in August is roughly two weeks of your life. This is the cheapest insurance in the small set, and you buy it before the coffee, which you got up and microwaved, is warm again.
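Here is a minimal sketch of the sentinel posture, assuming Python and an in-memory SQLite database standing in for Postgres. The function names and the reject_sentinel flag are hypothetical, chosen only to show that the Stage 2 change is a write-time check and not a schema change.

```python
import sqlite3
import uuid

# Sentinel tenant, copied from the column default in the text.
SENTINEL_TENANT = uuid.UUID("00000000-0000-0000-0000-000000000001")


def create_schema(conn: sqlite3.Connection) -> None:
    """Stage 1 shape: tenant_id exists and defaults to the sentinel."""
    conn.execute(
        f"""
        CREATE TABLE documents (
            id        TEXT PRIMARY KEY,
            tenant_id TEXT NOT NULL DEFAULT '{SENTINEL_TENANT}',
            body      TEXT NOT NULL
        )
        """
    )


def insert_document(conn, body, tenant_id=SENTINEL_TENANT, reject_sentinel=False):
    """Insert a row. Stage 2 is the same call with reject_sentinel=True."""
    if reject_sentinel and tenant_id == SENTINEL_TENANT:
        raise ValueError("a real tenant_id is required from Stage 2 onward")
    doc_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO documents (id, tenant_id, body) VALUES (?, ?, ?)",
        (doc_id, str(tenant_id), body),
    )
    return doc_id


def documents_for(conn, tenant_id):
    """Reads filter on tenant_id even while every row shares one value."""
    rows = conn.execute(
        "SELECT body FROM documents WHERE tenant_id = ? ORDER BY body",
        (str(tenant_id),),
    )
    return [body for (body,) in rows]
```

In this sketch the table definition and the read path are identical before and after the transition. The only thing that moves is whether the write path still tolerates the sentinel.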

The principle the decision rests on is the one that will run underneath every other decision in this post. At Stage 1 the right architecture is not minimal and not future proof. It is additive. Every choice you make tonight should make the Stage 2 migration a column add, a policy turn on, or a new table next to existing tables. Never a restructuring. If a decision tonight requires the future you to restructure, it failed.

There is a corollary. Every Stage 1 decision is one of three things. A primary key choice. A column you name even though it is currently constant. A deferral you write down so the next migration knows it is there. That is the entire surface area. Six decisions because three of them are columns, two are deferrals dressed as design moves, and one is the primary key itself.

Stage 1 done well is one migration away from Stage 2, not one rewrite away.

The second decision, which is the shape of the primary key

The cursor moves up one line. id. The Postgres default is a bigserial, an auto incrementing integer that ticks up monotonically every time a row gets inserted. Your driver supports it out of the box. The integers are short, debuggable, and they sort the way humans expect them to sort. There are blog posts a decade old that will tell you they are fine.

I think the blog posts are reading the wrong problem. The Stage 1 cost of an integer primary key is zero. The Stage 2 plus cost is two distinct things, and both of them show up on a calendar.

The first cost is collisions. The day your first paying customer asks for an import of their existing data, the IDs in their export will, in some unpredictable subset of cases, collide with the IDs you have already issued. A merge with two existing customers is the same problem, only now the collisions are on rows that have foreign key dependencies you have to chase across nine tables. The second cost is leakage. Every URL that exposes an ID tells the outside world how many records of that kind you have ever created. Your competitor signs up for a free trial in March, signs up again in October, subtracts the two IDs they see in the URL, and now they know your weekly growth rate. Neither cost is theoretical.

The fix at Stage 2 is a primary key migration. You change the column type. You update every foreign key reference. You issue new IDs for every existing row and rewrite every dependent row to point at the new value. You schedule a maintenance window for the events table, because that table will be the largest and the migration will lock it. You write a postmortem afterward, because something will go wrong.

The fix at Stage 1 is to type uuid instead of bigserial. That is the whole fix. The cost is one different word in the column definition and a slightly larger storage footprint (16 bytes per key instead of 8) that you will not notice for the next two years.

The follow up question is which UUID. The cleanest answer in 2026 is UUID v7, which was finalized in RFC 9562 in May 2024. v7 IDs are lexicographically sortable, time ordered, silent about row counts (though they do embed a creation timestamp), and collision safe across replicas and merges. The B tree index behavior is much closer to a monotonic integer than v4's random insertion pattern, which means your index pages do not fragment the way they do under v4 once you cross a few million rows. If your database or driver does not yet generate v7 natively, v4 is acceptable and the migration to v7 later is bounded. The decision that matters tonight is UUID, not integer. v4 versus v7 is a follow up choice, made when your stack tells you it is ready.

You write id uuid primary key default gen_random_uuid_v7(). The function name might be wrong for your Postgres version. You make a note. Tomorrow morning you will look it up. (For the record, PostgreSQL 18 names its built in generator uuidv7(), and gen_random_uuid() on older versions returns a v4.) What matters tonight is that the column is a UUID, and from Wednesday forward every new row issued by every new table you create on top of this schema will inherit the same posture without you having to think about it again.
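If neither the database nor the driver generates v7 yet, the IDs can be minted in application code. What follows is a minimal sketch of the RFC 9562 bit layout in Python, assuming millisecond timestamps and no monotonic counter within a millisecond. The function name uuid7 and its unix_ms parameter are my own, not a library API you should expect to find.

```python
import os
import time
import uuid


def uuid7(unix_ms=None):
    """Build a UUID v7: 48-bit Unix ms timestamp, version, variant, random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = rand >> 68                # top 12 random bits
    rand_b = rand & ((1 << 62) - 1)    # low 62 random bits

    value = (unix_ms & ((1 << 48) - 1)) << 80  # bits 127..80: timestamp
    value |= 0x7 << 76                         # bits 79..76: version 7
    value |= rand_a << 64                      # bits 75..64: rand_a
    value |= 0b10 << 62                        # bits 63..62: RFC variant
    value |= rand_b                            # bits 61..0: rand_b
    return uuid.UUID(int=value)
```

Because the timestamp occupies the most significant bits, IDs minted in different milliseconds sort in creation order, both as integers and as strings. Two IDs minted in the same millisecond sort arbitrarily in this sketch.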

The third decision, which is whether to write down what the agent did

The cursor moves down. Below documents. Below the embeddings table you are about to add. You create a new table called events. Seven columns. id uuid, tenant_id uuid, actor_kind text, actor_id uuid, action text, payload_jsonb jsonb, created_at timestamptz.

The temptation here is to think bigger than the table needs to be tonight. There is a literature on audit logs. There is a literature on event sourcing. There is a literature on append only stores and hash chains and immutable ledgers, and most of it is correct about the systems it is written for. None of it is correct for the system on your screen tonight, because the system on your screen tonight has one operator, one tenant, and zero auditors.

What the table needs to be tonight is a regular Postgres table that you write to whenever the agent does something meaningful. Meaningful is roughly defined as any tool call that lands a side effect. Retrieval calls do not go in the events table. Model calls do not go in the events table. The drafted output the agent showed you but did not send to a real recipient does not go in the events table. The moment the agent sends an email, books a meeting, writes a row to your domain tables, files a refund, mutates the world outside its own working memory, an event row gets written. That is the rule.

The reason this matters tonight, and not in March or July, is that the Stage 4 audit table is this table. When the first regulated customer signs at Stage 4 and asks for a hash chained, append only, non repudiable log of every action your agent has taken on their behalf, the work that turns this Stage 1 table into that Stage 4 table is additive. A hash column. An append only constraint. The actor_kind text column tightened into a typed enum that names agent as a first class actor distinct from user. Same table. The rows written between this Friday and that Stage 4 conversation are what give the audit story credibility. Three months of organic write history keeps an auditor's eyebrows where they belong. An empty table, populated for the first time the week of the audit, is what makes them lift.
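To make "additive" concrete, here is a sketch of what the Stage 4 hash column could compute over rows that already exist. It is not something to build tonight. SHA-256, the JSON canonicalization, and the all zero genesis value are my assumptions, not a scheme the text prescribes.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first row's predecessor


def chain_hash(prev_hash: str, event: dict) -> str:
    """Hash one event together with the hash of the row before it."""
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()


def add_hashes(events):
    """Compute the hash column for existing rows, oldest first."""
    hashes = []
    prev = GENESIS
    for event in events:
        prev = chain_hash(prev, event)
        hashes.append(prev)
    return hashes


def verify(events, hashes):
    """A changed or reordered row breaks every hash from that point on."""
    return add_hashes(events) == hashes
```

The rows written since Stage 1 are the input. The hash column is computed over them, not in place of them, which is why the table has to exist first.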

If you do not start the table tonight, the Stage 4 work is not additive. It is the table itself, which means a backfill, which means a reconstruction effort that pulls from log files written by services you may no longer be running. That kind of work produces a presentation to the security committee titled "what we are doing to improve our audit posture," and the security committee has heard that presentation before.

You write the migration. You add one line to the agent's main action loop that inserts into events whenever the action completes. You set actor_kind to the literal string agent for now. You set actor_id to the same sentinel value you set tenant_id to. You will tighten both at Stage 4. The table works. The first ten rows show up while you are still finishing the migration, because the agent is running in the other terminal and your co founder kicked off a test. You take a screenshot of the first ten rows and send it to your co founder with no caption, because the screenshot is the entire point.
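The one line in the action loop might look something like the sketch below, again with SQLite standing in for Postgres. The tool names in SIDE_EFFECT_TOOLS and the run_tool dispatcher are hypothetical. The only part taken from the text is the rule that side effects get a row and everything else does not.

```python
import json
import sqlite3
import uuid

SENTINEL = "00000000-0000-0000-0000-000000000001"

# Hypothetical tool names. Only tools that change the outside world are listed.
SIDE_EFFECT_TOOLS = {"send_email", "book_meeting", "file_refund"}


def create_events_table(conn: sqlite3.Connection) -> None:
    conn.execute(
        """
        CREATE TABLE events (
            id            TEXT PRIMARY KEY,
            tenant_id     TEXT NOT NULL,
            actor_kind    TEXT NOT NULL,
            actor_id      TEXT NOT NULL,
            action        TEXT NOT NULL,
            payload_jsonb TEXT NOT NULL,
            created_at    TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )


def record_event(conn, action, payload, tenant_id=SENTINEL,
                 actor_kind="agent", actor_id=SENTINEL):
    """Write one event row. Sentinels stand in for real IDs at Stage 1."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO events (id, tenant_id, actor_kind, actor_id, action, payload_jsonb) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (str(uuid.uuid4()), tenant_id, actor_kind, actor_id, action,
             json.dumps(payload)),
        )


def run_tool(conn, name, args, tools):
    """Run a tool, then log it only if it landed a side effect."""
    result = tools[name](**args)
    if name in SIDE_EFFECT_TOOLS:
        record_event(conn, name, {"args": args, "result": result})
    return result
```

Retrieval and model calls pass through run_tool without leaving a row, which keeps the table small enough to read by eye during Stage 1.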

A short detour into the failure mode that lives on the other side of this Friday

Before the next decision, I want to put a sibling scene on the table. Around the time you finish the third decision, the question of whether to keep going or to stop will start to feel pressing, and the wrong answer to it is a failure mode I want you to be inoculated against.

Imagine a different Friday night in a different apartment. Two founders in Brooklyn. The technical co founder has read the previous post and this one, plus three other posts and two threads on Hacker News, and she has internalized the framing that getting Stage 2 right is the difference between a clean Stage 3 and a chaotic one. She is right about that. What she does next is the part that costs her.

She decides to build for Stage 2 now. The first week goes to a tenants table with seven columns, because somewhere on the internet she read that real multi tenant systems have at least seven columns. Display name, slug, plan, billing entity ID, created at, updated at, status. The second week goes to a roles table. Admin, member, viewer, billing. The third week goes to a permissions framework that joins users to tenants to roles to capabilities, with a check function the data access layer calls before every read and write. The fourth week goes to row level security policies that mirror the application layer checks. The fifth week goes to a secrets dereference layer that fronts every tool call, because she has read the Stage 4 chapter as well, and she wants to be ahead of it.

Ten weeks pass. The agent has been running the whole time against a synthetic test set. The product, narrowly defined, has not advanced. The data architecture, broadly defined, is a thing of beauty. They demo it to an advisor in week nine. The advisor says it looks great.

In week eleven, the first real customer signs. The customer is not who they expected. It is a marketing operations team at a Series B SaaS company, and it uses the word "workspace" to mean "team account that contains people, integrations, and campaign artifacts." It does not use the word "tenant" at all. The first question, asked in the kickoff call, is whether the system can support a single user belonging to two workspaces with different roles in each one. The answer is no. The tenants table assumes a user belongs to one tenant. The roles table assumes a role is global per user, not per workspace. The permissions framework has to be rewritten. The policies have to be rewritten on top of the rewritten framework. The secrets dereference layer holds up, but it was also the cheapest of the five things, and the wrong week was spent on it.

What happened in Brooklyn was not a failure of intelligence or of taste. The technical co founder is a strong engineer. What happened was that she designed for a fiction. A tenant model derived from architecture posts she had read could not survive contact with a real customer whose mental model of how teams should work had been shaped by ten years of using Slack and Asana and Linear. She could not have known. The only way to know was to put the product in front of the customer, and the customer did not exist for ten weeks because the architecture was being built for them.

The cost of the ten weeks is not the ten weeks. The cost is that the schema now has to be rewritten while a real customer is asking questions about it. The Brooklyn team will spend the next three weeks rebuilding architecture they have already built once, while fielding questions from a customer who can tell something is wrong without being able to say what. The customer will sign anyway. The customer will also be the first reference call, and the reference call will go differently than it would have gone if the schema had survived.

The Stage 1 decisions on your screen tonight do not get you to Stage 2 alone. They get you to the conversation with the customer who will define what Stage 2 actually means for your product, and the conversation has to happen before the architecture solidifies, not after. The small set is the smallest set of decisions that makes that conversation cheap to act on. The Brooklyn team picked the small set and kept going, and they paid for the next ten weeks twice.

Back to your apartment.

The fourth decision, which is whether to enforce or to rehearse

It is now ten past eleven. The events table is writing rows. The cursor is back in the data access layer.

You have a choice about where to put the tenant filter. The principled answer is row level security in Postgres, which lets the database enforce the policy regardless of what the application code does. The pragmatic answer is to put the filter in the application code itself, in a function that every query passes through, and to leave row level security off until the migration to Stage 2.

I think the pragmatic answer is the right one at Stage 1, and the reasoning is specific enough that I want to walk it.

Row level security at Stage 1 has no enforcement target. There is one tenant. The policy, if you wrote one tonight, would be "the tenant_id on the row equals the tenant_id in the session," and the tenant_id in the session is always the same sentinel value. The policy does no work. It does, however, do harm. Every query that returns the wrong number of rows in development now has to be debugged through the lens of "is this a policy failure or a query failure," which doubles the diagnostic surface for a class of problems that has no real enforcement value to compensate for it. Some queries will hit edge cases where the policy and the planner interact in surprising ways, and you will spend a Wednesday tracking down a slow query that turns out to be a policy expression being re evaluated per row. You will pay this debugging tax for the entire Stage 1 period, on behalf of an enforcement layer that is not enforcing anything yet.

The application layer filter, in contrast, costs nothing tonight and earns its keep starting now. You write a data access function (call it q) that takes a tenant_id, a query, and parameters, and runs the query with the tenant_id stitched into the where clause. Every read in the application goes through q. Every write goes through q. There are no exceptions. The application code develops the muscle memory of always carrying the tenant_id through the query path, even though every value is currently the sentinel, even though the muscle memory does no enforcement work tonight.
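One possible shape for q, sketched in Python against SQLite's named parameters. The substring check for :tenant_id is a deliberately crude guard of my own. It is not a parser, and the real function would fit whatever query builder you already use.

```python
import sqlite3


def q(conn, tenant_id, sql, params=None):
    """Run a statement only if it is scoped to a tenant.

    The caller writes `:tenant_id` in the SQL; q binds the value, so the
    tenant can never be forgotten or supplied through another path.
    """
    if ":tenant_id" not in sql:
        raise ValueError("every query must filter or write on :tenant_id")
    bound = dict(params or {})
    if "tenant_id" in bound:
        raise ValueError("tenant_id is passed to q directly, not inside params")
    bound["tenant_id"] = str(tenant_id)
    with conn:  # commit writes, roll back on error
        return conn.execute(sql, bound).fetchall()
```

A call looks like q(conn, tenant, "SELECT body FROM documents WHERE tenant_id = :tenant_id"). A query that forgets the placeholder fails in development, on your laptop, which is the cheap place for it to fail.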

At the Stage 2 transition the row level security policy gets turned on as defense in depth. The application layer filter does not go away. The two filters now coexist, and the row level security policy is the safety net for the case where someone in three years writes a query that bypasses q for reasons that seemed good at the time. The query patterns in the application code do not change at the Stage 2 transition, because they have been filtering on tenant_id all along. The Postgres row level security docs are the right reading when you get to that transition. Tonight they are reference material for a layer you are deliberately not turning on yet.

The shape of the principle here will repeat in the sixth decision. You design tonight for the discipline. You enforce later when there is something to enforce against. The discipline is in the function signature, in the column existence, in the muscle memory of the application code. The enforcement is the work that lights up when a second party arrives whose interests are not aligned with the first. There is no second party tonight. The discipline is what you ship.

A second detour, into the symmetric failure mode

I owe you the under build before the next decision lands. The moves I have asked you to make so far have been mostly additive, which means the failure mode of doing too little is harder to see on a Friday night when the deadline is Wednesday. It is the one I have made myself.

Imagine a solo founder in Lisbon. He ships an agent in three days from a coffee shop near the river. Two tables. users and documents. UUIDs felt too long, so primary keys are auto increment integers. There is no tenant_id anywhere, because he is the only user and adding the column felt like premature abstraction. There is no events table, because he is watching the agent run in a terminal next to the chat window and he can see what it does in real time. The product works. He posts about it. Two design partners reach out. He pitches both, and one of them, a small consultancy in Madrid, signs an NDA in the second week of the second month and asks if they can start putting their client data into the system.

He says yes. They start. Three weeks later, the second design partner signs, and the second design partner is a competitor of the first. The Madrid consultancy finds out through a casual conversation at a conference, not through any leakage in the system, and they ask by email whether their data is separated from the competitor's. The real answer he gives is "let me get back to you on that," which puts him in the position the opening scene was designed to avoid.

The next four weeks happen. He adds tenant_id to nine tables, because the schema has grown since the first three days. He writes the backfill. He hits the primary key collision problem the moment he tries to script the import of the second design partner's existing data, because the IDs the consultancy uses internally overlap with the IDs his system has already issued. He writes a second migration that maps the colliding IDs to new ones, which means he also has to rewrite the foreign keys across three tables that reference each other. He pauses the agent for forty minutes during the cutover. He sends an email to both design partners that begins with "as part of preparing the system for multi tenant operation," which is the kind of email that is true and also tells the reader something has been wrong until now.

The Lisbon founder did nothing technically wrong at Stage 1. He shipped fast. He validated the product. He found two design partners in two months, which is faster than most founders find one. The cost is that the four weeks of migration work happened while the agent was already live in front of design partners, and that the email he sent on the Friday afternoon of the cutover is the kind of email that costs trust in a way that is hard to measure but easy to feel. The technical work was bounded and finite. The trust cost is not.

His mistake was the symmetric mirror of the Brooklyn team's. He picked too few of the small set and shipped. They picked too many and shipped ten weeks late. Both mistakes resolve to the same answer, which is the one you are halfway through implementing tonight. Pick the small set. Stop.

Back to your apartment, where the lamp is now the only light source because the sun went down two hours ago and you forgot to turn on the overhead.

The fifth and sixth decisions, which are about not building things you might otherwise build

The fifth decision is the easiest one to get wrong, because it is a decision not to do something the rest of the internet is telling you to do.

You use one database. Postgres. That is the whole decision. You do not stitch together Postgres plus a vector store plus a graph database plus a queue plus a key value cache because the vendor blogs say modern agent stacks have all five. Stage 1 needs one database. The vector concerns ride on a pgvector column inside Postgres until measured pressure says otherwise. The queue concerns ride on a Postgres table with a claimed_at column and a worker that polls it, until measured pressure says otherwise. The cache concerns are deferred entirely, because there is no traffic to cache against yet.
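The queue on a table can be as small as the sketch below, which uses SQLite as a stand-in and hypothetical names throughout. On Postgres the usual way to make the claim safe under concurrent workers is SELECT ... FOR UPDATE SKIP LOCKED. Here the guarded UPDATE plays that role.

```python
import sqlite3
import time
import uuid


def create_jobs_table(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE jobs (id TEXT PRIMARY KEY, payload TEXT NOT NULL, claimed_at REAL)"
    )


def enqueue(conn, payload):
    job_id = str(uuid.uuid4())
    with conn:
        conn.execute("INSERT INTO jobs (id, payload) VALUES (?, ?)", (job_id, payload))
    return job_id


def claim_next(conn, now=None):
    """Claim the oldest unclaimed job, or return None if there is none."""
    now = time.time() if now is None else now
    with conn:
        row = conn.execute(
            "SELECT id, payload FROM jobs WHERE claimed_at IS NULL "
            "ORDER BY rowid LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        job_id, payload = row
        # The IS NULL guard means a job claimed by another worker between
        # the SELECT and the UPDATE is not claimed twice.
        cur = conn.execute(
            "UPDATE jobs SET claimed_at = ? WHERE id = ? AND claimed_at IS NULL",
            (now, job_id),
        )
        if cur.rowcount != 1:
            return None
    return job_id, payload
```

A worker is then a loop that calls claim_next, does the work, and sleeps when it gets None. That is the whole queue until measured pressure says otherwise.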

The Stage 2 payoff of one database is mechanical and slightly dull, which is appropriate to the kind of decision it is. Backups are one job that runs against one connection. Migrations are one tool against one schema. Transactional boundaries are real, which means the events table write and the domain table write can happen in the same transaction without an outbox pattern wedged between them. Tenant scoped queries all live in one place, which means the data access layer is one function and not five. When the first design partner asks how data is isolated, the answer is one connection and one schema, not a topology you have to draw on a whiteboard.

The temptation against the fifth decision is strongest when you are also reading the vendor literature for the dedicated vector stores. The literature is correct that pgvector has scaling ceilings and that a dedicated store will outperform it at large index sizes. It is wrong, in your case, about that workload existing on a Friday night before you have shipped to a second person. The move to a dedicated store is a Stage 3 plus decision that earns its keep when the existing database is the named bottleneck on a named retrieval flow. Tonight it would be a topology committed to before you knew what your queries looked like. You leave it.

The sixth decision is the one that is hardest to remember because it is about the shape of a function signature and not the shape of a schema.

Every tool the agent can call should accept a client supplied request ID in its input schema. The implementation does not need to look at it. The discipline is that the field is on the input, and that the agent generates one for every tool call. The reason is the same as the tenant_id column. Adding the field to the signature later is a coordination problem across every caller, which includes the agent's planner, the retry logic, the test harness, and any human interfaces you have wired up. Adding the field tonight is one entry in the tool's JSON schema.

At Stage 3, when paying customers exist and the cost of a duplicate refund is real, the enforcement layer reads the ID and rejects duplicate requests within a window. The pattern is described well in Stripe's writeup on idempotency keys, which is the cleanest treatment of why the request ID belongs on the request rather than inferred from a hash of the payload. The Stage 1 move is that the field is on the request now, so the enforcement layer in 2027 does not have to ask the planner for a field the planner does not generate.
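A sketch of both halves, assuming a hypothetical file_refund tool. The Stage 1 part is the schema entry and the habit of generating an ID per call. The EnforcedTool wrapper is what Stage 3 might add later, shown here as an in-memory reject-within-a-window check and nothing more durable than that.

```python
import time
import uuid

# Hypothetical tool definition. The Stage 1 move is the request_id property.
REFUND_TOOL_SCHEMA = {
    "name": "file_refund",
    "input_schema": {
        "type": "object",
        "properties": {
            "request_id": {"type": "string", "description": "Client supplied, one per intended action"},
            "order_id": {"type": "string"},
            "amount_cents": {"type": "integer"},
        },
        "required": ["request_id", "order_id", "amount_cents"],
    },
}


def new_tool_call(arguments: dict) -> dict:
    """The caller mints the ID once; a retry reuses the same dict."""
    return {"request_id": str(uuid.uuid4()), **arguments}


def file_refund(request_id, order_id, amount_cents):
    """Stage 1 implementation: accepts request_id and does not look at it."""
    return {"order_id": order_id, "refunded_cents": amount_cents}


class DuplicateRequest(Exception):
    pass


class EnforcedTool:
    """Stage 3 sketch: reject a request_id seen within the window."""

    def __init__(self, fn, window_seconds=86400.0):
        self._fn = fn
        self._window = window_seconds
        self._seen = {}  # request_id -> time first accepted

    def __call__(self, request_id, now=None, **kwargs):
        now = time.time() if now is None else now
        last = self._seen.get(request_id)
        if last is not None and now - last < self._window:
            raise DuplicateRequest(request_id)
        self._seen[request_id] = now
        return self._fn(request_id=request_id, **kwargs)
```

Nothing upstream changes when the wrapper arrives, because the planner, the retry logic, and the test harness have been sending request_id since Stage 1.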

The shape of both decisions is the shape of the deferral, which is the third thing in the surface area. Some Stage 1 decisions are not what you build. They are what you write down so the next migration knows it is there. The single database is a deferral of the polyglot stack. The request ID is a deferral of the idempotency enforcement. Both deferrals cost almost nothing tonight, and both make the Stage 2 plus migration into a thing you do on top of a foundation rather than a thing you do to a foundation that is being moved.

The small set, on a single screen

You step back. You make a coffee that you intend to drink this time. The schema on the Figma board now has six small additions in pencil colors that match the rest. You take a screenshot of it and look at the six decisions as a set.

Figure: The six Stage 1 decisions, at a glance. Each card names the decision and the Stage 2 payoff. Detail in the sections below.

01 One database, not many. One connection, one schema, one place to back up.
02 tenant_id on every domain table. Sentinel default today, no backfill at Stage 2.
03 UUID v7 primary keys. No PK migration when tenants merge or replicate.
04 An events table from day one. Stage 4 audit is additive against real write history.
05 App layer tenant filtering, no RLS yet. Query patterns do not change when RLS arrives at Stage 2.
06 Re-runnable tool calls, no enforcement yet. Request ID on the signature, enforcement waits for Stage 3.

An optional seventh applies if you ship a vector index: tenant_id on the index metadata, same posture as 02.

The six cards contain the discipline of what you wrote tonight and the Stage 2 payoff each piece of the discipline buys. They fit on a single screen in a font someone in a meeting two years from now can read at a glance. The next architect of this system, whether that is a future version of you or a senior engineer you have not yet hired, will read them first.

The optional seventh, which applies if you are shipping a vector index

If you are shipping a vector index, and you are, because the embedding column has been part of the schema since the first weekend, the seventh decision is to write tenant_id into the index metadata on every upsert.

If the index is a pgvector column on the same Postgres, the tenant_id column is already there and you have nothing further to do tonight. The retrieval policy at Stage 2 is a where clause filter on tenant_id, the same as every other query. If the index is a dedicated store (Pinecone, Weaviate, Qdrant), the metadata schema for every vector gets a tenant_id field populated with the sentinel value, the same posture as the tenant_id column on your domain tables. Every upsert from now writes the field. Every retrieval from Stage 2 forward filters on it. The Stage 2 work is the same as the column's Stage 2 work, which is the change from accepting the sentinel as valid to rejecting it. No re indexing. No metadata backfill.
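For the dedicated store case, the discipline can be sketched with an in-memory stand-in. The class below is not Pinecone's, Weaviate's, or Qdrant's client API. It only illustrates the rule that an upsert without tenant_id in its metadata is refused and that retrieval filters on the field.

```python
import math

SENTINEL_TENANT = "00000000-0000-0000-0000-000000000001"


def cosine(a, b):
    """Cosine similarity, returning 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorIndex:
    """Hypothetical stand-in for a dedicated vector store."""

    def __init__(self):
        self._items = {}  # vector_id -> (vector, metadata)

    def upsert(self, vector_id, vector, metadata):
        # The Stage 1 rule: no vector is stored without a tenant_id.
        if "tenant_id" not in metadata:
            raise ValueError("metadata must carry tenant_id on every upsert")
        self._items[vector_id] = (list(vector), dict(metadata))

    def query(self, vector, tenant_id, top_k=3):
        # Filter by tenant first, then rank by similarity.
        scored = [
            (cosine(vector, stored), vector_id)
            for vector_id, (stored, meta) in self._items.items()
            if meta["tenant_id"] == tenant_id
        ]
        scored.sort(reverse=True)
        return [vector_id for _, vector_id in scored[:top_k]]
```

At Stage 1 every upsert passes SENTINEL_TENANT. At Stage 2 the callers pass real tenant IDs, and the vectors written before then stay where they are.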

The seventh decision is the optional one because not every team will ship a vector index at Stage 1, but most teams reading this post will, and the cost of getting the metadata wrong at the vector layer is a re indexing job that takes hours and that you cannot do while the agent is serving live retrieval. The metadata write tonight costs no marginal effort. The metadata write later costs an outage.

Figure: Stage 1 to Stage 2 schema diff, additive only. The columns you wrote at Stage 1 stay. The Stage 2 work is turning on the policy that reads them.

Stage 1
documents: id uuid v7, tenant_id uuid default 'sentinel', body text, created_at timestamptz
events: id, tenant_id, actor_kind, actor_id, action, payload_jsonb, created_at (no hash chain yet, no append only constraint)

Stage 2
documents: id uuid v7, tenant_id uuid (policy ON), body text, created_at timestamptz
events: id, tenant_id, actor_kind, actor_id, action, payload_jsonb, created_at, plus a row level policy on tenant_id

Diff: same columns, same rows. The policy now reads a column that was already there.

The diff above is the artifact I want you to be able to draw at the Stage 2 transition without consulting a notebook. Same columns. Same rows. The policy now reads a column that was already there. The work between the two panels is not a schema change, because the schema is already in the right shape. The work is turning on the policy that has been waiting for a tenant to enforce against.

The list of things you do not build, and why each one is waiting

The harder discipline than the small set is the symmetric list. Ten things you do not build tonight, each named with the trigger that will eventually promote it from a deferral into a real piece of work. The list is the discipline of refusing to do anything that is not on the small set, and the trigger column is the discipline of being able to say, in a sentence, why the absence is the right answer at Stage 1 and what specifically lights it up later.

I think about the list in two thematic groups. The first group is the access control and audit primitives, which all light up between Stage 2 and Stage 4 and which are the bulk of the work you are tempted to do tonight because you have read the literature. The second group is the scale and reliability infrastructure, which lights up between Stage 3 and Stage 5 and which is the bulk of the work the vendor blogs are trying to sell you because their products are best in class at the relevant scale.

Figure: When each Stage 1 deferral lights up. A timeline with ten rows, one per deferred item, mapped across the operational maturity stages. Each row is a thing you did not build. The lit cell is the stage where you will, with the trigger that promotes it.

Stage columns: 02 Co-builders (first NDA'd operator) · 03 Paying (contracts signed) · 04 Regulated (SOC2 / HIPAA ask lands) · 05 Multi-agent (concurrent at consequence) · 05+ Almost never (documented requirement).

Rows: Row level security policies; Roles table, RBAC; Idempotency key enforcement; Outbox pattern; Materialized views, replicas, cache; Agent secret separation; Hash chained audit, actor_kind; GDPR tombstones, append only; Multi region, failover, sharding; Event sourcing (if ever).

Stage 1 sits to the left of every column above. Nothing on this list earns its keep until the trigger arrives. Event sourcing is the exception: most teams that ship it pay the operational tax without ever needing the temporal query.

The heatmap above is the visual artifact of the list, and it is doing most of the work of the next two sections. Every cell to the left of the lit cell is a season of your life that you can spend on something other than the thing in that row.

In the first group, the access control and audit primitives are five items, and each one is dormant tonight for a specific reason. Row level security has no enforcement target because there is one tenant, and it lights up at the Stage 2 boundary when the second tenant exists. The RBAC roles table has no role distinctions to model, and it lights up at Stage 3 when the first paying customer has a second human inside their account who needs a different view of the same data. The agent secret separation layer has no operator separation to defend tonight, because the secrets in the env file were written by you and the agent that reads them is also operated by you. It lights up at Stage 4 when the first regulated customer signs and the auditor asks how the secrets surface bounds the agent's blast radius. Idempotency key enforcement has no duplicate worth rejecting, because the only consequence of a duplicate tonight is a wasted call, and it lights up at Stage 3 when the cost of a duplicate refund is real money. Hash chained audit, the actor_kind enum, and the append only constraint on the events table have no auditor to satisfy, and they light up at Stage 4 against the organic write history the events table has been collecting since this Friday.

In the second group, the scale and reliability infrastructure is five more items, all of them deeper into the stage progression. The outbox pattern, for which Chris Richardson's canonical writeup is the right reading when you get there, is required at Stage 3 when the LLM call and the database write start needing to be one logical operation. It is deferred tonight because there is no downstream consequence to a dropped side effect when the only operator is watching the agent run in the other terminal. Materialized views, read replicas, semantic caches, and second vector stores all chase measured pressure that does not exist at Stage 1, and they light up at Stage 3 when the workload has been measured and the existing setup cannot serve the query plan it needs. GDPR tombstones, audit specific read replicas, and append only enforcement light up at Stage 4 against the same regulatory ask that lights up the audit hash chain. Multi region replication, failover topology, and sharding light up at Stage 5 or later. Event sourcing sits to the right of the rest on the chart, because most teams that ship it pay the operational tax of replay, projection rebuilds, and schema evolution across the event stream without ever needing the temporal query that justified the choice.

None of these are wrong at Stage 1. They are early. Early is a real cost, and it is the cost the Brooklyn team paid for ten weeks. Refusing it tonight is the active choice that the small set requires. The deferral is not a failure to decide. The deferral is the decision.

The small set is six decisions. Everything else is early, not just unfinished.

Sunday night, with the small set complete

It is now Sunday. The apartment is quieter than it was on Friday because your co founder is out for dinner. The schema is finished. The migrations have been written, run against a fresh database, rolled back, and run again. The events table has four hundred and twelve rows from the weekend's test runs. The tenant_id sentinel value appears on every row of every domain table. The vector index has the metadata field populated. The data access function q is the only entry point to the database, verified by a grep that you ran twice because you did not believe it the first time. The agent's tool calls all accept a request ID, generated by the agent and ignored by the implementations. You will ship Wednesday.
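The "accepted and ignored" request ID can be sketched as follows. This is an illustration only: issue_refund is a hypothetical tool, and new_request_id is a made-up helper standing in for however your agent mints IDs. The point is the shape of the signature, not the body.

```python
import uuid


def new_request_id() -> str:
    """Agent side: mint one ID per logical tool call and reuse it on retries."""
    return str(uuid.uuid4())


def issue_refund(order_id: str, amount_cents: int, *, request_id: str) -> dict:
    """Hypothetical tool. At Stage 1 the request_id is accepted and ignored.

    The later change is to look request_id up before acting and return the
    stored result on a repeat. The signature does not have to change.
    """
    del request_id  # deliberately unused at Stage 1
    return {"order_id": order_id, "refunded_cents": amount_cents}


rid = new_request_id()
first = issue_refund("order-1", 500, request_id=rid)
retry = issue_refund("order-1", 500, request_id=rid)  # executes twice today
```

Making request_id a required keyword argument means every call site already passes one, so turning on enforcement later touches the tool implementations and not the agent.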

The schema diff on the second monitor is the artifact above. Stage 1 on the left. Stage 2 on the right. Same columns. Same rows. The only thing the Stage 2 panel adds is a row level policy that reads a column the Stage 1 panel already has.

The thing that has happened over the weekend is that the future migration is now a known piece of work. Not a guess. Not a thing you are choosing to worry about later. A piece of work whose shape you can describe in a sentence to your co founder over coffee tomorrow morning. The shape is "one migration that turns on the policy, one configuration change that issues real tenant identifiers instead of the sentinel, and one update that rejects the sentinel at write time." That is the Stage 2 transition. Three changes. Done in a sitting. The schema is unchanged.
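The third of those changes, rejecting the sentinel at write time, can be sketched in application code. This assumes the sentinel is the all-zero UUID, which is a guess on my part. Substitute whatever value your Stage 1 default actually wrote. The names SENTINEL_TENANT, SentinelTenantError, and require_real_tenant are hypothetical.

```python
import uuid

# Assumed sentinel value; use whatever your Stage 1 column default was.
SENTINEL_TENANT = uuid.UUID(int=0)


class SentinelTenantError(ValueError):
    """Raised when a write still carries the Stage 1 placeholder tenant."""


def require_real_tenant(tenant_id: uuid.UUID) -> uuid.UUID:
    """Return tenant_id unchanged, or raise if it is the sentinel."""
    if tenant_id == SENTINEL_TENANT:
        raise SentinelTenantError("writes must carry a real tenant_id after Stage 2")
    return tenant_id
```

The same rule can also live in the database. In Postgres, a CHECK constraint added with NOT VALID is enforced on new inserts and updates without re-checking the sentinel rows that already exist, which fits the additive posture.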

The frame the whole post has been building toward lands here. The right Stage 1 architecture is not minimal and not future proof. It is additive. The small set is what additive looks like. The deferrals are what additive requires. The Stage 2 transition is one migration, not a rewrite, because the columns are in the right place and the policies are the only thing missing.

The next architect of this system, whoever that turns out to be, will open the codebase and the shape will read as deliberate, because it was. The deferrals will read as decisions, because they were. The Stage 2 transition will be cheap, because the Friday night was honest.

Diagnostic for your Monday

Tomorrow morning is Monday. The room will be different. The coffee will be hot. The agent will have been live in your hands for the entire weekend, and you will have something specific to do.

There are four things, and they are deliberately small. The posture is to do them this week and then stop. Doing more than the small set is the Brooklyn mistake. Doing less is the Lisbon mistake.

The first thing is to add a tenant_id column to every domain table that does not have one, with a sentinel default. One migration. Done by lunch on Tuesday in the quiet thirty minutes between the morning standup and the first meeting that wants your attention. The migration should be small enough to read out loud. If it is not, you have more domain tables than you thought.
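A migration small enough to read out loud might look like the output of the sketch below. It assumes Postgres, an all-zero UUID as the sentinel, and two example table names. The helper add_tenant_id_sql is hypothetical and exists only to show the shape.

```python
# Assumed sentinel value; any fixed UUID works as long as it never changes.
SENTINEL_TENANT = "00000000-0000-0000-0000-000000000000"


def add_tenant_id_sql(tables: list[str], sentinel: str = SENTINEL_TENANT) -> str:
    """Build one transaction that adds tenant_id to every listed table."""
    lines = ["BEGIN;"]
    for table in tables:
        lines.append(
            f"ALTER TABLE {table} ADD COLUMN IF NOT EXISTS tenant_id uuid "
            f"NOT NULL DEFAULT '{sentinel}';"
        )
    lines.append("COMMIT;")
    return "\n".join(lines)


print(add_tenant_id_sql(["documents", "events"]))
```

If the printed migration runs past a screen, that is the signal the paragraph above describes: more domain tables than you thought.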

The second thing is to switch primary key generation to UUID v7 for any new table from now on. If your existing tables are on integer primary keys and there is no real data on them yet, migrate now while the migration is cheap. If there is data, write a note for the Stage 2 migration that says "primary key migration first, schema changes second." The deferral is the decision.
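If your runtime or driver does not ship a UUID v7 generator, the layout is simple enough to sketch by hand. The function below follows the RFC 9562 field layout (48 bit Unix millisecond timestamp, version nibble, 12 random bits, variant bits, 62 random bits). It is an illustration, not a replacement for a maintained library, and it makes no attempt at monotonic ordering within the same millisecond.

```python
import os
import time
import uuid
from typing import Optional


def uuid7(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Build a version 7 UUID: timestamp first, random bits after."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = rand >> 68                # top 12 bits
    rand_b = rand & ((1 << 62) - 1)    # low 62 bits

    value = (unix_ms & ((1 << 48) - 1)) << 80  # bits 127..80: timestamp
    value |= 0x7 << 76                         # bits 79..76: version 7
    value |= rand_a << 64                      # bits 75..64: rand_a
    value |= 0b10 << 62                        # bits 63..62: RFC variant
    value |= rand_b                            # bits 61..0: rand_b
    return uuid.UUID(int=value)


earlier = uuid7(1_000)
later = uuid7(2_000)
```

Because the timestamp occupies the most significant bits, IDs minted in later milliseconds sort after earlier ones, both as integers and as strings. That ordering is what keeps index inserts clustered near the right edge instead of scattered.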

The third thing is to create the events table with the six columns above and add one write to the agent's main action loop. The first ten rows are the proof that the table works. Take a screenshot when they appear. Send it to your co founder. The screenshot is the artifact that means the table exists, and the table existing is the entire Stage 1 audit story.
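The "one write in the main action loop" can be sketched end to end. To keep the sketch runnable anywhere it uses an in-memory sqlite3 database purely as a stand-in for Postgres. The column names are illustrative and should be replaced by the columns of your own events table. The sentinel value, record_event, and run_agent_step are hypothetical names.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

# Assumed sentinel value for the single Stage 1 tenant.
SENTINEL_TENANT = "00000000-0000-0000-0000-000000000000"

conn = sqlite3.connect(":memory:")  # stand-in for Postgres in this sketch
conn.execute(
    "CREATE TABLE events (id TEXT PRIMARY KEY, tenant_id TEXT NOT NULL, "
    "actor_id TEXT NOT NULL, action TEXT NOT NULL, payload_jsonb TEXT NOT NULL, "
    "created_at TEXT NOT NULL)"
)


def record_event(actor_id: str, action: str, payload: dict) -> None:
    """Append one row describing what the agent just did."""
    conn.execute(
        "INSERT INTO events VALUES (?, ?, ?, ?, ?, ?)",
        (
            str(uuid.uuid4()),  # swap in a v7 generator if you have one
            SENTINEL_TENANT,
            actor_id,
            action,
            json.dumps(payload),
            datetime.now(timezone.utc).isoformat(),
        ),
    )


def run_agent_step(step: int) -> None:
    # ... the agent's real work would happen here ...
    record_event("agent-1", "tool_call", {"step": step})


for step in range(10):
    run_agent_step(step)

event_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

After the loop, event_count holds the number of rows written, which is the figure to check before taking the screenshot.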

The fourth thing is to count the tables that would need a tenant_id backfill if you started this afternoon, and to do that count before you start any of the other three. That number is your current Stage 1 debt. Write it on a sticky note and put it on your monitor. The goal is zero by Friday. Each table you migrate decrements the number by one. The note coming down on Friday afternoon is the small ceremony that closes the week.
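The count itself can be sketched in two parts: a catalog query you could run against Postgres, and a small pure function that does the same check on a table-to-columns mapping. The query assumes your domain tables live in the public schema. MISSING_TENANT_ID_SQL and tenant_debt are hypothetical names.

```python
# Assumed: domain tables live in the "public" schema of a Postgres database.
MISSING_TENANT_ID_SQL = """
SELECT t.table_name
FROM information_schema.tables AS t
WHERE t.table_schema = 'public'
  AND t.table_type = 'BASE TABLE'
  AND NOT EXISTS (
    SELECT 1
    FROM information_schema.columns AS c
    WHERE c.table_schema = t.table_schema
      AND c.table_name = t.table_name
      AND c.column_name = 'tenant_id'
  )
ORDER BY t.table_name;
"""


def tenant_debt(columns_by_table: dict[str, set[str]]) -> list[str]:
    """Return the tables that still lack a tenant_id column, sorted by name."""
    return sorted(
        table
        for table, columns in columns_by_table.items()
        if "tenant_id" not in columns
    )


debt = tenant_debt(
    {
        "documents": {"id", "tenant_id", "body", "created_at"},
        "notes": {"id", "body"},
        "events": {"id", "action"},
    }
)
```

The length of the returned list is the number for the sticky note.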

A note for regulated readers

If you are building for regulated customers from day one (HIPAA, PCI, GLBA, SOC 2 audit on the horizon), Stage 1 is upstream of your real ask, and most of this post sits beneath the floor of what your customer's first contract will require. Skip ahead to the Stage 4 chapter when it publishes. Nothing in this post is legal or compliance advice; the regulatory references are for orientation, not interpretation.

Closing

It is late Sunday now. The lamp is still the warm one. The agent is running quietly in the other terminal, writing its third event of the evening to the table that did not exist forty eight hours ago. The schema on the Figma board has not changed since Friday afternoon, because the changes you made over the weekend were small enough to belong to the same drawing. Wednesday is in three days. The product is the same product it was on Friday. The architecture is now the kind that does not have to be apologized for in the second design partner conversation. That is the entire Stage 1 prize, and it is the only one that matters.

If you are at Stage 1 and any of this resonates, I would welcome a conversation. The contact form at the top of the site goes to my inbox.

Up next: Stage 2, when a second operator changes everything.

Frequently Asked Questions

Do I really need `tenant_id` if I only have one tenant?

Yes. The cost today is one column with a default value. The cost later is rewriting the data access layer to thread tenant context through every query, plus backfilling the column across every existing row. Adding the column at Stage 1 is the cheapest insurance in the small set, and it pays for itself the first time someone outside the founding team logs into the system.

Why UUID v7 specifically, not v4 or auto increment?

v4 indexes poorly because random inserts fragment B-tree indexes over time, which becomes a real cost once tables grow past a few million rows. Auto increment integers collide on merges and replica failovers, and they leak record counts publicly in any URL that exposes the ID. v7 is lexicographically sortable, time ordered, and collision resistant, and it does not reveal record counts, though it does embed its creation timestamp. <a href="https://www.rfc-editor.org/rfc/rfc9562" target="_blank" rel="noopener noreferrer">RFC 9562</a> is the canonical reference if you want the bit layout. If your driver does not generate v7 natively yet, v4 is acceptable. The decision that matters is "UUID, not integer."

Is it OK to use SQLite at Stage 1?

For an early prototype before you ship anything to a user, yes. Past that point, no. SQLite does not support row level security, it has weaker concurrency semantics under write contention, and the Stage 2 migration to Postgres is not free. The schema decisions in this post all assume Postgres is the destination. If the agent is going to ship to a person who is not you, start on Postgres.

When should I add a roles table?

When the first paying customer has a second human inside their account who needs a different view of the same data. Not before. Adding a roles table before then is designing a permission model you have not validated against a real product surface, and the permission model you ship will not be the one your first customer needs. Wait for the customer.

Should I use Supabase, Neon, PlanetScale, or Turso?

Any managed Postgres works. The decisions in this post are schema decisions, not vendor decisions. Pick a vendor that runs Postgres, that exposes row level security when you need it at Stage 2, and that you can leave without rewriting your application. Check that the product you are choosing is actually Postgres: Turso, for example, is built on libSQL, a SQLite fork, so the SQLite answer above applies to it. The differences between the managed providers matter at Stage 3 and beyond, when read replicas and connection pooling shape your operational story. At Stage 1 the differences are noise.

What about pgvector versus a dedicated vector DB at Stage 1?

pgvector on the OLTP Postgres is the cheaper default until measured pressure says otherwise. Same posture as the previous post in this series. The Stage 1 question is not "which vector store" but "is `tenant_id` in the vector index metadata," and the answer should be yes regardless of which store you pick.

My co founder uses the system. Are we still Stage 1?

Yes. Stage 1 ends when someone outside the founding team gets credentials, even an NDA'd design partner. Two operators with the same risk surface and the same access to the secrets file are still one operating posture. The Stage 2 transition is the first credential issued to a person who would not have access to your AWS console.

Code Atelier · NYC
