Architecture · 6 min read

Designing multi-tenant SaaS safely

Workspace isolation is the feature you never want to get wrong. Notes on tenant boundaries, RBAC, and designing data access you can actually trust.

Multi-tenancy sounds like a column. You add a workspace_id, filter your queries by it, and you are done. That framing is how isolation bugs ship. In Agentic CRM I treated workspace isolation as an invariant instead of a column, something that has to hold across every query, join, and background job, not only the ones you remember to filter.

Why you cannot get it wrong

Most bugs are recoverable. A single row leaking from one tenant into another is not. It is the kind of failure that ends a customer relationship the moment it is noticed. That asymmetry should shape the design. Over-building isolation costs you some extra structure. Under-building it costs you everything, with no way to take it back. So isolation gets to be the thing the rest of the system is organized around.

Enforce at the boundary

The fragile pattern relies on every developer, on every query, forever, to remember the workspace filter. People forget. New endpoints appear. A join pulls in a related table and quietly widens the blast radius. The durable pattern makes workspace scope a property of the data-access layer itself, so an unscoped query is not something you can express by accident. The boundary holds because the system enforces it, not because everyone stayed careful.
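
A minimal sketch of that idea, not the actual Agentic CRM code. The names `Contact` and `ScopedContacts` and the in-memory row list are assumptions made for illustration. The point is that the repository cannot be constructed without a workspace, so every read goes through the filter whether or not the caller remembers it.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Contact:
    id: int
    workspace_id: str
    name: str


class ScopedContacts:
    """Hypothetical repository: the only way to read contacts is through
    an instance that is already bound to one workspace."""

    def __init__(self, rows: list[Contact], workspace_id: str) -> None:
        if not workspace_id:
            # Refuse to exist without a scope rather than default to "all".
            raise ValueError("workspace_id is required")
        self._rows = rows
        self._workspace_id = workspace_id

    def all(self) -> list[Contact]:
        # The filter lives here, once, instead of in every caller.
        return [r for r in self._rows if r.workspace_id == self._workspace_id]

    def get(self, contact_id: int) -> Optional[Contact]:
        # Lookups by id go through the scoped view, so a valid id from
        # another workspace simply does not exist from this side.
        for row in self.all():
            if row.id == contact_id:
                return row
        return None
```

With this shape, `ScopedContacts(rows, "ws_a").get(2)` returns `None` when contact 2 belongs to another workspace. In a real system the same idea would sit in front of a database rather than a list.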

Async work escapes boundaries too

The easiest boundary to forget is the one inside your workers. Inbox sync and AI analysis run off the request path, where a request-scoped "current workspace" does not exist. If a job does not carry its workspace context, async work can cross a tenant line that the synchronous code would have caught. So jobs carry workspace context in their payload, and the same scoping rules apply inside the worker as inside a request handler.
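
Here is a rough sketch of a job that carries its own scope. `SyncInboxJob`, `MAILBOXES`, and `handle_sync_inbox` are hypothetical names, and the dictionary stands in for whatever store maps a mailbox to its owning workspace. The worker never looks for an ambient "current workspace"; it takes the scope from the payload and checks it against the data before doing any work.

```python
from dataclasses import dataclass


class CrossTenantAccess(Exception):
    """Raised when a job's workspace does not own the resource it names."""


@dataclass(frozen=True)
class SyncInboxJob:
    # The workspace travels with the job; there is no request to read it from.
    workspace_id: str
    mailbox_id: int


# Illustrative stand-in for a lookup table: mailbox_id -> owning workspace.
MAILBOXES = {1: "ws_a", 2: "ws_b"}


def handle_sync_inbox(job: SyncInboxJob) -> str:
    if not job.workspace_id:
        raise ValueError("job is missing workspace context")
    owner = MAILBOXES.get(job.mailbox_id)
    if owner != job.workspace_id:
        # Same rule as a request handler: wrong (or unknown) owner means stop.
        raise CrossTenantAccess(
            f"mailbox {job.mailbox_id} is not in workspace {job.workspace_id}"
        )
    return f"synced mailbox {job.mailbox_id} for {job.workspace_id}"
```

A job enqueued for `ws_a` that names a mailbox owned by `ws_b` fails loudly in the worker instead of syncing the wrong tenant's mail.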

RBAC is a separate question

It helps to keep two ideas apart. Tenant isolation answers which workspace a row belongs to. RBAC answers what a member is allowed to do inside their workspace. They are different boundaries with different failure modes, and collapsing them leads to muddled code. In Agentic CRM, isolation is the hard outer wall that no record crosses, and RBAC is the finer control over what an authenticated member can do within it.
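
One way to keep the two questions apart in code is to give each its own function and only combine them at the end. This is a sketch under assumptions: the `Role` values, the `PERMISSIONS` table, and the `can` helper are invented for illustration and are not taken from Agentic CRM.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    VIEWER = "viewer"
    ADMIN = "admin"


# Hypothetical permission table: what each role may do inside its workspace.
PERMISSIONS = {
    Role.VIEWER: {"read"},
    Role.ADMIN: {"read", "delete"},
}


@dataclass(frozen=True)
class Member:
    workspace_id: str
    role: Role


@dataclass(frozen=True)
class Record:
    workspace_id: str


def same_workspace(member: Member, record: Record) -> bool:
    # Isolation: does this row live inside the member's workspace at all?
    return member.workspace_id == record.workspace_id


def role_allows(member: Member, action: str) -> bool:
    # RBAC: is the action permitted for this role? No tenant logic here.
    return action in PERMISSIONS[member.role]


def can(member: Member, action: str, record: Record) -> bool:
    # The outer wall is checked first; a role never widens it.
    return same_workspace(member, record) and role_allows(member, action)
```

Because the checks are separate, an admin of one workspace still gets `False` for every action on another workspace's record, and a viewer's limits never depend on tenant logic.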

How I check it

Treat the cross-tenant access attempt as a first-class test. It has to fail everywhere, including in async paths.
Make the scoped path the easy path, so the safe thing to write is also the convenient thing to write.
Assume new code is unscoped until proven otherwise. The invariant is what you defend, not the average case.
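
The first check above can be sketched as a pair of tests, one for the request path and one for the worker path. Everything here is hypothetical: `fetch_note`, `run_analysis`, and the dictionary store are small stand-ins so the tests have something to exercise, not a real API.

```python
from typing import Optional

# Illustrative store: one note per workspace.
STORE = {
    10: {"workspace_id": "ws_a", "body": "renewal call notes"},
    20: {"workspace_id": "ws_b", "body": "pricing draft"},
}


def fetch_note(store: dict, workspace_id: str, note_id: int) -> Optional[dict]:
    # Scoped read: a note from another workspace looks like a missing note.
    note = store.get(note_id)
    if note is None or note["workspace_id"] != workspace_id:
        return None
    return note


def run_analysis(store: dict, job: dict) -> int:
    # Worker path: uses the same scoped read, with scope from the payload.
    note = fetch_note(store, job["workspace_id"], job["note_id"])
    if note is None:
        raise LookupError("note not found in this workspace")
    return len(note["body"])


def test_cross_tenant_read_fails_in_request_path() -> None:
    assert fetch_note(STORE, "ws_a", 20) is None


def test_cross_tenant_read_fails_in_async_path() -> None:
    job = {"workspace_id": "ws_a", "note_id": 20}
    try:
        run_analysis(STORE, job)
    except LookupError:
        return
    raise AssertionError("worker read a note from another workspace")
```

The functions follow the `test_` naming that pytest collects, but they are plain functions and can be called directly as well. The useful property is that both paths are attacked with the same foreign id, so a worker that bypasses the scoped read fails the suite.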

None of this is glamorous, which is the point. Workspace isolation is invisible when it works and fatal when it does not. Treating it as an enforced invariant rather than a column you filter is the difference between SaaS you can trust with someone else's data and SaaS you cannot.
