DCCAPTN: a modified CAP theorem for distributed containers

A theorem I have been developing — DCCAPTN, "Dee See Captain" — adapts Brewer's CAP theorem to the reality of distributed container ecosystems: Consistency, Availability, Performance, and Tolerance of Network Failures as four dials wired to each other, not a menu you fully satisfy.

I have spent a long time building the thing that runs other people's containers. You learn a particular lesson doing that, over and over, until it stops being a lesson and starts being a law: the trade-offs you refuse to name up front will name themselves later, in production, at the worst possible moment.

The cleanest articulation of that idea we have as an industry is Eric Brewer's CAP theorem — a distributed data store can give you at most two of Consistency, Availability, and Partition tolerance at once. It is a beautiful, durable result. But it was written about data stores, and the world most of us operate in now is not a data store. It is a fleet of orchestrated services, rolling deployments, meshes of dependencies, and the network fabric that ties them together. CAP still rhymes with that world, but it does not quite fit it.

So I have been developing an adaptation. I call it DCCAPTN — pronounced Dee See Captain — and this essay is the first time I am setting it down in full.

What the acronym stands for

D — Distributed
C — Container
C — Consistency
A — Availability
P — Performance
TN — Tolerance of Network Failures

DCCAPTN keeps CAP's Consistency and Availability, because those two guarantees are exactly as load-bearing in a container ecosystem as they are in a database. It reframes Partition tolerance as the broader Tolerance of Network Failures, because in a container world the thing that fails is rarely a clean network partition — it is a saturated upstream, an unreachable third party, a service that answers slowly instead of not at all. And it promotes Performance to a first-class variable, because in a container ecosystem performance is not a side quest. It is the currency the other guarantees are bought with.

The four variables

Consistency

Not every request is guaranteed to be served by the same version of the source code, especially where a rolling deployment strategy is employed. Two identical requests, issued a second apart, can legitimately land on two different versions of your service and come back with two different answers.
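To put a rough number on that, here is a minimal sketch in Python. It assumes exactly two live versions and a load balancer that picks a replica uniformly at random for every request; both are assumptions of this example, not requirements of the model, and the function name is hypothetical.

```python
from fractions import Fraction


def skew_probability(new_replicas: int, total_replicas: int) -> Fraction:
    """Probability that two independent requests are served by different versions.

    Assumes uniform random routing and exactly two live versions (old and new).
    """
    if total_replicas <= 0 or not 0 <= new_replicas <= total_replicas:
        raise ValueError("need total_replicas > 0 and 0 <= new_replicas <= total_replicas")
    p_new = Fraction(new_replicas, total_replicas)
    # The pair is mixed if it is (new, old) or (old, new): 2 * p * (1 - p).
    return 2 * p_new * (1 - p_new)


def rollout_profile(total_replicas: int) -> list:
    """Skew probability after each replica is updated, from 0 to total_replicas."""
    return [skew_probability(n, total_replicas) for n in range(total_replicas + 1)]
```

Under those assumptions the chance of a mixed pair is zero only before and after the rollout and peaks at one half when half the replicas are updated, which is why the realistic goal is to bound skew rather than eliminate it.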

Availability

Every request received by a non-failing service must result in a response. If the service is healthy, silence is a bug.
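One way to hold a service to that promise is to wrap every handler in a deadline, so that a slow or failing handler still produces an explicit answer. The following is a minimal sketch rather than a prescription: `Response`, `always_respond`, and the status codes chosen are hypothetical, picked only for illustration.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass(frozen=True)
class Response:
    status: int
    body: str


async def always_respond(
    handler: Callable[[], Awaitable[Response]],
    deadline_seconds: float,
) -> Response:
    """Run the handler, but never leave the caller unanswered."""
    try:
        # wait_for cancels the handler if it overruns the deadline.
        return await asyncio.wait_for(handler(), timeout=deadline_seconds)
    except asyncio.TimeoutError:
        # Too slow: answer explicitly instead of hanging.
        return Response(503, "deadline exceeded")
    except Exception:
        # Handler raised: still answer, with an error the caller can act on.
        return Response(500, "internal error")
```

The design choice here is that an error response counts as a response: the caller learns something within a known time, even when the news is bad.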

Performance

Both availability and consistency can come at the cost of performance. The thresholds have to be acknowledged explicitly: response timeouts, upstream and downstream timeouts, and how long connections are allowed to persist.
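Acknowledging the thresholds explicitly can be as simple as writing them down in one place and checking that they add up. A sketch under stated assumptions: the field names are hypothetical, retries are sequential with no backoff, and the rule that all dependency attempts must fit inside the response timeout is one reasonable reading of "explicit", not something the model mandates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeoutBudget:
    response_timeout_s: float    # what this service promises its callers
    dependency_timeout_s: float  # per-attempt limit on each call it makes
    max_attempts: int            # first try plus retries
    keepalive_s: float           # how long idle connections may persist

    def worst_case_dependency_s(self) -> float:
        # Sequential attempts, no backoff; any backoff would add to this.
        return self.dependency_timeout_s * self.max_attempts

    def validate(self) -> None:
        if min(self.response_timeout_s, self.dependency_timeout_s, self.keepalive_s) <= 0:
            raise ValueError("timeouts must be positive")
        if self.max_attempts < 1:
            raise ValueError("max_attempts must be at least 1")
        if self.worst_case_dependency_s() >= self.response_timeout_s:
            raise ValueError(
                "dependency attempts can outlast the response timeout: "
                f"{self.worst_case_dependency_s()}s >= {self.response_timeout_s}s"
            )
```

For example, `TimeoutBudget(2.0, 0.5, 3, 30.0).validate()` passes, while the same budget with a one-second response timeout is rejected, because three half-second attempts cannot fit inside it.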

Tolerance of Network Failures

The failure of one system — inside or outside the container ecosystem — must not lead to the failure of any other system in the ecosystem. A failure is allowed to happen. It is not allowed to spread.
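A circuit breaker is one common way to stop a failure from spreading: after repeated failures, calls to the sick dependency are refused immediately instead of tying up the caller. The sketch below is deliberately small and single-threaded. The class name, default thresholds, and injectable clock are assumptions for illustration; a production breaker would also need locking and metrics.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is refused because the breaker is open."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after_s:
                # Open: fail fast so the caller is not dragged down too.
                raise CircuitOpenError("dependency is isolated; failing fast")
            # Cool-down elapsed: allow one trial call (half-open).
            # A single failure now reopens the breaker.
            self._opened_at = None
            self._failures = self.failure_threshold - 1
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0
        return result
```

The caller decides what to do with `CircuitOpenError`, typically by serving a fallback, which keeps the failure local to the one dependency that caused it.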

The theorem

Here is the assertion, stated plainly.

In a distributed container ecosystem you cannot optimise Consistency, Availability, Performance, and Tolerance of Network Failures all at once. They form a system of trade-offs: pushing one guarantee higher will generally pull at least one of the others lower. DCCAPTN is a framework for making those trade-offs deliberately rather than discovering them in production.

It rests on four claims.

1. Consistency and Availability are in tension, and both are paid for in Performance. Guaranteeing that every request sees the newest version of the code (Consistency), or that every request to a healthy service returns (Availability), requires coordination — version pinning, replica synchronisation, retries, health checks, connection draining. That coordination costs time, which is why Performance is impacted by both. In DCCAPTN, Performance is not a fourth independent goal; it is the currency the other two guarantees are bought with.

2. Rolling deployments make Consistency probabilistic, not absolute. When a service is mid-rollout, different replicas run different versions of the source code simultaneously. Two identical requests can legitimately hit two different versions and receive two different answers. Consistency in a container ecosystem therefore means managing and bounding version skew — compatible APIs, backward-compatible schemas, graceful cutover — not eliminating it.

3. Tolerance of Network Failures is about isolation, not just uptime. The failure of any one system — a crashed instance, a saturated upstream, an unreachable third-party dependency — must be contained. It must not cascade into the failure of unrelated systems. This is the container-ecosystem restatement of partition tolerance: the network and its participants will fail, and the ecosystem must degrade locally rather than globally, through circuit breakers, bulkheads, timeouts, and fallbacks.

4. Performance and Tolerance of Network Failures do not directly affect each other. This is the pivotal, and perhaps counter-intuitive, claim of the model, and it is why the two sit at opposite poles of the diagram. A fast system is not inherently more fault-isolated, and a well-isolated system is not inherently faster. Latency budgets and fault containment are engineered independently. You can have a slow-but-well-isolated ecosystem, or a fast-but-fragile one. They meet only indirectly, through Consistency and Availability, which both sit between them.

Put concisely: Consistency and Availability are the shared middle ground; Performance is what they cost; Tolerance of Network Failures is the boundary that keeps a local failure from becoming a global one — and it is orthogonal to Performance.

At a glance

The table below maps how each variable relates to the others — whether it impacts other variables, is impacted by them, and which specific variables (C, A, P, TN) it interacts with. A dash (—) marks a relationship that is either not applicable (a variable to itself) or intentionally absent.

Variable	Impacts	Impacted by	C	A	P	TN
Consistency (C)	✓	✓	—	✓	✓	✓
Availability (A)	✓	✓	✓	—	✓	✓
Performance (P)	✓	✓	✓	✓	—	—
Tolerance of Network Failures (TN)	✓	—	✓	✓	—	—

The essential point is that Performance does not impact Tolerance of Network Failures, and vice versa.

Reading the table:

Consistency and Availability both impact and are impacted by nearly everything — they are the interconnected core of the model.
Performance both impacts and is impacted by Consistency and Availability (the coordination they require slows things down) but sits outside the Tolerance-of-Network-Failures relationship entirely.
Tolerance of Network Failures impacts Consistency and Availability (a contained failure still changes what is consistent and what is available), but neither impacts nor is impacted by Performance.
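The same relationships can be written down as a small directed graph and checked mechanically. This is one reading of the table, not a canonical encoding: the `IMPACTS` mapping and the helper names are hypothetical, and an edge from X to Y is taken to mean "X impacts Y".

```python
from typing import Dict, Set

# X -> set of variables that X impacts, as read from the table.
IMPACTS: Dict[str, Set[str]] = {
    "C": {"A", "P"},
    "A": {"C", "P"},
    "P": {"C", "A"},
    "TN": {"C", "A"},
}


def directly_linked(x: str, y: str) -> bool:
    """True if either variable impacts the other."""
    return y in IMPACTS[x] or x in IMPACTS[y]


def go_betweens(x: str, y: str) -> Set[str]:
    """Variables that x impacts and that in turn impact y."""
    return {m for m in IMPACTS[x] if y in IMPACTS[m]}
```

In this encoding `directly_linked("P", "TN")` is false, while `go_betweens("TN", "P")` yields Consistency and Availability: the two poles touch only through the middle.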

As a picture

I find the model easiest to hold in my head as four overlapping circles arranged along a vertical axis.

Figure: DCCAPTN as four overlapping circles. Tolerance of Network Failures is at the top, Consistency on the left, Availability on the right, and Performance at the bottom. Consistency and Availability overlap in a shared trade-off zone at the centre. Performance and Tolerance of Network Failures sit at opposite poles and never overlap.

Tolerance of Network Failures sits at the top.
Performance sits at the bottom.
Consistency (left) and Availability (right) sit in the middle, overlapping each other and overlapping both the top and bottom circles.

The arrangement is deliberate. Consistency and Availability occupy the centre because they are the guarantees every request negotiates directly, and they overlap in a shared region — the trade-off zone where tuning one moves the other.

Tolerance of Network Failures and Performance are placed at opposite poles, each overlapping only with the two central circles and never with each other. That visual gap is the whole point: there is no direct edge between the top circle and the bottom circle. Performance and Tolerance of Network Failures influence each other only through Consistency and Availability.

Why I bother

If you take one idea away from DCCAPTN, let it be this: in a container ecosystem, your four goals are not a menu you fully satisfy — they are dials wired to each other.

When you turn up Consistency — insisting every request sees the same version during a rolling deploy — you spend Performance on coordination and cutover, and you lean on Availability by taking replicas in and out of rotation. When you turn up Availability — insisting every healthy service always answers — you may serve slightly stale versions (relaxing Consistency) and again spend Performance on retries and health-checking.

Performance is the bill for both. It is not a separate ambition you pursue on the side; it is the measurable cost of the guarantees you demand, expressed in timeouts, connection persistence, and upstream and downstream latency budgets. Treat it as a budget, not a bonus.

Tolerance of Network Failures is the guarantee that keeps the other three honest. Networks partition, dependencies time out, instances die — this is not an edge case, it is the normal weather of distributed systems. The discipline it demands is containment: one failure must stay one failure. And crucially, you engineer it independently of Performance. A blazing-fast ecosystem with no bulkheads is one bad dependency away from a total outage; a slow ecosystem with clean isolation will limp but survive. Speed and survivability are different problems — solve them separately.

So when you design or operate a distributed container ecosystem, ask the four DCCAPTN questions on purpose, up front:

Consistency — during a rollout, how much version skew can my callers tolerate, and how do I bound it?
Availability — what does a healthy service owe every request, even under load?
Performance — what latency and timeout budget am I willing to spend to pay for the above?
Tolerance of Network Failures — when (not if) something fails, what stops that failure from spreading?

DCCAPTN does not remove the trade-offs — nothing can. It makes them visible, so you choose them instead of inheriting them. That is the whole ambition of the thing, and, if I am honest, it is the whole ambition of the platform I have spent these years building: to turn the trade-offs of running distributed containers from something you discover into something you decide.

— Meezaan-ud-Din Abdu Dhil-Jalali Wal-Ikram, founder of Bahriya