# CAP Theorem
In any distributed system, network partitions will happen. When they do, you must choose between consistency and availability:
The CAP Triangle:

              Consistency
                   /\
                  /  \
                 / CP \
                /______\
               /\      /\
              /  \ ?? /  \
             / AP \  / CA \    ← CA doesn't exist in practice
            /______\/______\     (no network = no distribution)
        Availability  Partition
                      Tolerance
CP Systems (Consistent + Partition Tolerant):
HBase, MongoDB (default), etcd, ZooKeeper, Spanner
→ Reject requests during partitions to stay consistent
AP Systems (Available + Partition Tolerant):
Cassandra, DynamoDB, CouchDB, DNS
→ Serve (possibly stale) responses during partitions
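A minimal sketch of that behavioral difference, using hypothetical replica objects: a CP-style node rejects a request it cannot confirm with a quorum, while an AP-style node answers from whatever it has.

```python
# Hypothetical single-node sketch of CP vs. AP behavior during a partition
# (real systems do this with quorums, leases, and replication protocols).
class Replica:
    def __init__(self, value: str, can_reach_quorum: bool) -> None:
        self.value = value
        self.can_reach_quorum = can_reach_quorum

    def read_cp(self) -> str:
        # CP: refuse rather than risk serving data a quorum hasn't confirmed
        if not self.can_reach_quorum:
            raise TimeoutError("partitioned from quorum; rejecting request")
        return self.value

    def read_ap(self) -> str:
        # AP: always answer, accepting that the value may be stale
        return self.value

isolated = Replica(value="v1 (possibly stale)", can_reach_quorum=False)
print(isolated.read_ap())   # answers anyway
# isolated.read_cp()        # would raise TimeoutError
```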
Real systems are a spectrum, not binary.
Many offer tunable consistency (e.g., Cassandra: ONE, QUORUM, ALL).
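For example, with the DataStax Python driver the consistency level can be set per statement. This is only a sketch: it assumes a reachable local Cassandra node, and the `shop` keyspace and `orders` table are hypothetical.

```python
import uuid

from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

# Assumes a local node and a hypothetical shop.orders table.
session = Cluster(["127.0.0.1"]).connect("shop")
order_id = uuid.uuid4()

# QUORUM: a majority of replicas must acknowledge -> stronger consistency
write = SimpleStatement(
    "INSERT INTO orders (id, status) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.QUORUM,
)
session.execute(write, (order_id, "paid"))

# ONE: any single replica may answer -> higher availability, possibly stale
read = SimpleStatement(
    "SELECT status FROM orders WHERE id = %s",
    consistency_level=ConsistencyLevel.ONE,
)
print(session.execute(read, (order_id,)).one())
```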
# Consensus

How do distributed nodes agree on something? Consensus algorithms solve this: given N nodes where some may fail, agree on a single value.
Raft consensus (simplified):

             ┌──────────┐
             │  Leader  │   ← Accepts all writes, replicates to followers
             └─────┬────┘
         ┌─────────┼─────────┐
         ▼         ▼         ▼
      ┌──────┐  ┌──────┐  ┌──────┐
      │Follow│  │Follow│  │Follow│   ← Replicate leader's log
      │  er  │  │  er  │  │  er  │
      └──────┘  └──────┘  └──────┘
Write committed when majority (3 of 5) acknowledge.
→ Tolerates 2 node failures (N=5, F=2, need N-F=3)
→ If leader fails, followers elect a new one
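A minimal sketch of the majority arithmetic behind commit and fault tolerance (not a full Raft implementation; the helper names are illustrative):

```python
# Majority math behind Raft-style commits (illustrative helper names).
def majority(cluster_size: int) -> int:
    # Smallest number of nodes that forms a majority of the cluster.
    return cluster_size // 2 + 1

def is_committed(acks: int, cluster_size: int) -> bool:
    # An entry is committed once a majority (including the leader) has it.
    return acks >= majority(cluster_size)

def max_failures(cluster_size: int) -> int:
    # The cluster keeps making progress while a majority is still reachable.
    return cluster_size - majority(cluster_size)

assert majority(5) == 3                              # 3 of 5 must acknowledge
assert max_failures(5) == 2                          # matches N=5, F=2 above
assert is_committed(3, 5) and not is_committed(2, 5)
```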
# Message Queues

Message queues decouple services, buffer traffic spikes, and enable asynchronous processing.
Synchronous (coupled):

    Order API ──HTTP──▶ Payment Service ──HTTP──▶ Email Service
    (if Payment is slow, Order API is slow too)

Asynchronous (decoupled):

    Order API ──▶ [ Queue ] ──▶ Payment Worker
                  [ Queue ] ──▶ Email Worker
    (Order API responds immediately, work happens later)
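An in-process sketch of that decoupling using Python's standard-library queue; a real deployment would use a broker such as RabbitMQ, SQS, or Kafka, and the payload shape here is made up.

```python
import queue
import threading

orders: queue.Queue = queue.Queue()   # stands in for a real message broker

def payment_worker() -> None:
    while True:
        order = orders.get()          # blocks until a message is available
        if order is None:             # sentinel value: shut down cleanly
            break
        print(f"charging order {order['id']} for {order['amount']}")
        orders.task_done()

threading.Thread(target=payment_worker, daemon=True).start()

# The "Order API" just enqueues and returns; the worker drains the backlog later.
orders.put({"id": 1, "amount": 25.00})
orders.put({"id": 2, "amount": 99.99})
orders.join()                         # demo only: wait for the backlog to drain
orders.put(None)
```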
Patterns:
Point-to-point: 1 producer → queue → 1 consumer (task queue)
Pub/Sub: 1 producer → topic → N consumers (events)
Fan-out: 1 message consumed by all subscriber groups
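A toy in-process broker illustrating pub/sub fan-out; the topic name and handlers are hypothetical, and a real broker (Kafka, RabbitMQ, SNS) does this durably across machines.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Toy in-process pub/sub broker: every subscriber gets its own copy (fan-out).
class Broker:
    def __init__(self) -> None:
        self.topics: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self.topics[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        for handler in self.topics[topic]:   # deliver to every subscriber
            handler(event)

broker = Broker()
broker.subscribe("order.created", lambda e: print("payment worker:", e))
broker.subscribe("order.created", lambda e: print("email worker:  ", e))
broker.publish("order.created", {"order_id": 42})
```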
# Eventual Consistency

In AP systems, after a write, replicas eventually converge. "Eventually" is usually milliseconds, but during partitions it can be longer. Common guarantees that make this easier to live with:
- Read-your-writes — After a write, the same client sees its own update (route to same replica or use session stickiness)
- Monotonic reads — Once you read a value, you never see an older one (don't hop between replicas; sketched after this list)
- Causal consistency — If A causes B, everyone sees A before B (vector clocks)
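A client-side sketch of the monotonic-reads guarantee mentioned above: the client remembers the highest version it has seen and skips replicas that are still behind it. The replica objects and the version field are hypothetical.

```python
# Hypothetical replica exposing (version, value) pairs.
class Replica:
    def __init__(self, version: int, value: str) -> None:
        self.version, self.value = version, value

    def get(self, key: str) -> tuple[int, str]:
        return self.version, self.value

class MonotonicClient:
    """Never return a value older than one this client has already seen."""
    def __init__(self, replicas: list[Replica]) -> None:
        self.replicas = replicas
        self.last_seen_version = 0

    def read(self, key: str) -> str:
        for replica in self.replicas:
            version, value = replica.get(key)
            if version >= self.last_seen_version:   # skip lagging replicas
                self.last_seen_version = version
                return value
        raise RuntimeError("every replica is behind the last value we saw")

stale, fresh = Replica(3, "pending"), Replica(7, "shipped")
client = MonotonicClient([stale, fresh])
client.last_seen_version = 5        # we already read version 5 elsewhere
print(client.read("order:42"))      # skips the stale replica -> "shipped"
```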
# ⚡ Key Takeaways
- CAP: partitions are unavoidable — choose CP or AP based on your requirements
- Consensus (Raft/Paxos) lets nodes agree despite failures — basis for leader election
- Message queues decouple services and buffer traffic — essential at scale
- Eventual consistency is a spectrum — tune it (ONE, QUORUM, ALL)
- Design for failure: every network call can fail, every node can crash