
System Design Fundamentals

Load balancing, caching, CDNs, and the building blocks of scalable systems

# Why System Design?

As a Senior Engineer, you're expected to design systems, not just write code. System design is about making trade-offs — there's no perfect solution, only the best fit for your constraints.

The key question: How do you build a system that serves millions of users, stays available when things fail, and evolves as requirements change?

# Load Balancing

A load balancer sits between clients and servers, distributing traffic to prevent any single server from becoming a bottleneck. It also runs periodic health checks, removing unresponsive servers from rotation until they recover.

Client Requests
        │
        ▼
┌───────────────┐
│ Load Balancer │
└───────┬───────┘
    ┌───┴───┬───────┐
    ▼       ▼       ▼
  Server  Server  Server
    A       B       C

Key Strategies

  • Round Robin — rotate through servers sequentially
  • Least Connections — route to the server with fewest active requests
  • IP Hash — same client IP always hits the same server (sticky sessions)
  • Weighted — more powerful servers get more traffic
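Round Robin is simple enough to sketch in a few lines of Go. Here's a minimal picker using an atomic counter (the server names are placeholders):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// roundRobin hands out backends in a fixed rotation.
type roundRobin struct {
	backends []string
	next     uint64
}

// pick returns the next backend; the atomic counter makes it
// safe to call from many goroutines handling requests concurrently.
func (r *roundRobin) pick() string {
	n := atomic.AddUint64(&r.next, 1)
	return r.backends[(n-1)%uint64(len(r.backends))]
}

func main() {
	lb := &roundRobin{backends: []string{"server-a", "server-b", "server-c"}}
	for i := 0; i < 4; i++ {
		fmt.Println(lb.pick()) // a, b, c, then wraps back to a
	}
}
```

A real load balancer would layer health checks on top of this, skipping backends that fail them.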

# Caching

Caching stores frequently accessed data in a faster layer (memory) to avoid expensive recomputation or database queries.

Cache Layers

Browser Cache → CDN Edge → App Cache (Redis) → Database
   (~0ms)        (~5ms)        (~1ms)       (~10-100ms)

Cache Patterns

  • Cache-Aside — App checks cache first, fills on miss (most common)
  • Write-Through — Writes go to cache AND database together
  • Write-Behind — Writes go to cache only, flushed to the DB asynchronously (risks data loss if the cache fails before the flush)
  • Read-Through — Cache itself fetches from DB on miss
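Cache-aside is the pattern you'll most often write by hand. A minimal Go sketch, with an in-memory map standing in for Redis and a hypothetical loadFromDB callback standing in for the database query:

```go
package main

import (
	"fmt"
	"sync"
)

// Cache implements the cache-aside pattern: check the cache,
// fall back to the database on a miss, then fill the cache.
type Cache struct {
	mu   sync.Mutex
	data map[string]string
}

func (c *Cache) GetOrLoad(key string, loadFromDB func(string) string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.data[key]; ok {
		return v // cache hit: no database round trip
	}
	v := loadFromDB(key) // cache miss: go to the database
	c.data[key] = v      // fill the cache for next time
	return v
}

func main() {
	c := &Cache{data: map[string]string{}}
	db := func(k string) string { return "value-for-" + k }
	fmt.Println(c.GetOrLoad("user:42", db)) // miss → hits the "database"
	fmt.Println(c.GetOrLoad("user:42", db)) // hit → served from cache
}
```

A production version would add a TTL so entries expire, and invalidate the key on writes so the cache doesn't serve stale data.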

# CDN (Content Delivery Network)

CDNs distribute static content to edge servers worldwide. A user in Tokyo gets served from an Asian edge node instead of a US-based origin server — cutting latency from ~150ms to ~5ms.

Push vs Pull CDN

  • Pull — CDN fetches from origin on first request, then caches. Simple, but first request is slow.
  • Push — You upload content to CDN proactively. Better for content you know will be popular.

# Horizontal vs Vertical Scaling

  • Vertical (Scale Up) — Add more CPU/RAM to one machine. Simple but has a ceiling and a single point of failure.
  • Horizontal (Scale Out) — Add more machines. No ceiling, better fault tolerance, but adds complexity (load balancing, data consistency).

Most real-world systems use both: vertically scale each machine to a reasonable size, then horizontally scale the fleet.

⚡ Key Takeaways

  • System design is about trade-offs, not perfect solutions
  • Load balancers distribute traffic; health checks remove unhealthy nodes
  • Multi-layer caching (browser → CDN → app → DB) dramatically reduces latency
  • CDNs bring content closer to users geographically
  • Prefer horizontal scaling for large systems — it removes single points of failure
  • Always think about: latency, throughput, availability, consistency