1: Scale from Zero to Millions of Users
Load Balancer
A load balancer is a system that distributes incoming network or application traffic across multiple backend servers. Its primary purpose is to improve availability, reliability, and performance by preventing any single server from becoming a bottleneck or point of failure.
Core Components
- Frontend System
- Backend Pool
- Health Checks
- Routing Logic
Routing Logic
- Round Robin
- Least Connections
- Weighted Distribution
- IP Hash
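As a sketch, the routing strategies above can be expressed in a few lines of Python (the server addresses and connection-count bookkeeping are illustrative assumptions, not part of these notes):

```python
import itertools

# Hypothetical backend pool; addresses are illustrative only.
servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

# Round Robin: cycle through servers in order.
rr = itertools.cycle(servers)

def round_robin():
    return next(rr)

# Least Connections: pick the server with the fewest active connections.
active = {s: 0 for s in servers}

def least_connections():
    target = min(active, key=active.get)
    active[target] += 1  # caller would decrement when the connection closes
    return target

# IP Hash: the same client IP always maps to the same server,
# which gives a crude form of session stickiness.
def ip_hash(client_ip):
    return servers[hash(client_ip) % len(servers)]
```

Weighted distribution is usually a variation of round robin where each server appears in the rotation in proportion to its assigned weight.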
“Any server should be able to handle any request independently”
Stateless vs Stateful Sessions
Why is Stateless Backend Preferred?
- Horizontal Scalability
- Simpler Load Balancing
- Fault Tolerance
- Easier Deployments
What does Stateless mean?
- Server does not store client-specific state between requests
- State is instead stored in shared systems:
- Database
- Cache
- Object Storage
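A minimal sketch of the stateless pattern, using a plain dict as a stand-in for a shared store such as Redis (all names here are illustrative):

```python
# Session state lives in a shared store, not on any one server,
# so any server can serve any request.
shared_store = {}  # stand-in for Redis, a database, etc.

def handle_request(server_id, session_id, data=None):
    """Any server can handle the request because state is external."""
    if data is not None:
        shared_store[session_id] = data             # write state externally
    return server_id, shared_store.get(session_id)  # no local state used

# Request 1 lands on server A, request 2 on server B -- same session works.
handle_request("server-A", "sess-42", {"user": "alice"})
_, state = handle_request("server-B", "sess-42")
```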
When are Stateful Backends Used?
- Real-time Systems
- WebSockets
- Multiplayer Games
- Live Collaboration Tools
- High Performance Caching
- Reduces Latency
- Avoids repeated database hits
- Legacy Systems
- Specialized workloads
- Long-running processes
- In-memory computation engines
Subtle Tradeoff
Stateless systems shift complexity rather than eliminate it. Instead of:
- Managing in-memory sessions
You now manage:
- Distributed consistency
- Cache invalidation
- Network latency to state stores
So the system becomes:
- Operationally simpler at the server level
- Architecturally more distributed overall
Use stateless servers unless you have a clear reason not to.
Database Replication
- Usually uses a master/slave relationship between the original (master) and the copies (slaves).
- A master database generally only supports write operations. A slave database gets the copies of the data from the master database and only supports read operations.
- Most applications require a much higher ratio of reads to writes; thus, the number of slave databases is usually larger than the number of master databases.
- Advantages
- Better performance: It allows more queries to be processed in parallel
- Reliability: If one of the database servers is destroyed, data is still preserved
- High Availability: Even if a database goes offline, you can access data stored in another database server
- What if a database goes offline?
- If only one slave database is available and it goes offline, read operations will be directed to the master database temporarily. As soon as the issue is found, a new slave will replace the old one. If multiple slaves are available, the read operations are redirected to a healthy slave.
- If the master goes offline, a slave will be promoted to be the new master and a new slave will replace the old one for data replication immediately.
- In production, promoting a new master is more complicated as the data in a slave database might not be up to date. The missing data needs to be filled in by running recovery scripts. Other replication methods such as multi-master and circular replication could help.
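The read/write split and failover behavior above can be sketched in application-level routing logic (the database names and health-tracking mechanism are illustrative assumptions):

```python
import random

master = "db-master"
slaves = ["db-slave-1", "db-slave-2"]

def route(query, healthy):
    """Send writes to the master; send reads to a healthy slave,
    falling back to the master if no slave is available."""
    is_write = query.strip().upper().startswith(("INSERT", "UPDATE", "DELETE"))
    if is_write:
        return master
    candidates = [s for s in slaves if s in healthy]
    return random.choice(candidates) if candidates else master

# All slaves down: reads are directed to the master temporarily.
route("SELECT * FROM users", healthy=set())  # -> "db-master"
```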
Why Slaves Have Stale Data
Replication is asynchronous — the master sends write operations to slaves after locally committing the write. If the master crashes before replication completes, promoted slaves lose those updates.
- Master writes at Time 0 → crashes at Time 0.3 → slave still has old data
- Recovery scripts read the master’s binary log to find missing operations and replay them on the slave before promotion
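The crash window can be made concrete with a toy simulation of asynchronous replication (the log lists are illustrative stand-ins for the master's binary log and the slave's relay log):

```python
# Asynchronous replication: the master commits locally first, then ships
# the write to the slave. A crash in between loses the update on the slave.
master_log = []
slave_log = []

def write(value, crash_before_replication=False):
    master_log.append(value)   # commit locally (Time 0)
    if crash_before_replication:
        return                 # master crashes before shipping the write
    slave_log.append(value)    # replication completes shortly after

write("balance=100")
write("balance=200", crash_before_replication=True)
# slave_log still holds only the first write, so a promoted slave is stale
# until a recovery script replays the missing entries from master_log.
```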
Multi-Master Replication
- Multiple masters accept both reads and writes
- Changes on any master replicate to all others
- If one master fails, others continue accepting writes — no promotion needed
- Tradeoff: Write conflicts when the same data is modified on multiple masters simultaneously
- Resource: https://en.wikipedia.org/wiki/Multi-master_replication
- Example:
- 3 masters: A, B, C
- Normal: A writes `balance=100`, B writes `name="Alice"` → all sync
- A crashes: B and C keep running, app routes writes to them
- A recovers: syncs missed writes from B or C
- Conflict: if A writes `balance=200` and B writes `balance=150` at the same time, which wins?
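One common (though lossy) answer to "which wins?" is last-write-wins, where each write carries a timestamp and the latest one survives. A minimal sketch, assuming synchronized clocks (a real assumption conflict-resolution systems have to defend):

```python
# Last-write-wins conflict resolution: each write is (timestamp, value);
# the write with the highest timestamp wins, and the other is discarded.
def resolve(writes):
    """writes: list of (timestamp, value) pairs from different masters."""
    return max(writes, key=lambda w: w[0])[1]

# A writes balance=200 at t=100; B writes balance=150 at t=101.
winner = resolve([(100, 200), (101, 150)])  # -> 150, because B wrote last
```

Note that last-write-wins silently drops A's update; other schemes (e.g. application-level merge) trade simplicity for safety.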
Circular Replication
- Ring topology: A → B → C → A
- Each node is both master and slave
- If one node fails, the ring reconfigures to bypass it
- Tradeoff: Latency compounds around the ring; propagation delays mean data may be inconsistent
- Example:
- Ring A → B → C → A
- A receives `INSERT users(1,"Alice")` → syncs to B → syncs to C
- B crashes: ring reconfigures to A → C → A
- Propagation delay: if each hop = 5ms, a write on A takes ~10ms to reach C
- In a 10-node ring, the last node sees writes ~45ms after they committed on the first node
- Resource: https://dev.mysql.com/doc/refman/5.7/en/mysql-cluster-replication-multi-master.html
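The delay numbers above follow from a simple formula: the node farthest from the writer is (n − 1) hops away, so with a fixed per-hop latency (5 ms here, an assumed constant):

```python
# Worst-case propagation delay in a circular replication ring:
# the last node is (nodes - 1) hops from the writing node.
def last_node_delay_ms(nodes, hop_ms=5):
    return (nodes - 1) * hop_ms

last_node_delay_ms(3)   # A -> B -> C: 10 ms for C to see A's write
last_node_delay_ms(10)  # 10-node ring: 45 ms for the last node
```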
Cache
- A cache is a temporary storage area that stores the result of expensive responses or frequently accessed data in memory so that subsequent requests are served more quickly.
- Resource: https://codeahoy.com/2017/08/11/caching-strategies-and-how-to-choose-the-right-one/
- Read-Through Cache: First check whether the cache has the response available; if yes, return it. If not, query the database, store the response in the cache, and send it back to the client.
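The read path described above (often called cache-aside when the application itself does the lookup) is a few lines of code; the dict cache and the query function below are illustrative stand-ins:

```python
cache = {}

def slow_db_query(key):
    return f"value-for-{key}"   # stand-in for an expensive database read

def get(key):
    if key in cache:            # cache hit: skip the database entirely
        return cache[key]
    value = slow_db_query(key)  # cache miss: query the database
    cache[key] = value          # store the response for later requests
    return value
```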
- Considerations
- Decide when to use cache: When data is read frequently but modified infrequently. Not ideal for persisting data
- Expiration Policy: Not too short otherwise the system will have to reload data from db too frequently. Not too long otherwise the data will become stale.
- Consistency: Involves keeping the data store and the cache in sync. Resource: R. Nishtala et al., "Scaling Memcache at Facebook," 10th USENIX Symposium on Networked Systems Design and Implementation (NSDI '13)
- Mitigating failures: A single cache server represents a potential single point of failure.
- Eviction Policy: Once the cache is full, any request to add items might cause existing items to be removed. LRU, LFU, FIFO, etc. can be adopted to satisfy different use cases
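LRU is the most common of these policies; a minimal sketch using Python's `OrderedDict` (capacity and keys here are illustrative):

```python
from collections import OrderedDict

# LRU eviction: when the cache is full, remove the least recently
# used entry to make room for the new one.
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used

c = LRUCache(2)
c.put("a", 1); c.put("b", 2)
c.get("a")     # "a" is now most recently used
c.put("c", 3)  # evicts "b", the least recently used entry
```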
Content Delivery Network (CDN)
- A CDN is a network of geographically dispersed servers used to deliver static content. CDN servers cache static content like images, videos, CSS, etc.
- Dynamic Content Caching enables the caching of HTML pages based on request path, query strings, cookies, and request headers. Resource: https://aws.amazon.com/cloudfront/dynamic-content/
- Considerations
- Cost: We are charged for data transfers in and out of the CDN. Caching infrequently used assets provides no significant benefits.
- Setting an apt cache expiry: Similar logic to that of cache expiry
- CDN fallback: We should consider how the application copes with CDN failure. Clients should be able to detect a problem and request resources from the origin.
- Invalidating files
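The CDN fallback consideration can be sketched as try-the-CDN-first logic in the client; the hosts and the fetch function below are illustrative stand-ins for real HTTP calls:

```python
CDN_URL = "https://cdn.example.com"       # illustrative hosts
ORIGIN_URL = "https://origin.example.com"

def fetch(base, path):
    """Stand-in for an HTTP GET; raises on failure. Here the CDN is
    hard-coded to fail so the fallback path is exercised."""
    if base == CDN_URL:
        raise ConnectionError("CDN unreachable")
    return f"contents of {base}{path}"

def get_asset(path):
    try:
        return fetch(CDN_URL, path)     # normal path: serve from the CDN
    except ConnectionError:
        return fetch(ORIGIN_URL, path)  # fallback: request from the origin

get_asset("/logo.png")  # -> "contents of https://origin.example.com/logo.png"
```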