How the canonical System Design Interview Vol. II architecture maps to a production-ready service — and where the textbook stops and the real engineering begins.
"A leaderboard is just a sorted set — until 50,000 players update their scores simultaneously. Then it's a distributed-systems problem."
The gap between a whiteboard diagram and a working system
You've just been handed the classic system design prompt: "Design a real-time leaderboard for a mobile game with millions of players." Every engineer who's done interview prep has seen this one. The textbook answer involves Redis sorted sets. You sketch the architecture, explain ZADD, mention ZREVRANGEWITHSCORES, describe the data model, and walk out feeling confident. Forty-five minutes well spent.
Then you sit down and actually build it. Suddenly the question reveals new layers. What happens when 50,000 players score a point in the same second? What if Redis restarts and the sorted set evaporates? How do you serve the neighborhood query — "show me the five players above and below rank #42,000" — efficiently at scale? The gap between the diagram you sketched and a service that handles production load is where real system design lives.
This post is a walkthrough of the Real-Time Gaming Leaderboard project I built in Go — a direct implementation of the architecture described in System Design Interview Vol. II. It addresses three hard problems simultaneously: write throughput at scale, sub-millisecond read latency, and durable event history. The decisions made along the way — what to batch, what to pipeline, how to recover from failures — are the substance of this article.
Alex Xu's chapter on leaderboards gives you the right foundation. Redis sorted sets are the correct data structure for this problem. ZADD is O(log N). ZREVRANGEWITHSCORES returns the top-N players in order, atomically. ZREVRANK returns a specific player's rank in O(log N) time. For a leaderboard that needs real-time rankings, this combination is compelling — it keeps the data structure sorted natively, without application-side sorting, and every operation runs in sub-millisecond latency under typical conditions.
But the chapter leaves a critical question unanswered: what is your write target, and how do you get data into it at scale? If every score increment writes directly to Redis and only to Redis, you've made Redis your source of truth. That's a fragile arrangement. Redis is an in-memory store. It can be configured with persistence, but it's designed and optimised as a cache — not a database. When it restarts, when the instance fails, when you need historical data for monthly resets, auditing, or analytics, you have nothing to fall back on.
The answer is a dual-store architecture: Redis for live rankings, MySQL for durable event history. Every score increment persists to MySQL and updates Redis. On Redis restart, the service can replay MySQL events and rebuild the sorted set. This is the architecture this implementation uses — and the extra complexity it introduces is where the interesting design work begins.
The naive dual-store approach has a fatal flaw. If every score increment triggers a direct MySQL write, a game with 50,000 concurrent players generating one score per second produces 50,000 inserts per second. MySQL is not designed for that. At that rate, the write queue backs up, replication lag grows, insert latency climbs, and the entire service degrades — not just the database tier, but every caller waiting for acknowledgment.
A typical game session isn't a uniform stream of events. Players score in bursts — kill streaks, boss encounters, time-limited challenges. Peak write load can easily be 10× or 20× average. Designing for average load leaves you exposed precisely at the moments that matter most: the events that generate spikes in traffic are usually the high-stakes moments players care about most. That's a bad time for your database to saturate.
The answer is an in-memory event batcher that sits between the HTTP handler and the database. Incoming score events are acknowledged immediately by the service and placed in a buffer. A background goroutine flushes the buffer to MySQL in bulk every 100ms, or when the buffer reaches 500 events — whichever comes first. At peak load, this reduces MySQL write operations from thousands of individual inserts per second to roughly 10 bulk inserts per second. That's a 500× reduction in database write operations — achieved without any change to the API contract or the read path.
While the batcher defers MySQL writes, Redis is updated synchronously on every score event via ZINCRBY. The sorted set always reflects current rankings — a player who scores sees their rank update immediately on the next leaderboard read. The batcher is a write-buffering layer for durability, not for ranking accuracy. Players see real-time standings; MySQL catches up in the background at a pace it can sustain.
Any service that holds in-memory buffers has a contract to honour on shutdown: drain the buffer before exit, or lose data. This implementation signals the batcher goroutine to flush remaining events and waits for confirmation before the process terminates. Skipping this step means every planned restart silently discards buffered events — a quiet, hard-to-detect data loss scenario that only surfaces when historical rankings drift from Redis state.
With both stores and the batcher in place, the complete data flow looks like this. On a score increment (POST /v1/scores/:username): the handler validates the X-Internal-Token header, increments the player's score in Redis via ZINCRBY on the current month's sorted set key, and hands the event to the batcher. The HTTP response returns immediately with a 200. The batcher accumulates events in memory and flushes them to MySQL as a bulk insert every 100ms. If MySQL is temporarily unavailable, the handler returns 500 to the caller — the Redis update is rolled back, maintaining consistency between the two stores.
On a leaderboard read (GET /v1/scores): the handler calls ZREVRANGEWITHSCORES on the current month's key and returns the top-N entries. No MySQL is touched. The read path is a single Redis operation completing in under a millisecond at typical scale. The ?limit query parameter controls N; the default is 10.
The neighborhood query (GET /v1/scores/:username) is the most interesting endpoint. It first calls ZREVRANK to get the target player's rank, then pipelines a ZREVRANGEWITHSCORES call for a range centered on that rank. Two Redis operations, pipelined into a single round trip, return the player's score alongside the players immediately above and below them in the standings. This is the feature that makes a leaderboard feel personal rather than abstract — and Redis sorted sets support it natively.
The startup recovery mechanism deserves careful attention. On boot, the service checks whether a recovery marker exists in Redis for the current month's sorted set key. If no marker is found, the service replays all MySQL score events for the current month, rebuilds the sorted set, then writes the marker. Subsequent restarts find the marker and skip the replay. This is elegant and cheap when it works correctly.
The subtle case is a multi-instance deployment. Two service instances starting simultaneously both detect the absent marker. Both begin replaying MySQL events in parallel. The result is duplicate increments applied to the sorted set — scores double, rankings corrupt. This is a classic time-of-check/time-of-use (TOCTOU) race.
The fix is to replace the check-then-set recovery marker with a Redis SET NX (set-if-not-exists) operation — a single atomic command that guarantees exactly one instance wins the race and proceeds with the replay. The losing instance detects the atomic failure, waits briefly, and reads the sorted set that the winner rebuilt. One replica runs recovery; the rest benefit from it. This is a decision that only reveals its necessity under concurrent load — another example of the gap between a clean whiteboard design and a production-hardened system.
A service that passes unit tests but falls over under real traffic is incomplete work. The implementation includes a full k6 load testing suite with three distinct scenarios. A spike test issues a sudden burst to peak request volume and measures how the service responds to instant overload. A write stress test sustains high-throughput score increments over several minutes, characterizing steady-state batcher behaviour and MySQL write rates. A read throughput test sustains leaderboard queries at scale, measuring Redis latency distribution under concurrency.
Each scenario probes a different failure mode. The spike test reveals whether the batcher's buffer can absorb burst load without dropping events. The write stress test surfaces any MySQL saturation or replication lag under sustained pressure. The read throughput test verifies that Redis sorted set operations remain in the sub-millisecond range as concurrency climbs. Running all three isn't redundant — it's triangulation.
Results stream to InfluxDB and render in a provisioned Grafana dashboard. The test runner saves both HTML and JSON reports after each run, creating a performance audit trail across iterations. When write latency trends upward across three consecutive releases, that trend is visible on the dashboard — not discovered by a player filing a bug report. The observability investment is modest; the value compounds with every release.
The current implementation runs as a single stateless Go service with one Redis instance and one MySQL instance. That architecture is sufficient for a mid-scale game and is straightforward to reason about. But the design intentionally leaves scaling levers available. Because the service is stateless — all shared state lives in Redis and MySQL — horizontal scaling requires only a load balancer in front of multiple service instances, plus the multi-instance recovery race fix described above.
Redis Cluster can shard sorted sets across nodes for very high cardinality leaderboards. Monthly MySQL range partitioning already isolates historical data into bounded slices; read replicas can serve analytics queries without touching the write primary. The Kubernetes deployment path is outlined in the repository's scalability roadmap for teams ready to move beyond Docker Compose.
The one genuine scaling constraint is worth naming directly: Redis sorted sets are single-node per key. If a monthly leaderboard for a single game grows beyond what one Redis node can serve at the required throughput, you need a different partitioning strategy — regional leaderboards, segmented sorted sets with a fan-out merge on read, or a purpose-built leaderboard database. That's a problem at very large scale; it's also the point where the architecture needs a new chapter entirely.
This project is a working implementation of a non-trivial distributed system. It demonstrates that the distance between a whiteboard diagram and production-ready code is filled with decisions — about batching windows, recovery atomicity, pipeline optimisation, and observability — that only become visible when you build the thing. System Design Interview Vol. II gives you the architecture. Building it gives you the education. The repository is open; the decisions are documented; the load tests are runnable. Start there.