DevLog #8: Scaling a Browser MMO from 10 to 10,000 Players

Most browser games die under load. Their architecture was never designed for concurrency. VvW was built async-first, with Redis at every hot path and a horizontal scaling plan from day one. Here's how it works.

The Tech Stack Choice

VvW runs on FastAPI (Python async web framework) with SQLAlchemy async ORM connecting to PostgreSQL in production (SQLite in development). Redis handles sessions, cooldowns, rate limiting, and short-lived caches. APScheduler manages background jobs (season resets, daily quest generation, leaderboard snapshots).

Why FastAPI over Django or Flask? Pure async from the ground up. A single FastAPI worker can handle hundreds of concurrent requests without blocking. For a game where every player action is an API call, this matters enormously.

Why Async Matters for MMOs

In a synchronous web framework, a request that does 3 database queries takes ~15ms per query = ~45ms total, and blocks a worker thread the entire time. In an async framework, those 3 queries happen concurrently — total time drops to ~15ms, and the worker is freed to handle other requests while waiting for I/O.

For a game with 1,000 concurrent players each making 1 request per second, this difference is the gap between a server that handles the load and one that queues indefinitely. VvW uses async def on every endpoint and await on every database call.

Redis Caching Strategy

Redis sits between FastAPI and PostgreSQL for every hot-path read:

Cache Type	TTL	What's Cached
Static JSON (lru_cache)	Forever (process life)	items.json, monsters.json, locations.json
Leaderboards	5 minutes	Top 20 per category
Character stats	60 seconds	Derived stat calculations
Clan info	5 minutes	Clan name, level, member count
World Boss HP	10 seconds	Current HP (hot during raids)
Cooldowns	Per action (1h–24h)	dungeon/npc/mission last timestamps

The most important cached object is static JSON. Loading items.json (921 items, ~450KB) from disk on every item request would be catastrophic. Python's @lru_cache loads it once at process start and serves it from memory for the process lifetime.

Database Optimization

Every foreign key column has an index. Every commonly-queried column has an index. The most-hit query pattern — "get character by ID, then get their inventory" — hits indexes on both tables and returns in under 2ms even with 100k rows.

We use SQLAlchemy's selectinload for relationship loading instead of lazy loading. Lazy loading in a async context causes the N+1 query problem. selectinload fires two queries total instead of N+1.

KEY OPTIMIZATION

The single biggest performance improvement was adding a composite index on (character_id, item_id) for inventory queries. Inventory reads dropped from 12ms to 0.8ms after this index was added.

Horizontal Scaling Plan

The path from 1 server to 10 servers is already designed:

Phase 1 (current): Single Hetzner VPS, nginx as reverse proxy, 4 FastAPI workers via Uvicorn
Phase 2 (1k players): Add read replica PostgreSQL. Route all GET requests to replica, only writes to primary
Phase 3 (5k players): Separate Redis to dedicated instance. Add 2nd app server behind load balancer. Sticky sessions via Redis (not in-memory)
Phase 4 (10k+ players): PostgreSQL connection pooling via PgBouncer. CDN for all static assets including the 1,426 programmatic pages. WebSocket server separation

The key architectural decision that makes this possible: no local state in FastAPI workers. Every piece of state lives in PostgreSQL or Redis. This means any worker can handle any request — perfect for horizontal scaling.

Load Test Results

We simulated 500 concurrent users using Locust, each performing a realistic session: login → view dashboard → 5 hunts → check inventory → claim daily mission → logout.

Metric	Result	Target
Requests/second	847 req/s	500 req/s
p50 response time	28ms	<100ms
p95 response time	112ms	<300ms
p99 response time	289ms	<500ms
Error rate	0.02%	<0.1%
DB connection pool exhaustion	0 events	0 events

The bottleneck at 500 concurrent users was the PostgreSQL connection pool limit (20 connections). We increased it to 50 and retested — p95 dropped to 87ms. Phase 2 scaling with PgBouncer will address this permanently.

CDN for Programmatic Pages

Our 1,426 programmatic SEO pages (items, monsters, locations) are ideal CDN candidates — they're static HTML generated from JSON data. On a CDN, each page serves from a regional edge node in under 20ms globally. We serve these from nginx-cached static files in Phase 1 and will move to a CDN in Phase 3.

Built to Scale With You

VvW's architecture was designed for the long term. Play today and experience a game built to handle whatever growth comes.

Join the Beta →

DevLog #8: Scaling a Browser MMO from 10 to 10,000 Players

The Tech Stack Choice

Why Async Matters for MMOs

Redis Caching Strategy

Database Optimization

Horizontal Scaling Plan

Load Test Results

CDN for Programmatic Pages

Built to Scale With You

DevLog #6: Account Security & Data Protection

DevLog #7: Art & UI Design for a Text RPG

DevLog #5: Fair Monetization Philosophy

Best Browser RPGs 2026: Our Top Picks