We Scaled Postgres to 10M Rows/Day. Here's Our Playbook.
Before reaching for a new database, we exhausted what Postgres could do. Partitioning, connection pooling, and smart indexing got us further than most people think possible.
When our write volume hit 5M rows/day, the team started talking about migrating to a dedicated time-series database. Instead, we spent two weeks optimizing our Postgres setup, and those optimizations carried us to 10M rows/day.
Partitioning
We partitioned our events table by month using declarative partitioning. Queries that used to scan billions of rows now only touch the relevant partition. Our p99 query latency dropped from 2.3s to 180ms.
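The setup looks roughly like this. A sketch, not our exact schema: the table and column names (events, created_at, payload) are illustrative.

```sql
-- Parent table: declares the partitioning scheme but holds no rows itself.
CREATE TABLE events (
    id         bigint GENERATED ALWAYS AS IDENTITY,
    created_at timestamptz NOT NULL,
    payload    jsonb
) PARTITION BY RANGE (created_at);

-- One partition per month. The planner prunes to the partitions a
-- query's WHERE clause on created_at can actually touch.
CREATE TABLE events_2024_01 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE events_2024_02 PARTITION OF events
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
```

A query like SELECT ... FROM events WHERE created_at >= '2024-02-03' never reads the January partition, which is where the latency win comes from. New partitions need to be created ahead of each month, either by a cron job or an extension like pg_partman.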
Connection Pooling with PgBouncer
We were running out of connections with 200 application instances. PgBouncer in transaction mode reduced our active connections from 800 to 40, and our connection-related errors dropped to zero.
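A minimal pgbouncer.ini sketch of that setup; the database name, host, and pool sizes here are placeholders, not our production values.

```ini
[databases]
; assumption: database name and host are illustrative
events_db = host=127.0.0.1 port=5432 dbname=events_db

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
pool_mode = transaction   ; server connection is released after each transaction
max_client_conn = 1000    ; app-side connections PgBouncer will accept
default_pool_size = 40    ; actual Postgres connections per database/user pair
```

One caveat with transaction mode: session-level state (SET commands, advisory locks, session-scoped prepared statements) doesn't survive across transactions, so the application has to avoid relying on it.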
Partial Indexes
Instead of indexing every row in a table, we created partial indexes covering only the rows our queries actually touch. CREATE INDEX idx_active_users ON users (email) WHERE deleted_at IS NULL cut our index storage by 60%.
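The same index in context, with the kind of query it serves. The users schema here is a minimal sketch; the key point is that the planner only considers a partial index when the query's WHERE clause implies the index predicate.

```sql
CREATE TABLE users (
    id         bigint PRIMARY KEY,
    email      text NOT NULL,
    deleted_at timestamptz   -- NULL means the account is active
);

-- Index only the active (non-deleted) rows.
CREATE INDEX idx_active_users ON users (email) WHERE deleted_at IS NULL;

-- Can use idx_active_users, because the predicate matches:
SELECT id FROM users
WHERE email = 'a@example.com' AND deleted_at IS NULL;
```

If a query omits the deleted_at IS NULL condition, Postgres falls back to a sequential scan or another index, so partial indexes only pay off when your queries consistently include the predicate.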
BRIN Indexes for Time-Series Data
For our append-only event tables, BRIN indexes are a perfect fit. They're tiny (often 1000x smaller than an equivalent B-tree) and nearly as fast for range queries on naturally ordered data.
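Creating one is a single statement. This sketch assumes the events table from earlier; the point is that BRIN stores only min/max summaries per block range instead of one entry per row, which is why it stays so small.

```sql
-- BRIN works because rows arrive in roughly created_at order, so each
-- block range covers a narrow slice of timestamps.
CREATE INDEX idx_events_created_brin ON events USING brin (created_at);

-- Range scans like this can skip entire block ranges whose min/max
-- summary falls outside the interval:
SELECT count(*) FROM events
WHERE created_at >= now() - interval '1 hour';
```

The fit depends on physical ordering: if rows are updated or inserted out of timestamp order, the min/max ranges widen and the index degrades toward a full scan.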
The Lesson
Postgres can handle more than you think. The urge to adopt a new database is often a symptom of not understanding the one you already have.