Scaling Postgres at Cloudflare

As a long-time Postgres user, I’ve always admired teams that operate this robust open-source database at scale. Recently, I was introduced to a small team at Cloudflare that runs a Postgres service at enormous scale. I learned more about their work in a conversation with Vignesh Ravichandran, Engineering Manager at Cloudflare, who shared some eye-opening metrics. Vignesh and his team manage a relational database service for a company that processes 45 million HTTP requests per second. Furthermore, among all websites that use a CDN, 76% rely on Cloudflare.

One thing that stood out to me was the team’s significant contributions to Postgres-related software. They have made substantial modifications to PgBouncer, a lightweight connection pooler for Postgres that reduces the overhead of creating new connections by reusing existing ones. They’ve also worked extensively on Stolon, an open-source, cloud-native high-availability manager for Postgres that keeps a database accessible even during system failures. All of this comes from a lean, globally dispersed team of just five, whose responsibilities range from building and supporting systems to guiding best practices for database usage.
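To make the pooling idea concrete, here is a minimal sketch of connection reuse. It is not how PgBouncer is implemented (PgBouncer is a separate process that sits between clients and Postgres); `FakeConnection` and `ConnectionPool` are hypothetical names used only for illustration, and no real database is contacted.

```python
import queue
from contextlib import contextmanager


class FakeConnection:
    """Hypothetical stand-in for a database connection (no network I/O)."""

    opened = 0  # class-wide count of connections ever created

    def __init__(self):
        FakeConnection.opened += 1

    def execute(self, sql):
        return f"ran: {sql}"


class ConnectionPool:
    """Opens a fixed number of connections up front and hands them out."""

    def __init__(self, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(FakeConnection())

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks while every connection is in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return the connection instead of closing it


pool = ConnectionPool(size=2)
results = []
for i in range(10):
    with pool.connection() as conn:
        results.append(conn.execute(f"SELECT {i}"))

print(FakeConnection.opened)  # ten queries served by two connections
```

The point of the sketch is the ratio: the cost of opening a connection is paid twice, while ten queries are served.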

The rise of vector search is being driven by the growing popularity of Generative AI and LLMs, which need efficient ways to store and search large amounts of unstructured data. Postgres is a proven, powerful, and versatile foundation for vector search, offering AI teams scalability, security, and flexibility. pgvector is an open-source Postgres extension that adds vector similarity search, making it a convenient option for AI teams already using Postgres. It is scalable, well-documented, and already being used in production. As a testament to the vibrancy of the Postgres ecosystem, there is another option for vector search: pg_embedding, an extension that enables the Hierarchical Navigable Small World (HNSW) algorithm for vector similarity search in Postgres.
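For readers new to the idea, vector similarity search amounts to ranking stored vectors by their distance to a query vector. The sketch below does this by brute force with cosine distance in plain Python; it illustrates the concept only and is not code from pgvector or pg_embedding. The document labels and vectors are made-up sample data.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query, items, k):
    """Exact search: score every stored vector, then keep the k closest."""
    ranked = sorted(items, key=lambda item: cosine_distance(query, item[1]))
    return [label for label, _ in ranked[:k]]


# Hypothetical embeddings; real ones usually have hundreds of dimensions.
documents = [
    ("doc_a", [1.0, 0.0, 0.0]),
    ("doc_b", [0.9, 0.1, 0.0]),
    ("doc_c", [0.0, 1.0, 0.0]),
    ("doc_d", [0.0, 0.0, 1.0]),
]

top = nearest([1.0, 0.02, 0.0], documents, k=2)
print(top)
```

Scanning every vector like this is exact but grows linearly with the number of rows. An HNSW index trades a small amount of accuracy for speed by searching a layered graph of neighbors instead of the full set.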

Vignesh and his team continue to make their work available to the wider community. For a more detailed look into their operations, I recommend exploring their blog post and their website, which hosts key resources related to their work.
