⚠️ This post links to an external website. ⚠️
I feel like the tech world lives in two camps.
- One camp chases buzzwords.
This camp tends to adopt whatever’s popular without thinking hard about whether it’s appropriate. They tend to fall for all the purported benefits the sales pitch gives them - real-time, infinite scale, cutting-edge, cloud-native, serverless, zero-trust, AI-powered, etc.
You see this everywhere in the Kafka world: Streaming Lakehouse™️, Kappa™️ Architecture, Streaming AI Agents1.
This phenomenon is sometimes known as resume-driven design. It can also be a scale cargo-cult. Modern practices actively encourage both. Consultants push “innovative architectures” stuffed with vendor tech via “insight” reports2. System design interviews expect you to design Google-scale architectures that are inevitably at a scale 100x higher than the company you’re interviewing for would ever need. Career progression rewards you for replatforming to the Hot New Stack™️, not for being resourceful.
- The other camp chases common sense.
This camp is far more pragmatic. They strip away unnecessary complexity and steer clear of overengineered solutions. They reason from first principles before making technology choices. They resist marketing hype and approach vendor claims with healthy skepticism.
Historically, it has felt like Camp 1 definitively held the upper hand in sheer numbers and noise. Today, it feels like the pendulum may be beginning to swing back, at least a tiny bit. Two recent trends are on the side of Camp 2:
Trend 1 - the “Small Data” movement. People are realizing two things - their data isn’t that big and their computers are becoming big too. You can rent a 128-core, 4 TB of RAM instance from AWS. AMD just released 192-core CPUs this summer. That ought to be enough for anybody.3
Trend 2 - the Postgres Renaissance. The space is seeing incredible growth and investment4. In the last 2 years, the phrase “Just Use Postgres (for everything)” has gained a ton of popularity. The basic premise is that you shouldn’t complicate things with new tech when you don’t need to, and that Postgres alone solves most problems pretty well. Postgres competes with purpose-built solutions like:
- Elasticsearch (functionality supported by Postgres’ `tsvector`/`tsquery`)
- MongoDB (`jsonb`)
- Redis (`CREATE UNLOGGED TABLE`)
- AI Vector Databases (`pgvector`, `pgai`)
- Snowflake / OLAP (`pg_lake`, `pg_duckdb`, `pg_mooncake`)

and… Kafka (this blog).
The claim isn’t that Postgres is functionally equivalent to any of these specialized systems. The claim is that it handles 80%+ of their use cases with 20% of the development effort. (Pareto Principle)
When you combine the two trends, the appeal becomes obvious. Postgres is a battle-tested, well-known system that is simple, scalable and reliable. Pair it with today’s powerful hardware and you quickly begin to realize that, more often than not, you do not need the state-of-the-art highly optimized and complex distributed system in order to handle your organization’s scale.
Despite being somebody who is biased towards Kafka, I tend to agree. Kafka is similar to Postgres in that it’s stable, mature, battle-tested and boasts a strong community. It also scales a lot further. Despite that, I don’t think it’s the right choice for a lot of cases. Very often I see it get adopted where it doesn’t make sense.
A 500 KB/s workload should not use Kafka. There is a scalability cargo cult in tech that always wants to choose “the best possible” tech for a problem - but this misses the forest for the trees. The “best possible” solution frequently isn’t a technical question - it’s a practical one. Adriano makes an airtight case for why you should opt for simple tech in his PG as Queue blog (2023) that originally inspired me to write this.
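To put the 500 KB/s figure in perspective, here is a rough back-of-the-envelope sketch. It is not from the original post: the decimal units (1 GB = 1,000,000 KB) and the 1 KB average message size are assumptions chosen purely for illustration, and the function names are hypothetical.

```python
# Back-of-the-envelope sizing for a sustained 500 KB/s workload.
# Assumptions (not from the post): decimal units (1 GB = 1_000_000 KB)
# and a hypothetical average message size of 1 KB.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def daily_volume_gb(throughput_kb_per_s: float) -> float:
    """Convert a sustained KB/s rate into GB written per day."""
    return throughput_kb_per_s * SECONDS_PER_DAY / 1_000_000


def messages_per_second(throughput_kb_per_s: float, avg_message_kb: float) -> float:
    """Estimate the message rate for a given average message size in KB."""
    return throughput_kb_per_s / avg_message_kb


if __name__ == "__main__":
    print(f"{daily_volume_gb(500):.1f} GB/day")
    print(f"{messages_per_second(500, 1.0):.0f} msg/s")
```

Under these assumptions the workload comes to roughly 43 GB per day at about 500 messages per second. Swap in your own throughput and message size to see where your workload lands.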
Enough background. In this article, we will do three simple things:
- Benchmark how far Postgres can scale for pub/sub messaging - # PG as a Pub/Sub
- Benchmark how far Postgres can scale for queueing - # PG as a Queue
- Concisely touch upon when Postgres can be a fit for these use cases - # Should You Use Postgres?
I am not aiming for an exhaustive in-depth evaluation. Benchmarks are messy af. Rather, my goal is to publish some reasonable data points which can start a discussion.
(while this article is for Postgres, feel free to replace it with your database of choice)
continue reading on topicpartition.io