Morteza Taghdisi

Software Engineering · April 5, 2026 · 9 min read

Choosing Between Kafka, RabbitMQ, and REST

Series: Kafka Mastery (article 1 of 3)

A series about using Kafka in real JVM systems, with each article anchored on a concrete failure mode or design decision.

Choosing between Kafka, RabbitMQ, and REST is a design decision, not a stack preference. This article gives a decision rule that holds up before a line of code is written.

Tags: kafka, rabbitmq, rest, architecture, event-driven

Most Kafka pain starts before any Kafka configuration is written. It starts with a tool decision that was never really made.

Before the rest of this series is useful, the choice to use Kafka has to survive a second look.

The Wrong Starting Point

Most Kafka tutorials start with a producer, a consumer, and a topic. That order assumes the decision to use Kafka has already been made. In practice, the decision is usually inherited from a meeting where someone said "we should decouple this" and nobody pushed back.

The result is a system that uses Kafka for problems Kafka does not solve, while quietly accepting all of Kafka's operational cost: consumer lag, rebalancing, offset management, schema breakage, dual-write, observability gaps. Each of those is a real engineering problem. None of them are problems a REST call has.

The senior-level question is not "how do I use Kafka?" It is "what is the smallest tool that solves this communication problem, and would Kafka actually pay for itself here?"

Three Tools, Three Different Jobs

REST, RabbitMQ, and Kafka are often grouped under "ways services talk to each other." That grouping hides the differences that matter.

REST: Synchronous Request and Response

REST is a request and a response. The caller waits. The connection is open while the work happens. The caller learns the outcome immediately, including failures.

```plaintext
Client -> POST /orders -> Server -> 201 Created
```

REST is the right answer when:

  • The caller needs the result before continuing
  • The work is short enough to fit inside an HTTP timeout
  • One service is asking another for an answer

REST has no replay, no buffering, and no fan-out. If the receiver is down, the call fails.

RabbitMQ (Classic Queue): Task Distribution

A classic RabbitMQ queue is a work queue. A producer enqueues a task. One consumer in a pool picks it up, processes it, and acknowledges it. The message is then removed.

```plaintext
Producer -> [ task ] [ task ] [ task ] -> one of N workers
```

RabbitMQ is the right answer when:

  • Work needs to happen reliably but not synchronously
  • The task has one logical owner: one job, one worker, one outcome
  • Backpressure and retry need to be queue-managed
  • Throughput is moderate

A classic queue is not an event log. Once a task is consumed, it is gone. Other systems cannot react to the same task later.
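The consume-once behavior can be sketched in a few lines of plain Java. This is an in-memory stand-in, not the RabbitMQ client API: the class name `WorkQueueSketch` and its methods are hypothetical, and delivery and acknowledgement are collapsed into a single step that a real broker keeps separate.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** In-memory stand-in for a classic work queue. Not the RabbitMQ client API. */
final class WorkQueueSketch {
    private final Queue<String> tasks = new ArrayDeque<>();

    /** Producer side: add a task to the tail of the queue. */
    void enqueue(String task) {
        tasks.add(task);
    }

    /** Worker side: take the next task and remove it, or return null when the queue is empty. */
    String takeAndAck() {
        return tasks.poll();
    }

    public static void main(String[] args) {
        WorkQueueSketch queue = new WorkQueueSketch();
        queue.enqueue("task-1");
        queue.enqueue("task-2");
        queue.enqueue("task-3");

        // Two workers alternate pulls; each task reaches exactly one of them.
        List<String> workerA = new ArrayList<>();
        List<String> workerB = new ArrayList<>();
        boolean turnA = true;
        for (String task = queue.takeAndAck(); task != null; task = queue.takeAndAck()) {
            (turnA ? workerA : workerB).add(task);
            turnA = !turnA;
        }

        System.out.println("worker A: " + workerA); // [task-1, task-3]
        System.out.println("worker B: " + workerB); // [task-2]

        // A system that shows up later finds nothing left to read.
        System.out.println("late reader: " + queue.takeAndAck()); // null
    }
}
```

The last line is the point: there is no position to rewind to, because the queue only ever holds work that has not been done yet.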

Kafka: Event Log With Independent Consumers

Kafka is an append-only log of events partitioned across brokers. Producers write events to topics. Consumers read at their own pace, track their own position with offsets, and can replay history within the retention window.

```plaintext
Producer -> [ event log ] -> consumer group A
                          -> consumer group B
                          -> consumer group C
```

Kafka is the right answer when:

  • Multiple independent systems need to react to the same event
  • Events have value beyond the moment they happen (audit, analytics, replay)
  • Throughput is high or the event volume is sustained
  • Consumers evolve at different rates and need to read history

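Kafka is not a better REST. It is not a better RabbitMQ. It is a different tool that happens to sit on the same network as both. It does not even share a wire protocol with them: REST runs over HTTP, RabbitMQ speaks AMQP 0-9-1, and Kafka uses its own binary protocol over TCP.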
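The difference from a queue is easiest to see in a toy model. The sketch below is a single-partition, in-memory log with one offset per consumer group. It is not the Kafka client API; `EventLogSketch`, `poll`, and `seek` are hypothetical names, and retention, partitioning, and offset commits are all left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Single-partition, in-memory stand-in for an event log. Not the Kafka client API. */
final class EventLogSketch {
    private final List<String> log = new ArrayList<>();
    // Each consumer group tracks its own position; reading never removes events.
    private final Map<String, Integer> offsets = new HashMap<>();

    /** Producer side: append to the end of the log. */
    void append(String event) {
        log.add(event);
    }

    /** Returns every event the group has not seen yet and advances only that group's offset. */
    List<String> poll(String group) {
        int from = offsets.getOrDefault(group, 0);
        List<String> batch = new ArrayList<>(log.subList(from, log.size()));
        offsets.put(group, log.size());
        return batch;
    }

    /** Moves a group's offset, for example back to 0 to replay history. */
    void seek(String group, int offset) {
        offsets.put(group, offset);
    }

    public static void main(String[] args) {
        EventLogSketch topic = new EventLogSketch();
        topic.append("OrderCreated#1");
        topic.append("OrderCreated#2");

        System.out.println("A: " + topic.poll("group-a"));       // [OrderCreated#1, OrderCreated#2]
        System.out.println("B: " + topic.poll("group-b"));       // [OrderCreated#1, OrderCreated#2]
        System.out.println("A again: " + topic.poll("group-a")); // []

        topic.seek("group-a", 0);
        System.out.println("A replay: " + topic.poll("group-a")); // [OrderCreated#1, OrderCreated#2]
    }
}
```

Group B reads the same events as group A without either knowing the other exists, and a replay is just an offset change. Neither is possible once a queue has handed a task to a worker.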

A Decision Table

This table is scoped to classic RabbitMQ queues, not RabbitMQ Streams or fanout exchange designs. Those change parts of the picture and are out of scope here.

| Use case                                | REST | RabbitMQ (classic) | Kafka   |
|-----------------------------------------|------|--------------------|---------|
| Request and response, user is waiting   | Yes  | Limited            | No      |
| Background jobs, one worker per task    | No   | Yes                | Limited |
| Event streaming with multiple consumers | No   | Limited            | Yes     |
| Replay past events                      | No   | No                 | Yes     |
| High-throughput sustained event flow    | No   | Limited            | Yes     |
| One service asking another for data     | Yes  | No                 | No      |

Most production communication patterns map cleanly onto exactly one row.

The One-Line Decision Rule

Use Kafka when multiple systems need to react independently to the same event over time. Not when one system just needs a response from another.

If a feature can be described as "A asks B for X," it is a REST call. If it can be described as "A hands off a job to be done once," it is a queue. If it can be described as "A publishes that something happened, and several systems care about it now or later," it is an event log.
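Written as code, the rule is small enough to fit in one method. This is a sketch under assumptions: the parameter names, the `Tool` enum, and the order of the checks are illustrative choices, not part of any library.

```java
/** The one-line rule as a function. Names and check order are illustrative assumptions. */
final class CommunicationChoice {
    enum Tool { REST, QUEUE, EVENT_LOG }

    /**
     * @param callerNeedsAnswer   "A asks B for X": the caller cannot continue without the result
     * @param independentReactors how many separate systems must react to the same fact
     * @param replayNeeded        whether any reader needs the history later
     */
    static Tool choose(boolean callerNeedsAnswer, int independentReactors, boolean replayNeeded) {
        if (callerNeedsAnswer) {
            return Tool.REST;
        }
        if (independentReactors > 1 || replayNeeded) {
            return Tool.EVENT_LOG;
        }
        // One job, one owner, done once.
        return Tool.QUEUE;
    }

    public static void main(String[] args) {
        System.out.println(choose(true, 1, false));  // REST
        System.out.println(choose(false, 1, false)); // QUEUE
        System.out.println(choose(false, 3, true));  // EVENT_LOG
    }
}
```

The function is meant to be applied per interaction, not per service. An endpoint can answer its caller over REST and still publish an event for other systems. Those are two separate decisions.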

A Failure Story: Kafka Where REST Was Enough

Consider an order service handling about 50 requests per second over REST. The flow is short: validate the order, write it to PostgreSQL, return 201 Created. End-to-end latency sits comfortably under 100ms. The team wants to "scale" and "decouple."

They introduce Kafka. The order endpoint now writes the order to PostgreSQL and publishes an OrderCreatedEvent. Downstream services consume the event to send confirmation emails and update an internal dashboard. The HTTP response no longer waits for those things.

Six months in, the team is dealing with:

  • A 200 to 800 ms tail latency in confirmation email delivery, blamed on consumer lag during peak hours
  • A duplicate email incident traced to an offset commit failure after a deploy
  • A reconciliation job written to find orders whose OrderCreatedEvent was never published because of a crash between the database commit and the Kafka send
  • An internal dashboard that shows stale data after every consumer-group rebalance
  • A new on-call rotation specifically for Kafka health

None of those problems existed under REST. The system handled 50 requests per second on a thread-per-request model and was easy to reason about. Kafka did not scale the system. It added every operational cost Kafka brings, in exchange for solving a coupling problem that was never really there.
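The reconciliation job in that list exists because of one specific gap: two writes with no transaction spanning them. A minimal simulation makes the gap visible. It assumes in-memory sets standing in for PostgreSQL and the Kafka topic, and a boolean flag standing in for a process crash. All names are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Simulates the dual-write gap. The sets stand in for PostgreSQL and a Kafka topic. */
final class DualWriteGapSketch {
    private final Set<String> database = new LinkedHashSet<>();
    private final Set<String> published = new LinkedHashSet<>();

    /** Step 1 commits the order. Step 2 publishes the event. Nothing ties the two together. */
    void createOrder(String orderId, boolean crashBeforeSend) {
        database.add(orderId);
        if (crashBeforeSend) {
            return; // simulated crash: the process dies after the commit, before the send
        }
        published.add(orderId);
    }

    /** What the reconciliation job has to compute: committed orders with no event. */
    Set<String> findUnpublished() {
        Set<String> missing = new LinkedHashSet<>(database);
        missing.removeAll(published);
        return missing;
    }

    public static void main(String[] args) {
        DualWriteGapSketch service = new DualWriteGapSketch();
        service.createOrder("order-1", false);
        service.createOrder("order-2", true);
        service.createOrder("order-3", false);

        System.out.println("orders without an event: " + service.findUnpublished()); // [order-2]
    }
}
```

Under the original REST flow there was only one write, so there was nothing to reconcile.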

The honest postmortem is not "we configured Kafka wrong." It is "Kafka was the wrong tool for this problem."

When Kafka Earns Its Complexity

A short checklist worth running through before adopting Kafka:

  • Multiple systems need to react to the same event, and that list is expected to grow
  • At least one consumer needs to replay events later (audit, debugging, late-arriving systems)
  • Sustained event volume is high enough that synchronous coupling becomes a bottleneck
  • Producers and consumers evolve on independent schedules
  • Eventual consistency is acceptable for the affected behaviors

If most of those are true, Kafka is the right tool and the rest of this series is for you. If only one or two are true, REST or a simple queue is probably the better answer, and Kafka can be added later when the second consumer actually exists.
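For teams that like the checklist in executable form, here is a minimal sketch. Reading "most" as three or more of the five items is an assumption made for this example, and the class and parameter names are hypothetical.

```java
/** Counts "yes" answers on the adoption checklist. The threshold of 3 is an assumption. */
final class KafkaChecklist {
    static int yesCount(boolean... answers) {
        int yes = 0;
        for (boolean answer : answers) {
            if (answer) {
                yes++;
            }
        }
        return yes;
    }

    /** "Most of five" is read here as three or more. */
    static boolean kafkaEarnsItsComplexity(boolean growingFanOut, boolean replayNeeded,
            boolean highSustainedVolume, boolean independentSchedules,
            boolean eventualConsistencyOk) {
        return yesCount(growingFanOut, replayNeeded, highSustainedVolume,
                independentSchedules, eventualConsistencyOk) >= 3;
    }

    public static void main(String[] args) {
        // Growing fan-out, replay, independent schedules, eventual consistency: four of five.
        System.out.println(kafkaEarnsItsComplexity(true, true, false, true, true));     // true
        // Only eventual consistency is acceptable: one of five.
        System.out.println(kafkaEarnsItsComplexity(false, false, false, false, true));  // false
    }
}
```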

A Lightweight ADR For This Decision

The decision deserves a paper trail. A short ADR keeps the reasoning visible the next time someone asks "why did we pick Kafka here?"

```yaml
# ADR: Communication model for the order flow
status: accepted
context: |
  The order flow needs to inform multiple systems that an order
  was created: payments, notifications, analytics. Today only
  payments needs it. Notifications is planned this quarter.
  Analytics is a maybe.
decision: |
  Use Kafka for OrderCreatedEvent.
  Reasons:
    - More than one consumer is expected within the next quarter
    - Replay is required for the analytics use case if it ships
    - Event volume is projected to grow with order growth
consequences:
  - We accept the operational cost of running Kafka in production
  - We accept that publishing is asynchronous and best-effort
    until the Outbox Pattern is introduced
  - We accept the need for consumer-side idempotency
alternatives_considered:
  - REST callbacks: rejected, does not support replay or fan-out
  - Classic queue: rejected, does not support multiple
    independent consumers cleanly
```
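The "consumer-side idempotency" consequence is worth making concrete, because it is the line most often accepted without being costed. A minimal sketch, assuming the order ID is the deduplication key and using an in-memory set where a real consumer would need durable storage. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Deduplicating handler for OrderCreatedEvent. In-memory only; real systems need a durable store. */
final class IdempotentEmailConsumer {
    private final Set<String> processedOrderIds = new HashSet<>();
    private final List<String> sentEmails = new ArrayList<>();

    /** Returns true if an email was sent, false if the event was a redelivery. */
    boolean onOrderCreated(String orderId) {
        // Set.add returns false when the ID is already present, which marks a duplicate.
        if (!processedOrderIds.add(orderId)) {
            return false;
        }
        sentEmails.add("confirmation for " + orderId);
        return true;
    }

    List<String> sentEmails() {
        return sentEmails;
    }

    public static void main(String[] args) {
        IdempotentEmailConsumer consumer = new IdempotentEmailConsumer();
        System.out.println(consumer.onOrderCreated("order-42")); // true
        // The same event arrives again, for example after a failed offset commit.
        System.out.println(consumer.onOrderCreated("order-42")); // false
        System.out.println(consumer.sentEmails().size());        // 1
    }
}
```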

A worked rejection looks like this:

```yaml
# ADR: Communication model for the user-profile read flow
status: accepted
context: |
  The product page needs the user's display name and avatar.
  The data lives in the user service.
decision: |
  Use REST. The product page calls the user service synchronously.
consequences:
  - Latency depends on the user service. Cache where it matters.
alternatives_considered:
  - Kafka: rejected, no fan-out requirement and no replay value.
    Pushing this through Kafka would add operational cost
    for no behavioral benefit.
```

The point of these ADRs is not formality. It is that "we use Kafka here" stops being a stack convention and becomes a decision with reasons.

What Most People Get Wrong

Three claims worth pushing back on:

  • "Kafka is a better REST." It is not. REST gives the caller an answer. Kafka gives the caller a write to a log.
  • "Async is inherently scalable." It is not. Asynchronous systems move the bottleneck. They rarely remove it. Consumer lag, retries, and dead-letter queues all need capacity too.
  • "Decoupling is free." It is not. Decoupled systems are harder to debug, harder to test, and harder to operate. The price is paid by every engineer who reads the system later.

If a system can be reasoned about as a chain of REST calls, the cost of Kafka is unlikely to be justified.

What Comes Next

The next article builds the smallest Kafka system worth running: a Spring Boot producer, a Spring Boot consumer, a single-broker cluster in KRaft mode, and Kafka UI from the first commit. The argument is not "build your first producer." It is that running Kafka without local visibility tooling is the fastest way to make every later debugging session ten times harder.