
Event-Driven Architecture with Apache Kafka: A Complete Guide for Developers

A complete guide to Event-Driven Architecture using Apache Kafka. Learn topics, partitions, producers, consumers, real-world patterns, and production best practices with code examples.

Panda Coding School · June 10, 2026 · 12 min read

Event-Driven Architecture with Apache Kafka is one of the most powerful patterns for building scalable, decoupled backend systems. It's the backbone of platforms like LinkedIn, Uber, and Netflix.

I'll be honest with you. When I first heard the phrase "Event-Driven Architecture", I thought it was one of those buzzwords that big tech companies throw around to sound impressive.

But the more systems I built, the more I realized something was wrong with the traditional approach. Services were tightly coupled. A single API failure would cascade into chaos. Scaling one component meant scaling everything.

That's when Kafka changed the way I think about building systems.

In this article, I'll walk you through Event-Driven Architecture (EDA) and show you how Apache Kafka makes it practical, scalable, and maintainable. We'll use real diagrams and examples the whole way through.

What is Event-Driven Architecture (EDA)?

In a traditional request-response system, Service A directly calls Service B and waits for a response.

Traditional (Request-Response):

  [Order Service] ──────► [Payment Service]
                               │
                               ▼
                         (waits for response)

This works fine until:

Payment Service is slow
Payment Service is down
You need 5 more services to react to the same order

Event-Driven Architecture flips this model.

Instead of calling services directly, a service publishes an event ("something happened") and moves on. Other services subscribe and react to those events independently.

Event-Driven:

  [Order Service] ──► [Event Bus / Kafka] ──► [Payment Service]
                                          ──► [Inventory Service]
                                          ──► [Notification Service]
                                          ──► [Analytics Service]

The Order Service doesn't know or care who reacts. It just publishes the event and continues.

This is the fundamental idea.
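To make the publish-and-forget idea concrete, here is a minimal in-process sketch. It is not Kafka: `InMemoryEventBus` and the `OrderPlaced` type are hypothetical names made up for illustration, and a real broker adds networking, persistence, and delivery guarantees on top of this shape.

```typescript
// A toy in-process event bus. All names here are hypothetical; Kafka itself
// is a networked, persistent log, not an in-memory map.
type Handler<T> = (event: T) => void;

class InMemoryEventBus<T> {
  private handlers = new Map<string, Handler<T>[]>();

  // Register interest in a topic. The publisher never sees this list.
  subscribe(topic: string, handler: Handler<T>): void {
    const existing = this.handlers.get(topic) ?? [];
    this.handlers.set(topic, [...existing, handler]);
  }

  // Deliver the event to every subscriber of the topic and
  // return how many subscribers received it.
  publish(topic: string, event: T): number {
    const subscribers = this.handlers.get(topic) ?? [];
    for (const handler of subscribers) handler(event);
    return subscribers.length;
  }
}

interface OrderPlaced {
  orderId: string;
}

const bus = new InMemoryEventBus<OrderPlaced>();
const received: string[] = [];

// Two independent subscribers; the publisher below knows about neither.
bus.subscribe("orders", (e) => received.push(`payment:${e.orderId}`));
bus.subscribe("orders", (e) => received.push(`inventory:${e.orderId}`));

const deliveredTo = bus.publish("orders", { orderId: "ORD-1001" });
console.log(deliveredTo, received);
```

Adding a third subscriber is one more `subscribe` call; the `publish` line stays untouched, which is the property the diagram above is showing.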

Why Kafka?

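There are plenty of message brokers out there: RabbitMQ, AWS SQS, Google Pub/Sub. But Kafka has a combination of properties that makes it a common choice for high-throughput, large-scale systems. Treat the comparison below as a rough guide; real numbers depend heavily on hardware, configuration, and message size:

Feature	Kafka	RabbitMQ	AWS SQS
Throughput	Millions/sec per cluster (with enough partitions)	Typically tens of thousands/sec per node	Standard queues scale almost without limit; FIFO queues are rate-limited
Message Retention	Configurable (7 days by default)	Until consumed (classic queues)	14 days max
Replay Events	✅ Yes	❌ Not with classic queues (streams can replay)	❌ No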
Message Ordering	Per partition	Per queue	Only FIFO queues
Used by	Netflix, Uber, LinkedIn	Traditional apps	AWS-native apps

Kafka was originally built at LinkedIn to handle billions of events per day. It was open-sourced in 2011 and is now one of the most battle-tested distributed systems in the world.

Core Concepts of Apache Kafka

Before jumping into code, let's understand the building blocks.

Topics

A Topic is like a category or a folder for your events. Think of it as a named stream of related messages.

Topics:

  ┌─────────────────────┐
  │  Topic: "orders"    │  ← all order-related events
  └─────────────────────┘

  ┌─────────────────────┐
  │  Topic: "payments"  │  ← all payment-related events
  └─────────────────────┘

  ┌─────────────────────┐
  │  Topic: "users"     │  ← all user-related events
  └─────────────────────┘

Producers

A Producer is any service that publishes messages to a Kafka topic.

Producer:

  [Order Service]  ──publishes──►  Topic: "orders"

Consumers

A Consumer is any service that reads messages from a Kafka topic.

Consumer:

  Topic: "orders"  ──reads──►  [Payment Service]
  Topic: "orders"  ──reads──►  [Inventory Service]

Consumer Groups

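Multiple consumers can be grouped into a Consumer Group. Kafka assigns each partition of a topic (partitions are covered below) to exactly one consumer in the group, so the group shares the work and each message is handled by a single member. Different groups each receive their own copy of every message.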

Consumer Group: "payment-processors"

  Topic: "orders"
  ┌──────────┐
  │ Msg 1    │ ──► [Payment Worker 1]
  │ Msg 2    │ ──► [Payment Worker 2]
  │ Msg 3    │ ──► [Payment Worker 1]
  │ Msg 4    │ ──► [Payment Worker 2]
  └──────────┘

  Each message processed by only ONE worker.
  Workers can scale independently.
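Under the hood, Kafka hands out whole partitions to group members rather than individual messages. The sketch below shows a simplified round-robin assignment; `assignPartitions` is a hypothetical helper written for this article, and Kafka's real assignors (range, round-robin, sticky) handle rebalancing and many more cases.

```typescript
// Round-robin sketch of assigning partitions to the consumers in one group.
// Hypothetical helper; real Kafka assignors are more involved.
function assignPartitions(
  partitionCount: number,
  members: string[],
): Map<string, number[]> {
  const assignment = new Map<string, number[]>();
  members.forEach((member) => assignment.set(member, []));

  for (let partition = 0; partition < partitionCount; partition++) {
    const owner = members[partition % members.length];
    if (owner === undefined) continue; // only happens when there are no members
    assignment.get(owner)?.push(partition);
  }
  return assignment;
}

// 4 partitions shared by 2 workers: each worker owns 2 partitions.
const twoWorkers = assignPartitions(4, ["worker-1", "worker-2"]);

// 3 partitions but 5 workers: two workers end up with nothing to read.
const fiveWorkers = assignPartitions(3, ["w1", "w2", "w3", "w4", "w5"]);

console.log(twoWorkers.get("worker-1"), twoWorkers.get("worker-2"));
console.log(fiveWorkers.get("w4"), fiveWorkers.get("w5"));
```

The second call is the practical takeaway: adding workers beyond the partition count does not add throughput, because every partition already has an owner.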

Partitions

A Topic is split into Partitions. This is Kafka's secret weapon for scalability.

Topic: "orders" with 3 Partitions

  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
  │ Partition 0 │    │ Partition 1 │    │ Partition 2 │
  │  Msg 1      │    │  Msg 2      │    │  Msg 3      │
  │  Msg 4      │    │  Msg 5      │    │  Msg 6      │
  └─────────────┘    └─────────────┘    └─────────────┘
       │                   │                   │
       ▼                   ▼                   ▼
  [Consumer 1]        [Consumer 2]        [Consumer 3]

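More partitions = more parallelism = higher throughput, up to a point. Within a consumer group each partition is read by at most one consumer, so consumers beyond the partition count sit idle. Ordering is guaranteed only within a single partition, not across the whole topic.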
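Which partition a message lands in is usually decided by hashing its key. Here is a rough sketch of that idea, assuming a simple string hash; Kafka's Java client uses murmur2 for keyed messages, so `partitionFor` below is an illustrative stand-in and will not produce the same partition numbers as a real cluster.

```typescript
// Simple string hash (a djb2 variant) standing in for Kafka's real
// partitioner. This is an assumption for illustration only.
function partitionFor(key: string, numPartitions: number): number {
  let hash = 5381;
  for (let i = 0; i < key.length; i++) {
    // Mix in each character and keep the value as an unsigned 32-bit integer
    hash = ((hash * 33) ^ key.charCodeAt(i)) >>> 0;
  }
  return hash % numPartitions;
}

// The same key always maps to the same partition, so all events for one
// order stay in order relative to each other.
const firstEvent = partitionFor("ORD-1001", 3);
const secondEvent = partitionFor("ORD-1001", 3);
console.log(firstEvent === secondEvent);
```

One consequence worth knowing: because the partition is the hash modulo the partition count, changing the number of partitions later can move existing keys to different partitions.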

The Full Architecture

Now let's zoom out and see how everything fits together in a real system.

Event-Driven System with Kafka:

  ┌──────────────────────────────────────────────────────────────┐
  │                        PRODUCERS                             │
  │  [Order Service]  [User Service]  [Payment Service]          │
  └──────────┬──────────────┬──────────────┬────────────────────┘
             │              │              │
             ▼              ▼              ▼
  ┌──────────────────────────────────────────────────────────────┐
  │                     APACHE KAFKA                             │
  │                                                              │
  │   Topic: orders    Topic: users    Topic: payments           │
  │   [P0][P1][P2]     [P0][P1]        [P0][P1][P2][P3]          │
  └──────────┬──────────────┬──────────────┬────────────────────┘
             │              │              │
             ▼              ▼              ▼
  ┌──────────────────────────────────────────────────────────────┐
  │                       CONSUMERS                              │
  │  [Inventory]  [Notifications]  [Analytics]  [Audit Logger]   │
  └──────────────────────────────────────────────────────────────┘

Each producer is completely decoupled from each consumer. Adding a new consumer (e.g., an audit logger) requires zero changes to any producer.

Real-World Example: E-Commerce Order Flow with Kafka

Let's walk through a concrete example. We'll use an e-commerce platform processing an order.

Without Kafka (Tightly Coupled)

User places order
      │
      ▼
[Order Service]
      │
      ├──► calls [Payment Service]   (sync - waits)
      │           │
      │           ├──► calls [Inventory Service]  (sync - waits)
      │           │
      │           └──► calls [Email Service]      (sync - waits)
      │
      └──► responds to User (after ALL services complete)

Problems:

If Email Service is down, the order fails
Total latency = sum of all service latencies
Hard to add new services (you have to modify Order Service)

With Kafka (Decoupled)

User places order
      │
      ▼
[Order Service] ──publishes──► Topic: "order.created"
      │
      └──► immediately responds to User: "Order received!"

Meanwhile, independently:

  Topic: "order.created"
        │
        ├──► [Payment Service]    processes payment
        ├──► [Inventory Service]  reserves stock
        ├──► [Email Service]      sends confirmation email
        └──► [Analytics Service]  records the event

Benefits:

Order Service responds as soon as the event is published, without waiting on downstream services
If Email Service is down, it picks up where it left off and catches up when it restarts
Adding a new service requires zero changes to Order Service
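The "catches up later" behavior comes from the fact that Kafka keeps events in a log and remembers how far each consumer group has read. A minimal sketch of that mechanism, assuming a single partition and a hypothetical `EventLog` class (real Kafka commits offsets separately from reading, which this toy merges into one step):

```typescript
// Toy append-only log with one read position per consumer group.
// Hypothetical names; this only mimics the idea of committed offsets.
class EventLog<T> {
  private events: T[] = [];
  private committed = new Map<string, number>();

  // Append an event and return its offset in the log.
  append(event: T): number {
    this.events.push(event);
    return this.events.length - 1;
  }

  // Return everything the group has not read yet and advance its offset.
  poll(groupId: string): T[] {
    const from = this.committed.get(groupId) ?? 0;
    const batch = this.events.slice(from);
    this.committed.set(groupId, this.events.length);
    return batch;
  }
}

const log = new EventLog<string>();

log.append("ORD-1");
const paymentFirst = log.poll("payment-service"); // reads ORD-1
const emailFirst = log.poll("email-service"); // reads ORD-1

// email-service goes down here; two more orders arrive in the meantime.
log.append("ORD-2");
log.append("ORD-3");
const paymentSecond = log.poll("payment-service"); // keeps up as usual

// email-service restarts and resumes from its own offset.
const emailCatchUp = log.poll("email-service");
console.log(emailCatchUp);
```

The two groups never interact: the payment group's progress has no effect on what the email group sees when it comes back.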

Kafka Code Example: Producer and Consumer in Node.js

Let's write a simple Kafka producer and consumer using Node.js with the kafkajs library.

Setup

npm install kafkajs

Kafka Connection

// kafka.ts
import { Kafka } from "kafkajs";
 
export const kafka = new Kafka({
  clientId: "ecommerce-app",
  brokers: ["localhost:9092"],
});

Producer: Publishing an Order Event

// order-producer.ts
import { kafka } from "./kafka";
 
const producer = kafka.producer();
 
interface OrderEvent {
  orderId: string;
  userId: string;
  items: { productId: string; quantity: number }[];
  totalAmount: number;
}
 
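// Connects, sends a single event, and disconnects. That is fine for a one-off
// script; a long-running service should connect once at startup and reuse the producer.
async function publishOrderCreated(order: OrderEvent) {
  await producer.connect();

  try {
    await producer.send({
      topic: "order.created",
      messages: [
        {
          // Keying by orderId keeps all events for one order on the same partition
          key: order.orderId,
          value: JSON.stringify(order),
        },
      ],
    });

    console.log(`Order event published: ${order.orderId}`);
  } finally {
    // Always release the connection, even if send() throws
    await producer.disconnect();
  }
}

publishOrderCreated({
  orderId: "ORD-1001",
  userId: "USR-42",
  items: [{ productId: "PROD-5", quantity: 2 }],
  totalAmount: 1999,
}).catch((err) => {
  console.error("Failed to publish order event:", err);
  process.exitCode = 1;
});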

Consumer: Payment Service Reacting to Order Events

// payment-consumer.ts
import { kafka } from "./kafka";
 
const consumer = kafka.consumer({ groupId: "payment-service" });
// One long-lived producer for downstream events, instead of reconnecting per message
const producer = kafka.producer();

async function startPaymentConsumer() {
  await producer.connect();
  await consumer.connect();
  await consumer.subscribe({ topic: "order.created", fromBeginning: false });

  await consumer.run({
    eachMessage: async ({ message }) => {
      if (!message.value) return;

      const order = JSON.parse(message.value.toString());

      console.log(`Processing payment for order: ${order.orderId}`);
      console.log(`Amount: ₹${order.totalAmount}`);

      // Process payment logic here
      await processPayment(order.orderId, order.totalAmount);
    },
  });
}

async function processPayment(orderId: string, amount: number) {
  console.log(`Payment of ₹${amount} processed for order ${orderId}`);

  // After processing, publish a new event for downstream services
  await producer.send({
    topic: "payment.completed",
    messages: [
      {
        key: orderId,
        value: JSON.stringify({ orderId, status: "success" }),
      },
    ],
  });
}

startPaymentConsumer().catch((err) => {
  console.error("Payment consumer failed:", err);
  process.exitCode = 1;
});

Consumer: Notification Service

// notification-consumer.ts
import { kafka } from "./kafka";
 
const consumer = kafka.consumer({ groupId: "notification-service" });
 
async function startNotificationConsumer() {
  await consumer.connect();
 
  // Subscribe to multiple topics
  await consumer.subscribe({ topic: "order.created", fromBeginning: false });
  await consumer.subscribe({
    topic: "payment.completed",
    fromBeginning: false,
  });
 
  await consumer.run({
    eachMessage: async ({ topic, message }) => {
      if (!message.value) return;
 
      const payload = JSON.parse(message.value.toString());
 
      if (topic === "order.created") {
        console.log(`Sending order confirmation email for ${payload.orderId}`);
      }
 
      if (topic === "payment.completed") {
        console.log(`Sending payment receipt for ${payload.orderId}`);
      }
    },
  });
}
 
startNotificationConsumer().catch((err) => {
  console.error("Notification consumer failed:", err);
  process.exitCode = 1;
});

Notice how the Notification Service and the Payment Service are independent of each other. Neither one calls the other; they share only topic names and event formats. Both react to the same order.created event.

Event Sourcing with Kafka: Taking It Further

One of the most powerful patterns enabled by Kafka is Event Sourcing.

Instead of storing just the current state, you store every event that led to that state.

Traditional Database:

  Order Table:
  ┌─────────┬──────────┬────────┐
  │ orderId │  status  │ total  │
  ├─────────┼──────────┼────────┤
  │ ORD-001 │ SHIPPED  │ ₹1999  │
  └─────────┴──────────┴────────┘
  (only current state, history lost)


Event Sourcing with Kafka:

  Topic: "order.events"
  ┌─────────────────────────────────────────────┐
  │  order.created   → orderId: ORD-001          │
  │  payment.done    → orderId: ORD-001          │
  │  order.packed    → orderId: ORD-001          │
  │  order.shipped   → orderId: ORD-001          │
  └─────────────────────────────────────────────┘
  (full history, can replay and rebuild state)

This gives you:

Full audit trail of everything that happened
Time travel: rebuild system state at any point in time (as long as the topic's retention is configured to keep the full history)
Debug production issues by replaying events
New services can read the entire event history and build their own view
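Rebuilding state from events is essentially a fold over the history. The sketch below assumes the four event types from the diagram above; the type and function names (`OrderLifecycleEvent`, `applyEvent`, `rebuild`) are hypothetical, and the events live in an in-memory array rather than a Kafka topic.

```typescript
// Event types taken from the diagram; the payload fields are assumptions.
type OrderLifecycleEvent =
  | { type: "order.created"; orderId: string; total: number }
  | { type: "payment.done"; orderId: string }
  | { type: "order.packed"; orderId: string }
  | { type: "order.shipped"; orderId: string };

interface OrderState {
  orderId: string;
  status: "CREATED" | "PAID" | "PACKED" | "SHIPPED";
  total: number;
}

// Apply one event to the previous state and return the next state.
function applyEvent(
  state: OrderState | null,
  event: OrderLifecycleEvent,
): OrderState | null {
  switch (event.type) {
    case "order.created":
      return { orderId: event.orderId, status: "CREATED", total: event.total };
    case "payment.done":
      return state ? { ...state, status: "PAID" } : state;
    case "order.packed":
      return state ? { ...state, status: "PACKED" } : state;
    case "order.shipped":
      return state ? { ...state, status: "SHIPPED" } : state;
  }
}

// Replaying events from the start rebuilds the state at that point.
function rebuild(events: OrderLifecycleEvent[]): OrderState | null {
  return events.reduce<OrderState | null>(applyEvent, null);
}

const history: OrderLifecycleEvent[] = [
  { type: "order.created", orderId: "ORD-001", total: 1999 },
  { type: "payment.done", orderId: "ORD-001" },
  { type: "order.packed", orderId: "ORD-001" },
  { type: "order.shipped", orderId: "ORD-001" },
];

const current = rebuild(history); // state after all four events
const afterPayment = rebuild(history.slice(0, 2)); // "time travel" to an earlier point
console.log(current?.status, afterPayment?.status);
```

Replaying only a prefix of the history is what "time travel" means in practice: the same reducer, fed fewer events, yields the state as it was back then.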

Common Apache Kafka Patterns

Pattern 1: Fan-Out

One producer, multiple independent consumers.

[Order Service] ──► "order.created" ──► [Payment Service]
                                    ──► [Inventory Service]
                                    ──► [Email Service]
                                    ──► [Analytics Service]

Use when: Multiple services need to react to the same event independently.

Pattern 2: Event Pipeline (Chain)

Events flow through a series of processing stages.

[Raw Data] ──► "raw.events" ──► [Transformer] ──► "clean.events" ──► [Aggregator] ──► "reports"

Use when: You need to process, enrich, or transform data through multiple stages.

Pattern 3: CQRS (Command Query Responsibility Segregation)

Separate the write model (commands) from the read model (queries).

User Action
    │
    ▼
[Write API] ──► Kafka ──► [Event Processor] ──► Write DB (Postgres)
                      ──► [Read Model Builder] ──► Read DB (Elasticsearch/Redis)
                                                         │
                                                         ▼
                                                   [Read API] ◄── User Query

Use when: Your read patterns and write patterns have very different requirements.
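As a small illustration of the read-model half of CQRS, the sketch below builds a query-friendly view purely from events. Everything here is a hypothetical stand-in: the `orderEvents` array plays the role of the Kafka topic, and `OrdersByUserView` plays the role of a read database such as Redis.

```typescript
interface OrderPlacedEvent {
  orderId: string;
  userId: string;
  totalAmount: number;
}

interface UserOrderSummary {
  orderCount: number;
  totalSpent: number;
}

// Write side: an append-only list standing in for the Kafka topic.
const orderEvents: OrderPlacedEvent[] = [];
function placeOrder(event: OrderPlacedEvent): void {
  orderEvents.push(event);
}

// Read side: a denormalized view keyed by user, built only from events.
class OrdersByUserView {
  private summaries = new Map<string, UserOrderSummary>();

  apply(event: OrderPlacedEvent): void {
    const current = this.summaries.get(event.userId) ?? { orderCount: 0, totalSpent: 0 };
    this.summaries.set(event.userId, {
      orderCount: current.orderCount + 1,
      totalSpent: current.totalSpent + event.totalAmount,
    });
  }

  summaryFor(userId: string): UserOrderSummary {
    return this.summaries.get(userId) ?? { orderCount: 0, totalSpent: 0 };
  }
}

placeOrder({ orderId: "ORD-1001", userId: "USR-42", totalAmount: 1999 });
placeOrder({ orderId: "ORD-1002", userId: "USR-42", totalAmount: 501 });
placeOrder({ orderId: "ORD-1003", userId: "USR-7", totalAmount: 250 });

// The read model builder consumes the events and updates the view.
const view = new OrdersByUserView();
orderEvents.forEach((event) => view.apply(event));

const summary = view.summaryFor("USR-42");
console.log(summary);
```

The write path never touches the view, and the view could be thrown away and rebuilt from the events at any time, which is what lets the two sides use different storage engines.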

When to Use (and Not Use) Apache Kafka

Use Kafka When:

You have multiple services that need to react to the same events
You need high throughput (millions of messages/second)
You need message replay to re-read old messages
You need decoupling between services
You're building microservices that should evolve independently

Don't Use Kafka When:

You have a simple monolith (Kafka is overkill)
You need immediate synchronous responses (use REST/gRPC instead)
Your team is small and the operational overhead outweighs the benefits
You're building a simple CRUD app

Kafka solves real problems, but it also introduces operational complexity. Don't reach for it just because Netflix uses it.

Apache Kafka in Production: Key Things to Get Right

Running Kafka in production is not trivial. Here are the things I've learned the hard way:

1. Message Schema Management

Always use a schema for your messages. Without one, a change on the producer side can silently break consumers.

// Define strict schemas for your events
interface OrderCreatedEvent {
  eventType: "order.created";
  eventVersion: "1.0";
  orderId: string;
  userId: string;
  totalAmount: number;
  createdAt: string; // ISO timestamp
}

Use Apache Avro or JSON Schema with a Schema Registry for larger teams.
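One caveat with the interface above: TypeScript types disappear at runtime, so `JSON.parse` will happily return a malformed event. Below is a minimal sketch of a hand-written runtime check, assuming the same `OrderCreatedEvent` shape; `isOrderCreatedEvent` is a hypothetical helper, and a schema library or Schema Registry would normally do this job.

```typescript
interface OrderCreatedEvent {
  eventType: "order.created";
  eventVersion: "1.0";
  orderId: string;
  userId: string;
  totalAmount: number;
  createdAt: string; // ISO timestamp
}

// Runtime type guard: checks the parsed value field by field.
function isOrderCreatedEvent(value: unknown): value is OrderCreatedEvent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const totalAmount = v.totalAmount;
  const createdAt = v.createdAt;

  return (
    v.eventType === "order.created" &&
    v.eventVersion === "1.0" &&
    typeof v.orderId === "string" &&
    typeof v.userId === "string" &&
    typeof totalAmount === "number" &&
    Number.isFinite(totalAmount) &&
    typeof createdAt === "string" &&
    !Number.isNaN(Date.parse(createdAt))
  );
}

const goodPayload: unknown = JSON.parse(
  '{"eventType":"order.created","eventVersion":"1.0","orderId":"ORD-1001",' +
    '"userId":"USR-42","totalAmount":1999,"createdAt":"2026-06-10T10:00:00Z"}',
);
// A producer "upgrade" that changed orderId to a number and dropped fields.
const badPayload: unknown = JSON.parse('{"eventType":"order.created","orderId":42}');

console.log(isOrderCreatedEvent(goodPayload), isOrderCreatedEvent(badPayload));
```

A consumer can run this check right after parsing and route anything that fails to a dead letter topic instead of crashing halfway through its processing logic.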

2. Idempotent Consumers

Networks fail. Kafka may deliver a message more than once. Your consumers must handle duplicate events gracefully.

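// `db` is the application's data-access layer (not shown here)
async function processPayment(orderId: string, amount: number) {
  // Check if already processed (idempotency check)
  const existing = await db.payments.findOne({ orderId });

  if (existing) {
    console.log(`Payment for ${orderId} already processed. Skipping.`);
    return;
  }

  // The check above and the insert below are not atomic on their own: two
  // concurrent deliveries could both pass the check. A unique constraint on
  // orderId lets the database reject the second insert.
  await db.payments.create({ orderId, amount, status: "completed" });
}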
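To see the idempotency idea in isolation, here is a self-contained sketch with an in-memory ledger standing in for the payments table. `PaymentLedger` and `handlePaymentMessage` are hypothetical names; in a real service the "record once" step would be a database insert protected by a unique key on the order ID.

```typescript
// In-memory stand-in for a payments table with a unique key on orderId.
class PaymentLedger {
  private payments = new Map<string, number>();

  // Returns true if this call recorded the payment, false if it was a duplicate.
  recordOnce(orderId: string, amount: number): boolean {
    if (this.payments.has(orderId)) return false;
    this.payments.set(orderId, amount);
    return true;
  }

  totalCharged(): number {
    let sum = 0;
    this.payments.forEach((amount) => {
      sum += amount;
    });
    return sum;
  }
}

const ledger = new PaymentLedger();

// Handler that is safe to call more than once for the same order.
function handlePaymentMessage(orderId: string, amount: number): "charged" | "skipped" {
  return ledger.recordOnce(orderId, amount) ? "charged" : "skipped";
}

const firstDelivery = handlePaymentMessage("ORD-1001", 1999);
const redelivery = handlePaymentMessage("ORD-1001", 1999); // same message again

console.log(firstDelivery, redelivery, ledger.totalCharged());
```

The customer is charged once no matter how many times the message arrives, which is the property that makes at-least-once delivery safe to live with.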

3. Dead Letter Queue (DLQ)

When a message fails to process repeatedly, don't lose it. Route it to a Dead Letter Topic for investigation.

Normal Flow:
  Topic: "orders" ──► [Consumer] ──► processes successfully

Failure Flow:
  Topic: "orders" ──► [Consumer] ──► fails 3 times ──► Topic: "orders.dlq"
                                                              │
                                                              ▼
                                                       [Ops Team alerts,
                                                        manual review]
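The failure flow above can be sketched as a small retry wrapper. This is a simplified, synchronous illustration with hypothetical names (`processWithDlq`, `DeadLetter`); a real consumer would use an async handler, usually wait between attempts, and publish the failed message to a topic such as "orders.dlq" instead of pushing it onto an array.

```typescript
interface DeadLetter<T> {
  message: T;
  attempts: number;
  lastError: string;
}

// Try the handler up to maxAttempts times; on final failure, park the
// message in the dead letter list. Returns true if processing succeeded.
function processWithDlq<T>(
  message: T,
  handler: (message: T) => void,
  deadLetters: DeadLetter<T>[],
  maxAttempts = 3,
): boolean {
  let lastError = "";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      handler(message);
      return true;
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
    }
  }
  deadLetters.push({ message, attempts: maxAttempts, lastError });
  return false;
}

interface Order {
  orderId: string;
  totalAmount: number;
}

const deadLetters: DeadLetter<Order>[] = [];
let handlerCalls = 0;

// Example handler: rejects orders with a non-positive amount.
function chargeCard(order: Order): void {
  handlerCalls++;
  if (order.totalAmount <= 0) throw new Error("invalid amount");
}

const okResult = processWithDlq({ orderId: "ORD-1001", totalAmount: 1999 }, chargeCard, deadLetters);
const badResult = processWithDlq({ orderId: "ORD-1002", totalAmount: -5 }, chargeCard, deadLetters);

console.log(okResult, badResult, deadLetters);
```

The bad order is tried three times, then set aside with its last error attached, and the consumer moves on to the next message instead of blocking the partition.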

4. Monitor Consumer Lag

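Consumer lag is the gap between the latest offset written to a partition and the offset a consumer group has committed, in other words the number of messages that group has not processed yet. If it keeps growing, your consumers can't keep up.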

# Check consumer group lag
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --describe --group payment-service
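The lag number reported by that command is simple arithmetic per partition. A minimal sketch, assuming offsets are passed in as arrays indexed by partition number; `lagPerPartition` and `totalLag` are hypothetical helpers, and the offset values are made up for the example.

```typescript
// Lag for each partition = log end offset minus the group's committed offset.
function lagPerPartition(logEndOffsets: number[], committedOffsets: number[]): number[] {
  return logEndOffsets.map((endOffset, partition) => {
    // Simplification: a partition with no commit yet is counted from offset 0
    const committed = committedOffsets[partition] ?? 0;
    return Math.max(0, endOffset - committed);
  });
}

function totalLag(logEndOffsets: number[], committedOffsets: number[]): number {
  return lagPerPartition(logEndOffsets, committedOffsets).reduce((sum, lag) => sum + lag, 0);
}

// Made-up offsets for a 3-partition topic.
const endOffsets = [120, 95, 200];
const committed = [120, 80, 150];

console.log(lagPerPartition(endOffsets, committed)); // [0, 15, 50]
console.log(totalLag(endOffsets, committed)); // 65
```

Looking at lag per partition rather than only the total is useful: one partition falling behind while the others are at zero often points to a hot key or a single stuck consumer.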

Summary: Event-Driven Architecture with Kafka

Let's recap what we covered:

Concept	What it does
Topic	Named stream of related events
Producer	Publishes events to a topic
Consumer	Reads and reacts to events
Consumer Group	Distributes messages across multiple workers
Partition	Enables parallelism and high throughput
Event Sourcing	Store events instead of just state
Dead Letter Queue	Handle unprocessable messages safely

Final Thoughts

Event-Driven Architecture with Kafka is not about adding complexity. It's about removing the wrong kind of complexity.

Tight coupling between services is the kind of complexity that quietly grows over time and eventually brings a system to its knees. Kafka replaces that with an event log, a single source of truth that every service can read at its own pace.

It does take some upfront investment to understand topics, partitions, consumer groups, and schemas. But once it clicks, you'll find yourself naturally thinking in events. You'll ask "what happened?" instead of "what should I call?"

Start small. Pick one flow in your system. Replace a synchronous call with an event. See how it feels.

That's how it starts for most engineers. One event at a time.

Happy Coding! 🚀

