
[Kafka Ops 10] Kafka Ordering Guarantees — How Far Can You Trust Them?

Kafka guarantees ordering per partition, not per topic. We cover key-based partitioning, the producer reordering hazard (in-flight + idempotence), consumer-side ordering, and the traps that silently break order — wrapping up the 10-part 'Kafka Ops Troubleshooting' series.

Data Dynamics · June 9, 2026 · 14 min read

"Kafka guarantees message order" is only half true. One of the most common incidents you hit in production is the mystery: "I clearly sent A first, but the consumer processed B first." Payment states flip, inventory goes negative, event-sourcing aggregates break. Trace it back and nine times out of ten it starts with a misunderstanding of the scope of the ordering guarantee.

This is Part 10, the finale of the "Kafka Ops Troubleshooting" series. We'll pin down exactly how far Kafka guarantees order, and walk through the traps that silently break that guarantee.

What you'll learn in this post

The core truth of Kafka ordering: per-partition guarantee, not topic-wide

The producer reordering hazard: max.in.flight.requests.per.connection and idempotence

Why key choice defines the ordering scope, and what repartitioning breaks

The conditions under which consumer-side order holds, and the multithreading trap

Scenarios where DLQs, reprocessing, and multiple producers silently break order

The trade-off when you truly need "global ordering"

1. The core truth — order is guaranteed only within a single partition

The first sentence to burn into your memory:

Kafka guarantees message order only within a single partition. It does not guarantee order across an entire topic.

A topic is split into multiple partitions, and each partition is an independent append-only log. A record appended to a partition gets a monotonically increasing number called an offset, and consumers read in offset order. So within a single partition, write order = read order holds.

The trouble begins the moment a topic is split into 3, 6, or 12 partitions. There is no ordering relationship whatsoever between a record written to partition 0 and one written to partition 1. The two partitions may live on different brokers, are processed at different speeds, and are read in parallel by separate threads/instances.

Scope	Order guaranteed?	Why
Within the same partition	✅ Yes	append-only log + monotonic offset
Same key (default partitioner)	✅ Effectively yes	same key → routed to same partition
Topic-wide (across partitions)	❌ No	partitions are independent logs, processed in parallel

Same key → same partition — per-key ordering

So how do you satisfy a requirement like "events for order #1234 must be processed in order"? The answer is the key.

The default partitioner hashes the serialized record key and takes the result modulo the partition count (roughly hash(key) % numPartitions). As long as the partition count does not change, records with the same key always go to the same partition. So if you use the order ID as the key, all events for the same order land in one partition, and within that partition order is guaranteed. This is the strongest, most practical guarantee Kafka offers: per-key ordering.

// Same orderId → same partition → order guaranteed
producer.send(new ProducerRecord<>("orders", orderId, "CREATED"));
producer.send(new ProducerRecord<>("orders", orderId, "PAID"));
producer.send(new ProducerRecord<>("orders", orderId, "SHIPPED"));
// The three events above are appended to the same partition in the same order.
// But if orderId is null, records spread via round-robin/sticky → order NOT guaranteed!

If the key is null, records are spread across partitions (round-robin or sticky partitioning), and per-key order no longer holds. If order matters, always set a key.
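To make the routing rule concrete, here is a minimal sketch of key-to-partition selection. It assumes a simplified stand-in hash (String.hashCode); Kafka's default partitioner hashes the serialized key bytes with murmur2, so the partition numbers this prints will not match a real cluster. The class and method names are hypothetical.

```java
import java.util.List;

// Hypothetical illustration of key-based routing; NOT Kafka's real partitioner.
final class KeyRoutingSketch {

    // Stand-in for "hash(key) % numPartitions". Kafka uses murmur2 over the
    // serialized key bytes; String.hashCode is used here only for illustration.
    static int partitionFor(String key, int numPartitions) {
        // floorMod keeps the result in [0, numPartitions) even for negative hashes.
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        int numPartitions = 6;
        for (String event : List.of("CREATED", "PAID", "SHIPPED")) {
            // Every event for order-1234 resolves to the same partition.
            System.out.println(event + " -> partition "
                    + partitionFor("order-1234", numPartitions));
        }
    }
}
```

The only property that matters for ordering is that the function is deterministic for a fixed partition count: the same key always yields the same partition number.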

2. The producer-side reordering hazard — in-flight and idempotence

Even with partitions and keys set correctly, order can still flip. The most notorious trap is reordering on producer retry.

Why reordering happens

For throughput, a producer can keep several requests in flight at once. The setting that controls this is max.in.flight.requests.per.connection (default 5). Combine that with retries > 0 (enabled by default) and this scenario unfolds:

time →
batch1 (records A,B) sent ──► broker: rejected by transient error (awaiting retry)
batch2 (records C,D) sent ──► broker: success! (appended to log first)
batch1 retried ────────────► broker: success (appended later)
 
resulting partition log: C, D, A, B   ← A,B pushed behind C,D (order broken!)

While batch1 (sent first) is being retried after a transient error, batch2 (sent later) succeeds first. As a result, the later records get appended before the earlier ones. This reordering can occur when enable.idempotence=false, retries > 0, and max.in.flight.requests.per.connection > 1.

The fix: the idempotent producer

The fix is the idempotent producer. With enable.idempotence=true, the producer is assigned a producer ID (PID) and stamps the batches it sends to each partition with sequence numbers. The partition leader tracks the last sequence it appended for that PID and rejects a batch that arrives out of sequence, so the producer has to resend in the original order. As a result, order is preserved even with up to 5 in-flight requests plus retries (the idempotent producer was introduced by KIP-98).
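The effect of the sequence check can be seen in a toy model. The sketch below is not Kafka's broker code: PartitionLog, Batch, and replay are hypothetical names, and the "broker" is reduced to a single rule (accept a batch only if its base sequence is the next one expected). It replays the batch1/batch2 scenario with and without that rule.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of one partition's leader; not Kafka's actual broker code.
final class SequenceCheckSketch {

    static final class Batch {
        final int baseSequence;      // sequence number of the first record
        final List<String> records;
        Batch(int baseSequence, List<String> records) {
            this.baseSequence = baseSequence;
            this.records = records;
        }
    }

    static final class PartitionLog {
        final boolean checkSequence; // true models enable.idempotence=true
        final List<String> log = new ArrayList<>();
        int nextExpectedSequence = 0;

        PartitionLog(boolean checkSequence) { this.checkSequence = checkSequence; }

        // Returns true if the batch was appended, false if it was rejected.
        boolean tryAppend(Batch batch) {
            if (checkSequence && batch.baseSequence != nextExpectedSequence) {
                return false; // gap detected: an earlier batch is still missing
            }
            log.addAll(batch.records);
            nextExpectedSequence = batch.baseSequence + batch.records.size();
            return true;
        }
    }

    // Replays the scenario: batch1's first attempt is lost, batch2 arrives,
    // then the producer retries whatever has not been appended yet, in order.
    static List<String> replay(boolean idempotent) {
        PartitionLog partition = new PartitionLog(idempotent);
        Batch batch1 = new Batch(0, List.of("A", "B"));
        Batch batch2 = new Batch(2, List.of("C", "D"));

        // batch1's first attempt fails with a transient error: nothing appended.
        boolean batch2Appended = partition.tryAppend(batch2);
        partition.tryAppend(batch1);          // retry of batch1
        if (!batch2Appended) {
            partition.tryAppend(batch2);      // retry of the rejected batch2
        }
        return partition.log;
    }

    public static void main(String[] args) {
        System.out.println("without sequence check: " + replay(false)); // [C, D, A, B]
        System.out.println("with sequence check:    " + replay(true));  // [A, B, C, D]
    }
}
```

Without the check, the later batch lands first. With it, batch2 is turned away until batch1 has been appended, which is the behavior the idempotent producer relies on.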

Configuration	Order guaranteed	Throughput	Recommendation
`enable.idempotence=true` (in.flight ≤ 5)	✅ Yes	High	⭐ Recommended (default)
`enable.idempotence=false`, `max.in.flight=1`	✅ Yes	Low (serialized)	Fallback
`enable.idempotence=false`, `max.in.flight>1`, `retries>0`	❌ Can break	High	⚠️ Risky

# Recommended config — secures both order and throughput
enable.idempotence=true
acks=all
max.in.flight.requests.per.connection=5
retries=2147483647

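Kafka 3.0+ default note: Since Kafka 3.0, enable.idempotence defaults to true, so idempotence is often on without you setting anything. It still requires acks=all, max.in.flight.requests.per.connection ≤ 5, and retries > 0. If you set a conflicting value (acks=1 or 0, more than 5 in-flight requests, or retries=0) without explicitly enabling idempotence, idempotence is disabled; if you explicitly set enable.idempotence=true alongside a conflicting value, producer creation fails with a ConfigException. Check these settings together.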
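Because these settings interact, it can help to check them together before a producer is created. The following is a minimal sketch of such a pre-flight check using plain java.util.Properties. The class IdempotenceConfigCheck and its conflicts method are hypothetical helpers, not part of the Kafka client; only the config key names are Kafka's.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical pre-flight check for settings that conflict with idempotence.
final class IdempotenceConfigCheck {

    static List<String> conflicts(Properties props) {
        List<String> problems = new ArrayList<>();
        // Fallback values are the Kafka 3.0+ producer defaults.
        String acks = props.getProperty("acks", "all");
        int inFlight = Integer.parseInt(
                props.getProperty("max.in.flight.requests.per.connection", "5"));
        int retries = Integer.parseInt(
                props.getProperty("retries", String.valueOf(Integer.MAX_VALUE)));

        if (!acks.equals("all") && !acks.equals("-1")) {
            problems.add("acks must be all (or -1), found " + acks);
        }
        if (inFlight > 5) {
            problems.add("max.in.flight.requests.per.connection must be <= 5, found " + inFlight);
        }
        if (retries <= 0) {
            problems.add("retries must be > 0, found " + retries);
        }
        return problems;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("enable.idempotence", "true");
        props.setProperty("acks", "1"); // conflicts with idempotence
        props.setProperty("max.in.flight.requests.per.connection", "5");
        System.out.println(conflicts(props));
    }
}
```

A check like this is most useful in a shared producer factory or a CI lint step, where an override such as acks=1 could otherwise slip in unnoticed.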

3. Partitioning and keys — designing the scope of order

The scope of your ordering guarantee ultimately comes down to how you choose the partitioning key.

Key choice = unit of ordering

Order processing: key = orderId → events of one order, in order
User activity log: key = userId → one user's actions, in order
Account transactions: key = accountId → one account's deposits/withdrawals, in order
IoT sensors: key = deviceId → one device's readings, in order

The point is to make the key the unit that must stay ordered. Too coarse (e.g., key = a single fixed constant) and all data piles into one partition, killing parallelism; too fine (e.g., key = a fresh UUID each time) and per-key order becomes meaningless.

What repartitioning breaks

Here lies the most dangerous operational trap. When throughput grows and you increase the partition count (say 6 → 12), the divisor in hash(key) % numPartitions changes. As a result:

A key that used to go to partition 3 may now go to partition 9. That is, a key's old records (partition 3) and new records (partition 9) get scattered across different partitions.

From that point on, per-key ordering for that key breaks. The consumer is no longer guaranteed any order between the old events on partition 3 and the new events on partition 9.

With 6 partitions:   hash("order-1234") % 6 = 3  → partition 3
Scaling to 12:       hash("order-1234") % 12 = 9 → partition 9
                     ↑ same key, different partition! old/new records split
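The size of the effect is easy to measure offline. The sketch below counts how many keys change partition when the count goes from 6 to 12. It assumes a stand-in hash (String.hashCode) rather than Kafka's murmur2, and the class and method names are hypothetical, so treat the output as illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of how growing the partition count remaps keys.
final class RepartitionSketch {

    // Stand-in for "hash(key) % numPartitions"; not Kafka's murmur2-based hash.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Returns the keys whose partition differs between oldCount and newCount.
    static List<String> movedKeys(List<String> keys, int oldCount, int newCount) {
        List<String> moved = new ArrayList<>();
        for (String key : keys) {
            if (partitionFor(key, oldCount) != partitionFor(key, newCount)) {
                moved.add(key);
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            keys.add("order-" + i);
        }
        List<String> moved = movedKeys(keys, 6, 12);
        System.out.println(moved.size() + " of " + keys.size() + " keys changed partition");
    }
}
```

When the count doubles, every key either stays on partition p or moves to p + 6; with a well-distributed hash, roughly half of the keys move. Each moved key is a key whose old and new events now sit in different partitions.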

Countermeasures:

Provision partitions generously from the start (you can only increase, never decrease).
If you truly must scale, migrate ordering-critical topics to a new topic, or carve out a quiescent window with no in-flight events for the affected keys at the moment of scaling.
Pin the key→partition mapping with a custom partitioner (e.g., an explicit mapping table) to control the impact of scaling.
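As one possible shape for the mapping-table countermeasure, here is a minimal sketch of the pinning logic on its own. PinnedPartitionTable is a hypothetical class; wiring it into a Kafka custom partitioner, persisting the table, and sharing it across producer instances are all left out, and the hash is a stand-in rather than Kafka's murmur2.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical key -> partition pinning logic, kept in memory for illustration.
final class PinnedPartitionTable {

    private final Map<String, Integer> pinned = new ConcurrentHashMap<>();

    // The first sighting of a key fixes its partition. Later calls reuse that
    // partition even if numPartitions has grown in the meantime.
    int partitionFor(String key, int numPartitions) {
        return pinned.computeIfAbsent(
                key, k -> Math.floorMod(k.hashCode(), numPartitions));
    }

    public static void main(String[] args) {
        PinnedPartitionTable table = new PinnedPartitionTable();
        int before = table.partitionFor("order-1234", 6);   // pinned with 6 partitions
        int after = table.partitionFor("order-1234", 12);   // still the pinned value
        System.out.println("before=" + before + ", after=" + after);
    }
}
```

The trade-off is that the table becomes state you have to manage: it must survive restarts, be identical for every producer that writes the key, and be pruned once a key is retired. Only keys first seen after scaling can land on the new partitions.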

4. The consumer side — don't break the processing order

Even when the producer preserves order perfectly, the consumer can break it.

A single consumer processes a partition in order

The basic principle is simple. A single consumer poll()s its assigned partitions in offset order. And within a consumer group, a partition is only ever assigned to exactly one consumer at a time. So as long as "partition → consumer" is fixed 1:1, that partition's order is preserved.

Partition 0 ─────► Consumer A   (only A reads partition 0)
Partition 1 ─────► Consumer B
Partition 2 ─────► Consumer B   (one consumer can own multiple partitions)
 
✅ Two consumers never read the same partition at once (within a group)

Where multithreading breaks order

The trap is when the consumer splits received records across multiple threads for processing. poll() hands records in order, but once you throw them at a thread pool, completion order is no longer guaranteed.

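// ❌ Anti-pattern: dispatching same-partition records to a shared thread pool → order breaks
for (ConsumerRecord<String, String> record : records) {
    executor.submit(() -> process(record)); // C may finish before A
}
 
// ✅ Pin workers by key to preserve per-key order
//    same key always goes to the same worker queue → serial processing per key
//    (assumes non-null keys; workerQueues is a list of BlockingQueues, each drained by one thread)
for (ConsumerRecord<String, String> record : records) {
    // floorMod never yields a negative index; Math.abs(hashCode) does for Integer.MIN_VALUE
    int worker = Math.floorMod(record.key().hashCode(), numWorkers);
    workerQueues.get(worker).put(record); // put() can throw InterruptedException
}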

If order matters, records of the same partition (or same key) must be processed serially on a single thread, or you must pin workers by key as above. If you want to parallelize for throughput, the right model is "parallel across keys, serial within a key."
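The "parallel across keys, serial within a key" model can be sketched with nothing but the JDK. In the example below, KeyedWorkerPool is a hypothetical class that routes each task to one of several single-threaded executors chosen by key hash; the events are plain string pairs standing in for consumer records.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: parallel across keys, serial within a key.
final class KeyedWorkerPool {

    private final List<ExecutorService> workers = new ArrayList<>();

    KeyedWorkerPool(int numWorkers) {
        for (int i = 0; i < numWorkers; i++) {
            workers.add(Executors.newSingleThreadExecutor());
        }
    }

    // Same key -> same single-threaded worker -> tasks run in submission order.
    void submit(String key, Runnable task) {
        int index = Math.floorMod(key.hashCode(), workers.size());
        workers.get(index).execute(task);
    }

    void shutdownAndWait() {
        for (ExecutorService worker : workers) {
            worker.shutdown();
        }
        try {
            for (ExecutorService worker : workers) {
                worker.awaitTermination(10, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Dispatches interleaved events and returns the order observed per key.
    static Map<String, List<String>> demo() {
        Map<String, List<String>> seen = new ConcurrentHashMap<>();
        KeyedWorkerPool pool = new KeyedWorkerPool(4);
        String[][] events = {
            {"order-1", "CREATED"}, {"order-2", "CREATED"}, {"order-1", "PAID"},
            {"order-2", "PAID"}, {"order-1", "SHIPPED"}
        };
        for (String[] event : events) {
            pool.submit(event[0], () -> seen
                    .computeIfAbsent(event[0],
                            k -> Collections.synchronizedList(new ArrayList<>()))
                    .add(event[1]));
        }
        pool.shutdownAndWait();
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Two different orders may be processed concurrently, but each order's events are always seen as CREATED, PAID, SHIPPED. In a real consumer you would also need to commit offsets only after the corresponding tasks finish, which this sketch does not cover.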

5. The traps that silently break order

If order breaks even with everything configured correctly, suspect the "invisible" scenarios below. They throw no errors and break order silently.

Trap	What breaks	Mitigation
DLQ / retry topic	Failed messages sent to a retry topic and reprocessed later → leaves original order	If per-key order matters, don't divert to a DLQ; pause/block-retry the whole key
Reprocessing	Rewinding offsets to reprocess mixes with already-processed later events	Design idempotent consumers, isolate the reprocessing window
Multiple producers to the same key	Two producers sending the same key concurrently → order between them is nondeterministic	Guarantee a single producer per key (partition ownership)
Async consumer handoff	Passing records to a separate queue/actor/event loop after `poll` can reorder processing	Per-key serial queues, order-preserving handoff
Cross-topic routing	Routing topic A → topic B with different partition mappings splits order	Keep same key & same partition count; use a single topic when order-dependent
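For the reprocessing row, "idempotent consumer" can mean something as simple as refusing to apply an event that is not newer than what has already been applied. The sketch below assumes every event carries a per-key, monotonically increasing version, which is an application-level assumption and not something Kafka supplies. VersionedStateStore is a hypothetical class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical idempotent handler: replayed or stale events become no-ops.
// Assumes each event carries a per-key version assigned by the application.
final class VersionedStateStore {

    private final Map<String, Long> lastAppliedVersion = new HashMap<>();
    private final Map<String, String> state = new HashMap<>();

    // Applies the event only if it is newer than what was already applied.
    // Returns true if applied, false if skipped as stale or duplicate.
    boolean apply(String key, long version, String newState) {
        long last = lastAppliedVersion.getOrDefault(key, -1L);
        if (version <= last) {
            return false;
        }
        lastAppliedVersion.put(key, version);
        state.put(key, newState);
        return true;
    }

    String stateOf(String key) {
        return state.get(key);
    }

    public static void main(String[] args) {
        VersionedStateStore store = new VersionedStateStore();
        store.apply("order-1234", 1, "CREATED");
        store.apply("order-1234", 2, "PAID");
        // Offsets rewound: the CREATED event is delivered again and ignored.
        boolean applied = store.apply("order-1234", 1, "CREATED");
        System.out.println(applied + " " + store.stateOf("order-1234")); // false PAID
    }
}
```

This is last-writer-wins: it protects state from being rolled back by a replay, but it does not detect a missing intermediate version. If every event must be applied exactly once and in sequence, the check would need to require version == last + 1 instead.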

A closer look at the DLQ trap

The most commonly hit trap is the DLQ (Dead Letter Queue). The pattern "on processing failure, send to the DLQ and move on to the next message" is great for throughput, but it is a design that explicitly gives up ordering.

Partition: [A fails] [B] [C]
           A → diverted to DLQ, B & C processed normally
           Later A is reprocessed from the DLQ → after B,C already processed
 
If same key: A(create) fails → B(update),C(delete) processed first → state corruption!

If order matters for the business for a given key, don't divert to a DLQ on failure — stop at that message and retry (blocking retry). Consciously choose between "move forward and skip" versus "stop and retry" based on your per-key ordering requirements.
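The "stop and retry" choice can be sketched as a loop that refuses to advance past a failing record. The example below is a simplified model, not consumer code: records are plain strings, the handler is a hypothetical Predicate that returns true on success, and there is no backoff between attempts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical blocking-retry loop: later records never overtake a failing one.
final class BlockingRetrySketch {

    // Processes records strictly in order. A failing record is retried in place
    // up to maxAttempts times; if it still fails, processing halts at that record.
    // Returns the records processed successfully, in processing order.
    static List<String> processInOrder(
            List<String> records, Predicate<String> handler, int maxAttempts) {
        List<String> processed = new ArrayList<>();
        for (String record : records) {
            boolean ok = false;
            for (int attempt = 1; attempt <= maxAttempts && !ok; attempt++) {
                ok = handler.test(record);
            }
            if (!ok) {
                break; // stop here instead of diverting and moving on
            }
            processed.add(record);
        }
        return processed;
    }

    public static void main(String[] args) {
        int[] failuresLeft = {2}; // "A" fails twice, then succeeds
        Predicate<String> handler =
                r -> !(r.equals("A") && failuresLeft[0]-- > 0);
        System.out.println(processInOrder(List.of("A", "B", "C"), handler, 5)); // [A, B, C]
    }
}
```

In a real consumer, a blocking retry has to coexist with the poll loop: retrying for longer than max.poll.interval.ms without calling poll() gets the consumer removed from the group, so long retries are usually combined with pausing the affected partition and continuing to poll.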

6. The "global ordering" myth

Occasionally a requirement lands for "complete total order across the whole topic." In Kafka there is exactly one way to achieve this — use a single partition.

With one partition, all records pile serially into one log, so perfect global order is guaranteed. But the cost is steep.

Aspect	Single partition (global order)	Multiple partitions (per-key order)
Ordering scope	Entire topic	Per key only
Parallelism	❌ None (only 1 consumer effective)	✅ Up to partition count
Throughput	Low (single broker, single consumer limit)	High (scales horizontally)
Scalability	❌ Cannot scale	✅ Scale by adding partitions

Practical advice: A "we need global order" requirement can usually be redefined as "we need order per a specific entity." It's rare that all events of a topic truly must be sorted into a single line. Before reaching for a single partition, first check whether a well-chosen key and per-key ordering suffice. A single partition gives up parallelism entirely, and its throughput ceiling becomes the limit of your whole system.

7. The ordering guarantee at a glance

Here's everything so far in one diagram. Records with the same key gather into the same partition and preserve order, but order across partitions is undefined.

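[Diagram: records routed by key into two partitions. Partition 0 holds the records of key A; partition 1 holds the records of keys B and C.]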

Inside partition 0: a1 → a2 → a3 order preserved
Inside partition 1: key A does not appear here; keys B and C are interleaved, but within each key (b1 → b2, c1 → c2 → c3) order is preserved
Between partition 0 and partition 1: whether a2 or c1 comes first is undefined

Wrapping up — closing out the 10-part series

Ordering is the most often misunderstood and most silently broken topic in Kafka operations. Let's recap the essentials once more.

Order is per partition. Topic-wide order is not guaranteed.
Same key → same partition → per-key order. Pick the key as "the unit where order matters."
On the producer, enable.idempotence=true prevents reordering (safe even with 5 in-flight + retries).
Adding partitions changes the key→partition mapping and breaks existing keys' order.
On the consumer, keep per-key serial processing. Multithreading and async handoff are the traps.
DLQs, reprocessing, and multiple producers silently break order. Choose the trade-off consciously.
If you truly need global order, a single partition is the only way, and you give up parallelism entirely.

Full recap of the "Kafka Ops Troubleshooting" series

Part	Topic	Key message
1	Diagnosing consumer lag	Lag is a symptom, not a cause — separate throughput/rebalance/partition skew
2	Rebalance storms	Cut needless rebalances with session timeout, `poll` interval, static membership
3	Producer throughput tuning	Balance throughput and latency with `batch.size`, `linger.ms`, compression
4	`acks` and durability	Prevent data loss with `acks=all` + `min.insync.replicas`
5	ISR and under-replication	A shrinking ISR is a durability risk — trace the root cause of replication lag
6	Disk & retention management	Prevent disk blowups with retention, segments, and log compaction
7	Exactly-once semantics (EOS)	Implement duplicate/loss-free processing with transactions and idempotence
8	Monitoring and alerting	Detect failures proactively, not after the fact, with JMX metrics
9	Schema evolution & compatibility	Evolve without breaking compatibility via the schema registry
10	Ordering guarantees	Order is per partition — protect it with key design and idempotence

One operational theme runs through all 10 parts: "Understand Kafka's default behavior precisely, and consciously design the boundaries of its guarantees." Lag, rebalances, and ordering — every incident starts from the vague expectation that "Kafka will just handle it." Know the boundaries, and you can design safely within them.

Coming next — the "Building Kafka DR" series

As we close the troubleshooting series, here's a teaser for the sibling series one step further: "Building Kafka DR (Disaster Recovery)." MirrorMaker 2-based multi-cluster replication, Active-Active vs. Active-Passive topologies, consumer offset synchronization, RPO/RTO design, and region-failure failover drills — going beyond a single cluster to a Kafka that survives even in a disaster. See you in the next series.

References

Apache Kafka Documentation — Message Delivery Semantics: https://kafka.apache.org/documentation/#semantics

Apache Kafka Documentation — Producer Configs (enable.idempotence, max.in.flight.requests.per.connection, acks): https://kafka.apache.org/documentation/#producerconfigs

KIP-98 — Exactly Once Delivery and Transactional Messaging: https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging

Apache Kafka Documentation — Design (Partitioning & Ordering): https://kafka.apache.org/documentation/#design

— The Data Dynamics Engineering Team