The Transactional Outbox: Reliable Messaging Without Distributed Transactions
Here is a bug that ships to production constantly. A service handles "place order": it inserts a row into the orders table, commits, and then publishes an OrderPlaced event to Kafka so the warehouse and billing services can react. Most of the time it works. Occasionally the process crashes, or the broker is briefly unreachable, in the half-second between the commit and the publish. The order exists in the database. The event never went out. Billing never charges, the warehouse never ships, and a customer is staring at a confirmation page for an order that, as far as the rest of your company knows, never happened.
This is the dual-write problem: you need to update two systems (your database and your message broker) atomically, but you have no shared transaction across them. The transactional outbox solves it cleanly, and it is one of the highest-leverage patterns in event-driven architecture.
Why you can't just "be careful"
The instinct is to reorder or retry. Publish first, then write? Now you can emit an event for an order that failed to save. Write, then publish with retries? A crash mid-retry still loses the event, and an in-memory retry queue dies with the process. Wrap both in a distributed transaction (XA / two-phase commit)? Most modern brokers don't support it well, it couples your availability to the slowest participant, and operationally it is a nightmare. You cannot make two independent systems commit atomically by trying harder.
The trick is to stop treating the event as a second system at all.
The core idea
Write the event into the same database, in the same transaction as the business data. The event becomes just another row.
BEGIN;
INSERT INTO orders (id, customer_id, total, status)
VALUES ('ord_123', 'cust_9', 4200, 'placed');
INSERT INTO outbox (id, aggregate_id, type, payload, created_at)
VALUES ('evt_456', 'ord_123', 'OrderPlaced',
'{"orderId":"ord_123","total":4200}', now());
COMMIT;
Now the order and its event commit together or not at all — a single local transaction, fully ACID, no distributed anything. After the commit, a separate process reads unpublished rows from outbox and forwards them to the broker. If publishing fails, the row is still there to retry. If the process crashes, the next run picks up where it left off. The event cannot be lost, because its existence is guaranteed by the same commit that created the order.
Getting events out of the table
There are two ways to drain the outbox, and the difference matters.
Polling publisher. A background worker periodically queries for unsent rows, publishes them, and marks them sent.
def relay_outbox(db, broker):
rows = db.query(
"SELECT id, type, payload FROM outbox "
"WHERE published_at IS NULL "
"ORDER BY created_at "
"LIMIT 100 FOR UPDATE SKIP LOCKED"
)
for row in rows:
broker.publish(row.type, row.payload, key=row.aggregate_id)
db.execute(
"UPDATE outbox SET published_at = now() WHERE id = %s",
(row.id,),
)
FOR UPDATE SKIP LOCKED lets you run multiple relay workers without them fighting over the same rows. Simple, debuggable, and good enough for most teams. The cost is polling latency and load on the table.
Change Data Capture (CDC). Instead of polling, tail the database's transaction log. Tools like Debezium read the write-ahead log and stream every INSERT into the outbox table to the broker. No polling, lower latency, no query load — but you now operate a CDC pipeline, which is real infrastructure. Use CDC when you already have it or when polling latency is unacceptable; otherwise start with polling.
At-least-once, and why that is fine
The outbox guarantees at-least-once delivery, not exactly-once. The classic gap: the relay publishes the event successfully, then crashes before marking the row as sent. On restart it publishes the same event again. Consumers will occasionally see duplicates.
This is not a flaw to fix; it is a property to design around. Make consumers idempotent. Every event carries a stable ID (evt_456 above), and each consumer records the IDs it has processed:
def handle(event):
if seen.contains(event.id):
return # already processed, ack and move on
process(event)
seen.add(event.id)
Chasing true exactly-once across a network is a tarpit. At-least-once delivery plus idempotent consumers gives you effectively-once behavior with far less complexity, and it degrades gracefully. This pairing is the backbone of reliable event-driven systems — get comfortable with it.
The details that bite
- Cleanup. The outbox table grows forever if you let it. Delete or archive published rows on a schedule. A table with 400 million dead rows makes your
WHERE published_at IS NULLscan crawl. A partial index on unpublished rows helps, but pruning is what keeps it healthy. - Ordering. If consumers need per-entity ordering (all events for
ord_123in order), publish with the aggregate ID as the partition key and process the outbox increated_atorder per aggregate. Global ordering is usually neither needed nor affordable. - Payload shape. Store enough in the payload that consumers don't have to call back into your service to understand the event. An event that says only
{"orderId":"ord_123"}forces every consumer to query you, which reintroduces coupling and load. Include the data the event is about. - Poison messages. A row that fails to publish forever (malformed payload, a broker that rejects it) will block or churn your relay. Add a retry counter and route exhausted rows to a dead-letter table for a human to inspect.
When to reach for it
Use the outbox whenever a single business operation must both change your data and reliably tell the outside world about it: order placement, payment capture, user signup that triggers downstream provisioning, anything where a lost event means two systems silently disagree.
You do not need it when the event is best-effort and a loss is harmless (fire-and-forget analytics pings), or when the consumer can simply re-read state on its own schedule rather than being notified. The pattern costs you a table, a relay process, and idempotent consumers. In exchange, it removes an entire category of "the database says one thing and the queue says another" incidents — the kind that are nearly impossible to reproduce and miserable to debug after the fact. That is a trade worth making the moment correctness matters.
Keep reading
Backpressure: What Happens When Your System Can't Keep Up
A fast producer and a slow consumer is a recipe for an out-of-memory crash. Backpressure is the discipline of letting the slow part tell the fast part to wait. Here is how to design it in.
Database Connection Pooling: The Bottleneck You Forgot to Tune
More connections is not more throughput. Past a point, adding connections makes your database slower. Here is how pools actually work and how to size one without guessing.
Write-Ahead Logging: The Unsung Hero of Database Durability
How does a database survive a power cut mid-write without corrupting your data? The answer is a deceptively simple rule: log the change before you apply it. Here is why WAL is everywhere.
Newsletter
New posts, straight to your inbox
One email per post. No spam, no tracking pixels, unsubscribe anytime.
Comments
- No comments yet. Be the first.