Designing Idempotent APIs That Survive Retries
Networks fail, clients retry, and duplicate requests happen. Here's how to design write endpoints that produce the same result no matter how many times they're called.
The Problem
A client calls POST /payments to charge a customer. The request succeeds on the
server, but the response never makes it back — a dropped connection, a timeout, a
load balancer that gave up. The client, seeing no response, does the only sensible
thing: it retries. Now you've charged the customer twice.
This is not an edge case. At scale, it is a daily occurrence. Any endpoint that mutates state needs an answer for "what happens when this runs twice?"
Why It Matters
The network is unreliable by design. Clients retry, proxies retry, job queues retry. If your write endpoints aren't idempotent, every retry is a potential double-charge, duplicate order, or corrupted balance. Idempotency is what lets the rest of your system retry freely without you holding your breath.
Core Concepts
An operation is idempotent if performing it multiple times has the same effect
as performing it once. HTTP defines GET, PUT, and DELETE as idempotent; POST
carries no such guarantee. The fix for non-idempotent POSTs is an idempotency
key: a unique token the client generates and sends with the request.
The server records the key the first time it sees it, along with the result. If the same key arrives again, the server returns the stored result instead of doing the work a second time.
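The record-and-replay idea can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: the `IdempotentExecutor` class and its `run` method are hypothetical names, and the sketch deliberately ignores concurrency and crash safety.

```typescript
// Hypothetical in-memory store; a real service would use a durable table.
type Stored<T> = { result: T };

class IdempotentExecutor<T> {
  private readonly results = new Map<string, Stored<T>>();

  // Runs `work` at most once per key and replays the stored result afterwards.
  run(key: string, work: () => T): T {
    const hit = this.results.get(key);
    if (hit) return hit.result; // duplicate: replay, do not redo the work
    const result = work();
    this.results.set(key, { result });
    return result;
  }
}

// Usage: the second call with the same key does not charge again.
let charges = 0;
const executor = new IdempotentExecutor<{ id: number }>();
const first = executor.run("key-1", () => ({ id: ++charges }));
const second = executor.run("key-1", () => ({ id: ++charges }));
console.log(first.id, second.id, charges); // prints: 1 1 1
```

Wrapping the result in a `Stored` object lets the lookup distinguish "no entry yet" from "an entry whose result is undefined".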
Implementation
The client sends a unique key per logical operation:
POST /payments HTTP/1.1
Idempotency-Key: 7c9e6679-7425-40de-944b-e07fc1f90ae7
Content-Type: application/json

{ "amount": 4999, "currency": "usd", "customer": "cus_123" }
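On the client side, the important detail is that the key is generated once per logical operation and reused across attempts. A rough sketch, assuming a hypothetical `Send` transport function that rejects when an attempt fails (`sendWithRetries` and `flaky` are illustrative names, and backoff is omitted):

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical transport: sends one attempt and resolves with the response,
// or rejects when the attempt fails (timeout, dropped connection, ...).
type Send<T> = (idempotencyKey: string, body: unknown) => Promise<T>;

// Generates the key once, before the first attempt, and reuses it on every retry.
async function sendWithRetries<T>(
  send: Send<T>,
  body: unknown,
  maxAttempts = 3,
): Promise<T> {
  const key = randomUUID(); // one key per logical operation, not per attempt
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await send(key, body);
    } catch (err) {
      lastError = err; // a real client would back off before retrying
    }
  }
  throw lastError;
}

// Usage with a fake transport that fails twice, then succeeds.
const seenKeys: string[] = [];
const flaky: Send<string> = async (key) => {
  seenKeys.push(key);
  if (seenKeys.length < 3) throw new Error("timeout");
  return "ok";
};
const outcome = sendWithRetries(flaky, {
  amount: 4999,
  currency: "usd",
  customer: "cus_123",
});
```

All three attempts carry the same key, so the server can recognize the second and third as retries of the first.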
The server wraps the work in a transaction keyed on that token:
async function createPayment(key: string, body: PaymentInput) {
  return db.transaction(async (tx) => {
    // Replay the stored result, or reject, if this key has been seen before.
    const existing = await tx.idempotencyKeys.findUnique({ where: { key } });
    if (existing) {
      if (existing.status === "completed") return existing.response;
      // A request with this key is still in flight.
      throw new ConflictError("Request already in progress");
    }
    // Reserve the key. The unique constraint makes this the concurrency gate:
    // a concurrent insert of the same key fails instead of running the work twice.
    await tx.idempotencyKeys.create({ data: { key, status: "pending" } });
    const payment = await chargeCustomer(body);
    await tx.idempotencyKeys.update({
      where: { key },
      data: { status: "completed", response: payment },
    });
    return payment;
  });
}
The unique constraint on key is doing the heavy lifting: two concurrent requests
with the same key can't both insert, so only one does the work.
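To make the role of the constraint concrete, here is a small simulation that inserts first and lets the uniqueness check decide the winner. `KeyTable`, `tryInsert`, and `reserve` are hypothetical names, and the `Map` stands in for a database table with a unique index; treat it as a sketch of the idea rather than a drop-in implementation.

```typescript
type KeyRow = { status: "pending" | "completed"; response?: unknown };

// Stand-in for a table with a unique index on `key` (hypothetical, in memory).
class KeyTable {
  private readonly rows = new Map<string, KeyRow>();

  // Models the database rejecting a duplicate insert: returns false on conflict.
  tryInsert(key: string): boolean {
    if (this.rows.has(key)) return false;
    this.rows.set(key, { status: "pending" });
    return true;
  }

  get(key: string): KeyRow | undefined {
    return this.rows.get(key);
  }

  complete(key: string, response: unknown): void {
    this.rows.set(key, { status: "completed", response });
  }
}

type Outcome =
  | { kind: "acquired" } // caller may do the work
  | { kind: "in_progress" } // another request holds the key: answer with 409
  | { kind: "replay"; response: unknown }; // work already done: return stored result

// Insert first and let the constraint pick the winner, instead of check-then-insert.
function reserve(table: KeyTable, key: string): Outcome {
  if (table.tryInsert(key)) return { kind: "acquired" };
  const row = table.get(key);
  if (row && row.status === "completed") {
    return { kind: "replay", response: row.response };
  }
  return { kind: "in_progress" };
}

// Usage: first caller wins, a retry during the work is rejected, a later retry replays.
const table = new KeyTable();
const a = reserve(table, "k1");
const b = reserve(table, "k1");
table.complete("k1", { id: "pay_1" });
const c = reserve(table, "k1");
console.log(a.kind, b.kind, c.kind); // prints: acquired in_progress replay
```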
Common Mistakes
- Letting the server generate the key. It must come from the client, before the first attempt, so retries reuse the same value.
- Storing the key after doing the work. If the process crashes in between, the retry runs the work again. Reserve the key first, in the same transaction.
- Ignoring in-flight requests. Two retries can race. Return a 409 for a key that's still pending rather than running the operation twice.
- Keys that live forever. Expire them (24 hours is common) so storage doesn't grow without bound.
Production Considerations
Persist keys in the same datastore as the operation so they share a transaction boundary. If they live in a separate cache, a crash between "charge" and "record key" reopens the double-execution window you were trying to close. Add a TTL and a background sweep for expired keys.
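The sweep itself can be very small. The sketch below runs against an in-memory `Map` purely for illustration; `KeyRecord`, `TTL_MS`, and `sweepExpired` are hypothetical names, and in a real datastore this would more likely be a single delete statement filtered on an indexed timestamp column.

```typescript
type KeyRecord = { createdAt: number }; // epoch milliseconds

const TTL_MS = 24 * 60 * 60 * 1000; // 24 hours, an assumed retention window

// Deletes keys older than the TTL and returns how many were removed.
// `now` is passed in so the function is easy to test.
function sweepExpired(
  store: Map<string, KeyRecord>,
  now: number,
  ttlMs: number = TTL_MS,
): number {
  let removed = 0;
  store.forEach((record, key) => {
    if (now - record.createdAt >= ttlMs) {
      store.delete(key); // deleting during Map.forEach is safe in JavaScript
      removed++;
    }
  });
  return removed;
}

// Usage: one key is past the TTL, one is not.
const store = new Map<string, KeyRecord>([
  ["old", { createdAt: 0 }],
  ["fresh", { createdAt: TTL_MS }],
]);
const removedCount = sweepExpired(store, TTL_MS + 1);
console.log(removedCount, store.has("old"), store.has("fresh")); // prints: 1 false true
```

Pick a TTL longer than the longest window in which a client might plausibly retry; a key that expires too early turns a late retry into a fresh request.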
Security
Scope idempotency keys to the authenticated caller. If keys are global, one tenant can probe another's keyspace and read back stored responses. Namespace them by API key or account id.
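One way to namespace keys is to derive the storage key from the caller's identity plus the client-supplied key. The `scopedKey` helper and the account ids below are illustrative assumptions; in a relational store the equivalent would typically be a composite unique index over the account and key columns.

```typescript
// Builds the storage key from the caller's identity plus the client-supplied key.
// JSON-encoding the pair avoids the ambiguity a plain delimiter could introduce.
function scopedKey(accountId: string, idempotencyKey: string): string {
  return JSON.stringify([accountId, idempotencyKey]);
}

// Usage: the same client key under two accounts maps to two separate entries.
const responses = new Map<string, unknown>();
responses.set(scopedKey("acct_a", "k1"), { id: "pay_1" });

const sameTenant = responses.get(scopedKey("acct_a", "k1"));
const otherTenant = responses.get(scopedKey("acct_b", "k1"));
console.log(sameTenant !== undefined, otherTenant === undefined); // prints: true true
```

The account id should come from the authenticated session, never from a field the client can set freely in the request.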
Performance
The happy path adds one indexed lookup, one insert, and one update per request,
which is a negligible cost. The unique index on key is essential; without it,
the "check then insert" pattern has a race that lets duplicates through under load.
Summary
Idempotency turns a fragile write into one you can retry without fear. Have clients send an idempotency key, reserve it in the same transaction as the work, store the result, and replay it on duplicates. Once your writes are idempotent, retries stop being scary and your whole system gets more resilient.