RateLimit Draft vs X-RateLimit Headers

Choosing between the IETF RateLimit draft headers and the legacy X-RateLimit-* triplet is a contract decision: one is a moving standard with structured policy semantics, the other is an unstandardized convention that most client SDKs already understand. This task sits under the rate-limit response headers guide, which covers the legacy fields in detail; here the focus is the field-by-field differences, how the draft expresses multiple simultaneous policies, where adoption actually stands, and how to dual-emit both during a migration so you break no existing client. The practical answer for most teams is not “pick one” but “emit both for a deprecation window, then drop the legacy set when your clients have moved.”

The problem in concrete numbers

Say you enforce two policies on the same endpoint: 100 requests/minute and 5000 requests/day. The legacy X-RateLimit-* triplet has no way to express two policies at once — it carries a single Limit/Remaining/Reset, so you must either expose only the most-constrained policy (clients never learn the daily cap until they hit it) or invent ad-hoc extra headers like X-RateLimit-Limit-Day. The IETF RateLimit-Policy header solves exactly this by listing every policy in one structured field. So if your API enforces layered quotas — a common shape for tiered plans — the draft headers carry information the legacy triplet structurally cannot.

Field-by-field comparison

| Concern | Legacy X-RateLimit-* | IETF RateLimit draft |
| --- | --- | --- |
| Ceiling | X-RateLimit-Limit: 100 | limit member in RateLimit: limit=100, ... |
| Remaining | X-RateLimit-Remaining: 42 | remaining member: RateLimit: ..., remaining=42, ... |
| Reset | X-RateLimit-Reset: 57 (or epoch) | reset member (delta-seconds): ..., reset=57 |
| Wait on 429 | Retry-After: 57 | Retry-After: 57 (unchanged; reused as-is) |
| Policy expression | None; single implicit policy | RateLimit-Policy: 100;w=60 (limit + window) |
| Multiple policies | Not representable | RateLimit-Policy: 100;w=60, 5000;w=86400 |
| Reset encoding | Epoch or delta (ambiguous historically) | Delta-seconds, standardized |
| Field format | Three separate headers, integer values | Structured field (RFC 8941) with named members |
| Standardization | None (de-facto convention) | IETF draft (draft-ietf-httpapi-ratelimit-headers) |
| Client SDK support | Near-universal | Growing; not yet default in most SDKs |

The conceptual shift is that the draft uses structured fields: RateLimit is one header whose value is a comma-separated set of named key=value members, parsed by a standard grammar rather than read as three independent integer headers. RateLimit-Policy uses the same grammar, with each comma-separated policy carrying semicolon-delimited parameters such as w. RateLimit-Policy is the static description of what is enforced (limit and window w in seconds), while RateLimit is the dynamic current state (how much is left right now). Legacy headers blur these two ideas into one triplet.
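To make the policy/state split concrete, here is a minimal sketch of how a client might read the two header values shown in the table. It is not a full RFC 8941 parser: it assumes integer values only and exactly the "limit;w=window" and "key=value" shapes used in this guide, and the function names parseRateLimitPolicy, parseRateLimit, and toInteger are hypothetical.

```javascript
// Minimal parser for the two draft header shapes shown in this guide.
// NOT a full RFC 8941 parser: integers only, and only the
// "<limit>;w=<window>" and "key=value" forms used above.

// "100;w=60, 5000;w=86400" -> [{ limit: 100, windowSeconds: 60 }, { limit: 5000, windowSeconds: 86400 }]
function parseRateLimitPolicy(value) {
  return value.split(",").map((item) => {
    const [limitText, ...params] = item.trim().split(";");
    const policy = { limit: toInteger(limitText) };
    for (const param of params) {
      const [key, raw] = param.trim().split("=");
      if (key === "w") policy.windowSeconds = toInteger(raw);
    }
    return policy;
  });
}

// "limit=100, remaining=42, reset=57" -> { limit: 100, remaining: 42, reset: 57 }
function parseRateLimit(value) {
  const state = {};
  for (const member of value.split(",")) {
    const [key, raw] = member.trim().split("=");
    state[key] = toInteger(raw);
  }
  return state;
}

// Reject anything that is not a plain non-negative integer,
// e.g. the empty item left behind by a trailing comma.
function toInteger(text) {
  if (!/^\d+$/.test(text ?? "")) {
    throw new Error(`not an integer: ${JSON.stringify(text)}`);
  }
  return Number(text);
}

console.log(parseRateLimitPolicy("100;w=60, 5000;w=86400"));
console.log(parseRateLimit("limit=100, remaining=42, reset=57"));
```

The first call returns the static rules (two policies), the second the current standing against one of them. A production client would more likely use a maintained structured-fields library than a hand-written splitter like this.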

[Figure: Legacy single-policy triplet versus the IETF draft policy and state headers. The legacy triplet expresses one policy (X-RateLimit-Limit: 100, X-RateLimit-Remaining: 42, X-RateLimit-Reset: 57; the daily cap is unrepresentable), while the IETF draft splits the static policy description (RateLimit-Policy: 100;w=60, 5000;w=86400) from the dynamic current state (RateLimit: limit=100, remaining=42, reset=57) and can list multiple policies in one field. Migration: emit both legacy and draft headers until clients move.]

Decision: which to emit

| Situation | Recommendation |
| --- | --- |
| Existing API, many third-party clients | Keep legacy; add draft headers; drop legacy only after a long deprecation window |
| Brand-new API, you control the SDK | Emit draft headers; optionally mirror legacy for tooling that expects it |
| Multiple layered policies (per-minute + per-day) | Emit draft RateLimit-Policy; legacy cannot represent this |
| Behind a gateway you don’t fully control | Emit whatever the gateway passes through cleanly; verify it doesn’t strip structured fields |
| Internal-only API | Either; pick the one your internal clients parse and standardize on it |

Dual-emitting both header families

The migration-safe approach computes the limiter state once and writes both header families from it. The draft reset is delta-seconds, so align legacy X-RateLimit-Reset to delta-seconds too and the two stay trivially consistent.

// Emit legacy X-RateLimit-* AND IETF RateLimit/RateLimit-Policy from one decision.
// state: { limit, remaining, resetSeconds, allowed } plus the static policies list.
function setRateLimitHeaders(res, state, policies) {
  // Round up to whole seconds; when the quota is exhausted, never report a 0-second wait.
  const reset = Math.max(state.remaining > 0 ? 0 : 1, Math.ceil(state.resetSeconds));

  // --- Legacy triplet (delta-seconds Reset, to match the draft) ---
  res.set("X-RateLimit-Limit", String(state.limit));
  res.set("X-RateLimit-Remaining", String(Math.max(0, state.remaining)));
  res.set("X-RateLimit-Reset", String(reset));

  // --- IETF draft: structured fields ---
  // RateLimit-Policy lists every enforced policy: "<limit>;w=<window-seconds>".
  res.set(
    "RateLimit-Policy",
    policies.map((p) => `${p.limit};w=${p.windowSeconds}`).join(", "),
  );
  // RateLimit carries the current state for the most-constrained policy.
  res.set(
    "RateLimit",
    `limit=${state.limit}, remaining=${Math.max(0, state.remaining)}, reset=${reset}`,
  );

  if (!state.allowed) res.set("Retry-After", String(reset)); // shared by both
}

// Example call for a 100/min + 5000/day endpoint:
setRateLimitHeaders(res, { limit: 100, remaining: 42, resetSeconds: 57, allowed: true }, [
  { limit: 100, windowSeconds: 60 },
  { limit: 5000, windowSeconds: 86400 },
]);

The pattern has the same shape in FastAPI: compute the limiter state once (see emitting X-RateLimit headers for the full middleware) and write both header families onto the response object before the body streams.

Gotchas & edge cases

  • The draft is a moving target. It has gone through multiple revisions and the field grammar has changed between them: early revisions, for example, used three separate RateLimit-Limit, RateLimit-Remaining, and RateLimit-Reset fields rather than a single RateLimit field with named members. Pin to a specific draft version in your docs and don’t assume a client parses the exact revision you emit.
  • Structured-field parsing is stricter. RateLimit: limit=100, remaining=42, reset=57 must follow RFC 8941 syntax: members are key=value pairs separated by commas. A trailing comma or a space around the = can make a strict parser reject the whole field. Generate it, don’t hand-concatenate ad hoc.
  • Gateways may strip unknown headers. Some proxies pass X-* through but normalize or drop headers they don’t recognize. Verify the draft headers survive the full path from origin to client.
  • RateLimit reflects one policy; RateLimit-Policy lists all. A common mistake is putting every policy’s remaining count into RateLimit. The dynamic RateLimit header describes the single most-constrained policy’s current state; the enumeration belongs in RateLimit-Policy.
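  • Don’t emit contradictory resets. If legacy X-RateLimit-Reset is epoch and draft reset is delta, clients reading both see two different “reset” numbers. Align both to delta-seconds during dual-emit, unless existing clients already depend on an epoch X-RateLimit-Reset; changing the legacy encoding is itself a breaking change, so in that case keep epoch on the legacy header and document the difference.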
  • Retry-After is shared, not duplicated. Both families reuse the standard Retry-After; there is no draft-specific retry header. Emit it once.
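Two of the points above, reporting only the most-constrained policy in RateLimit and generating the field rather than concatenating it by hand, can be sketched together. The helper names pickMostConstrained and serializeIntegerMembers are hypothetical, and the rule for "most constrained" (fewest remaining requests, ties broken by the longer reset) is an assumption, since the source does not define one.

```javascript
// Hypothetical helpers; names and the tie-breaking rule are assumptions.

// Pick the policy a client will run into first: fewest remaining requests,
// and on a tie the one that takes longer to reset.
// Expects a non-empty array of { limit, remaining, resetSeconds }.
function pickMostConstrained(states) {
  return states.reduce((worst, s) =>
    s.remaining < worst.remaining ||
    (s.remaining === worst.remaining && s.resetSeconds > worst.resetSeconds)
      ? s
      : worst,
  );
}

// Serialize integer members as "key=value, key=value" from an object, so the
// separators are produced in one place and bad input fails loudly.
function serializeIntegerMembers(members) {
  return Object.entries(members)
    .map(([key, value]) => {
      // RFC 8941 keys: lowercase letter or "*", then lowercase/digit/"_"/"-"/"."/"*".
      if (!/^[a-z*][a-z0-9_.*-]*$/.test(key)) throw new Error(`bad key: ${key}`);
      if (!Number.isInteger(value) || value < 0) throw new Error(`bad value for ${key}`);
      return `${key}=${value}`;
    })
    .join(", ");
}

const states = [
  { limit: 100, remaining: 42, resetSeconds: 57 },       // per-minute policy
  { limit: 5000, remaining: 3100, resetSeconds: 40000 }, // per-day policy
];
const s = pickMostConstrained(states);
console.log(
  serializeIntegerMembers({ limit: s.limit, remaining: s.remaining, reset: s.resetSeconds }),
);
// prints: limit=100, remaining=42, reset=57
```

Whatever selection rule you settle on, the point is that RateLimit carries exactly one policy's state while the full list stays in RateLimit-Policy.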

Verification & testing

Confirm both header families appear in the response with curl -i; curl only shows the raw values, so check that they parse in a separate test.

curl -i -s -H "X-API-Key: acct_42" https://api.example.com/v1/search | grep -i \
  -E "ratelimit|retry-after"
# x-ratelimit-limit: 100
# x-ratelimit-remaining: 42
# x-ratelimit-reset: 57
# ratelimit-policy: 100;w=60, 5000;w=86400
# ratelimit: limit=100, remaining=42, reset=57

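Assert in a test that the reset member of RateLimit equals legacy X-RateLimit-Reset, and that RateLimit-Policy parses into the expected list of (limit, window) pairs. A server cannot observe which response headers a client actually reads, so access logs alone will not show legacy-header reads dropping. Use a proxy signal instead, such as the share of requests coming from SDK versions that parse the draft headers (identifiable by User-Agent if your SDK sets one); once legacy-only clients have effectively disappeared, it is safe to retire the legacy triplet.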

Frequently Asked Questions

Is the IETF RateLimit header a finalized standard yet?

No. It is an active IETF draft (draft-ietf-httpapi-ratelimit-headers) that has changed across revisions, including the field grammar. Treat it as stable enough to emit but pin the draft version in your docs, and keep emitting the legacy triplet for clients that don't parse the draft.

Can I express two limits (per-minute and per-day) with these headers?

Only with the draft. RateLimit-Policy: 100;w=60, 5000;w=86400 lists both policies in one field. The legacy X-RateLimit-* triplet carries a single policy, so with legacy headers you can only expose the most-constrained one unless you invent non-standard extra headers.

Should I drop X-RateLimit-* once I emit the draft headers?

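Not immediately. Most client SDKs still read X-RateLimit-*, so dual-emit both for a deprecation window. The server cannot see which response headers a client reads, so track the migration through a proxy signal such as the SDK versions that appear in your request logs. Drop the legacy triplet only once traffic from clients that depend on it has effectively gone to zero.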

What's the difference between RateLimit and RateLimit-Policy?

RateLimit-Policy is the static description of what is enforced — each policy's limit and window. RateLimit is the dynamic current state for the most-constrained policy: limit, remaining, and reset right now. One describes the rules; the other reports where you stand against them.