Client-Side Rate-Limit State
The cheapest 429 is the one you never send — and a client that reads the RateLimit-Remaining header on every response can throttle itself before it trips the limit instead of reacting after. This guide sits in the frontend resilience and UX handling area and covers the client-side state machine: parsing the remaining/limit/reset headers into a local quota model, gating outgoing requests against it, surfacing the number to the user, and keeping that state coherent across multiple browser tabs hitting the same API.
Reactive retry (backoff, Retry-After) handles the limit after you hit it. Proactive client state is the complementary half: it spaces requests so the limit is hit far less often.
Mechanism: the local quota model
Any response that carries RateLimit-* headers gives the client enough to reconstruct the server’s view of your budget. The client keeps a small, per-key record and updates it on each response:
- remaining: requests left in the current window (from RateLimit-Remaining / X-RateLimit-Remaining).
- limit: the window’s ceiling (RateLimit-Limit).
- resetAt: epoch ms when remaining refills (RateLimit-Reset, normalized).
- cooldownUntil: epoch ms set when a 429 lands, cleared after.
The invariant: never send when remaining <= reserve and now < resetAt, where reserve is a small safety margin. Each update is O(1); the whole model is a handful of numbers per API key, cheap enough to live in memory and mirror to storage.
Configuration reference
| Param | Type | Default | Range | Effect |
|---|---|---|---|---|
| reserve | number | 2 | 0 – 10 | Stop sending while this many requests remain, as a safety margin against races |
| deferStrategy | enum | until-reset | block / until-reset / drop | What to do when the gate closes |
| share | enum | broadcast | none / storage / broadcast | How tabs share the quota record |
| storageKey | string | rl:state | n/a | localStorage/BroadcastChannel namespace |
| staleMs | number | 60000 | 1 000 – 600 000 | Treat a stored record older than this as unknown |
| epochSniffThreshold | number | 1e9 | n/a | Above this, a -Reset value is an absolute epoch |
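The table above can be expressed as a typed config object. The following is a minimal sketch, not the guide's own implementation: the `RateLimitConfig` type, `DEFAULTS`, `clamp`, and `resolveConfig` names are hypothetical, and clamping out-of-range numbers (rather than rejecting them) is an assumption.

```typescript
// Hypothetical config shape; field names follow the Param column of the table.
type DeferStrategy = "block" | "until-reset" | "drop";
type ShareMode = "none" | "storage" | "broadcast";

interface RateLimitConfig {
  reserve: number;
  deferStrategy: DeferStrategy;
  share: ShareMode;
  storageKey: string;
  staleMs: number;
  epochSniffThreshold: number;
}

// Defaults copied from the Default column.
const DEFAULTS: RateLimitConfig = {
  reserve: 2,
  deferStrategy: "until-reset",
  share: "broadcast",
  storageKey: "rl:state",
  staleMs: 60000,
  epochSniffThreshold: 1e9,
};

// Clamp n into [lo, hi]; fall back to the default when n is not a finite number.
function clamp(n: number, lo: number, hi: number, fallback: number): number {
  return Number.isFinite(n) ? Math.min(hi, Math.max(lo, n)) : fallback;
}

// Merge caller overrides onto the defaults and force the ranged fields into their documented ranges.
// Assumption: out-of-range values are clamped, not rejected.
function resolveConfig(overrides: Partial<RateLimitConfig> = {}): RateLimitConfig {
  const c = { ...DEFAULTS, ...overrides };
  return {
    ...c,
    reserve: clamp(c.reserve, 0, 10, DEFAULTS.reserve),
    staleMs: clamp(c.staleMs, 1000, 600000, DEFAULTS.staleMs),
  };
}
```

Calling `resolveConfig({ reserve: 50 })` under these assumptions yields a `reserve` of 10, the top of the documented range.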
Implementation walkthrough
A small store updated on every response, queried before every send. Mirror it to localStorage so a new tab starts warm, and broadcast updates so open tabs stay coherent.
```typescript
// rl-state.ts: proactive client-side quota tracking shared across tabs.
interface Quota { remaining: number; limit: number; resetAt: number; cooldownUntil: number; }

const KEY = "rl:state";
const RESERVE = 2;
const EPOCH_SNIFF = 1e9; // above this, a -Reset value is treated as an absolute epoch (seconds)
const UNKNOWN: Quota = { remaining: Infinity, limit: Infinity, resetAt: 0, cooldownUntil: 0 };

let q: Quota = load() ?? { ...UNKNOWN };

// Feature-test with typeof so the module also loads where BroadcastChannel is missing.
const bc = typeof BroadcastChannel !== "undefined" ? new BroadcastChannel(KEY) : null;
bc?.addEventListener("message", (e) => { q = e.data as Quota; }); // adopt peer updates

function load(): Quota | null {
  try {
    const raw = JSON.parse(localStorage.getItem(KEY) ?? "null");
    if (!raw) return null;
    // JSON.stringify turns Infinity into null, so restore the "unknown" sentinel on the way in.
    return { ...raw, remaining: raw.remaining ?? Infinity, limit: raw.limit ?? Infinity };
  } catch { return null; }
}

function persist() {
  // mirror for new tabs + storage event; ignore failures (storage unavailable or full)
  try { localStorage.setItem(KEY, JSON.stringify(q)); } catch { /* keep the in-memory copy */ }
  bc?.postMessage(q); // push to already-open tabs
}

// Parse the leading number of a header value; null when absent or not numeric.
function num(v: string | null): number | null {
  const n = v == null ? NaN : parseFloat(v);
  return Number.isFinite(n) ? n : null;
}

// Call after every response: keeps the local model in sync with the server.
export function ingest(res: Response) {
  const h = res.headers;
  const rem = num(h.get("ratelimit-remaining") ?? h.get("x-ratelimit-remaining"));
  const lim = num(h.get("ratelimit-limit") ?? h.get("x-ratelimit-limit"));
  const reset = num(h.get("ratelimit-reset") ?? h.get("x-ratelimit-reset"));
  if (rem != null) q.remaining = rem;
  if (lim != null) q.limit = lim;
  if (reset != null) {
    q.resetAt = reset > EPOCH_SNIFF ? reset * 1000 : Date.now() + reset * 1000; // epoch vs seconds-left
  }
  // A stale resetAt may already be in the past, so never cool down for less than 1 s.
  if (res.status === 429) q.cooldownUntil = Math.max(q.resetAt, Date.now() + 1000);
  persist();
}

// Call before every send: returns ms to wait, or 0 if clear to go now.
export function gateMs(): number {
  const now = Date.now();
  if (now < q.cooldownUntil) return q.cooldownUntil - now; // hard 429 cooldown
  if (q.remaining <= RESERVE && now < q.resetAt) return q.resetAt - now; // proactive defer
  return 0;
}
```
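To show how the two calls fit around a request, here is a hedged sketch of a send wrapper. `gatedSend`, `GateDeps`, and `GateClosedError` are hypothetical names; the gate and ingest functions are passed in so the block stays self-contained. It assumes `until-reset` means "wait out the gate, then send" and `drop` means "fail immediately"; the `block` strategy is left out because its semantics are not spelled out here.

```typescript
// Hypothetical wrapper around a request: gate first, send, then feed the response back.
type Strategy = "until-reset" | "drop";

interface GateDeps<R> {
  gateMs: () => number;                  // ms to wait, 0 when clear (same contract as gateMs above)
  ingest: (res: R) => void;              // updates the quota model from a response
  sleep?: (ms: number) => Promise<void>; // injectable for tests; defaults to setTimeout
}

class GateClosedError extends Error {
  waitMs: number;
  constructor(waitMs: number) {
    super("rate-limit gate closed for " + waitMs + " ms");
    this.name = "GateClosedError";
    this.waitMs = waitMs;
  }
}

const defaultSleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function gatedSend<R>(
  send: () => Promise<R>,
  deps: GateDeps<R>,
  strategy: Strategy = "until-reset",
): Promise<R> {
  const sleep = deps.sleep ?? defaultSleep;
  // Re-check the gate after each wait: another tab may have spent quota in the meantime.
  for (let wait = deps.gateMs(); wait > 0; wait = deps.gateMs()) {
    if (strategy === "drop") throw new GateClosedError(wait);
    await sleep(wait);
  }
  const res = await send();
  deps.ingest(res); // runs for every response, including a 429
  return res;
}
```

With the module above, a call might look like `gatedSend(() => fetch(url), { gateMs, ingest })`, where `url` stands in for whatever endpoint the app calls.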
Surfacing quota in the UI
The number you already track is the number the user wants to see. Bind remaining/limit to a badge (“42 / 100 this minute”) and resetAt to a countdown. When the gate is closed, disable the action and show “resets in 18s” rather than letting the click fail with a 429. This reuses the same disable/countdown machinery described in exponential backoff and UX.
| State | Gate result | UI |
|---|---|---|
| Plenty left | 0 | Control enabled, badge shows count |
| Near reserve | 0 (still sends) | Badge turns amber |
| At/below reserve | resetAt − now | Control disabled, “resets in Ns” |
| In 429 cooldown | cooldownUntil − now | Control disabled, cooldown countdown |
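The four states in the table can be derived from the quota record with one pure function. This is a sketch under assumptions: `quotaView`, `QuotaView`, and `Tone` are hypothetical names, the hint strings are illustrative, and `amberAt` (how far above the reserve the badge turns amber) is an invented threshold.

```typescript
interface Quota { remaining: number; limit: number; resetAt: number; cooldownUntil: number; }

type Tone = "ok" | "warn" | "blocked";
interface QuotaView { enabled: boolean; tone: Tone; badge: string; hint: string; }

// Map the quota record to what the UI should render at time `now` (epoch ms).
// amberAt is a hypothetical threshold, not a setting from the configuration table.
function quotaView(q: Quota, now: number, reserve = 2, amberAt = 5): QuotaView {
  // Show no badge while the counters are still unknown (Infinity).
  const known = Number.isFinite(q.remaining) && Number.isFinite(q.limit);
  const badge = known ? q.remaining + " / " + q.limit : "";
  const secs = (until: number) => Math.ceil((until - now) / 1000);

  if (now < q.cooldownUntil) {
    return { enabled: false, tone: "blocked", badge, hint: "cooling down, " + secs(q.cooldownUntil) + "s" };
  }
  if (q.remaining <= reserve && now < q.resetAt) {
    return { enabled: false, tone: "blocked", badge, hint: "resets in " + secs(q.resetAt) + "s" };
  }
  // Near the reserve but still sending: warn only while the window is live.
  if (now < q.resetAt && q.remaining <= reserve + amberAt) {
    return { enabled: true, tone: "warn", badge, hint: "" };
  }
  return { enabled: true, tone: "ok", badge, hint: "" };
}
```

Keeping this as a pure function of the record and the current time means a countdown only needs to call it again on each timer tick.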
Distributed across tabs
Five tabs of the same app share one server-side budget but, by default, five independent client models — so each can think it has full quota and collectively blow it. Sharing the record makes the client honest. The mechanism (BroadcastChannel vs storage events vs SharedWorker), the SSR/iframe caveats, and the races are covered in persisting rate-limit state across tabs.
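One way a tab could fold a peer's update into its own record, instead of overwriting it blindly, is sketched below. This is an illustrative merge rule and not necessarily the one the linked guide uses: `mergeQuota` and `toleranceMs` are hypothetical, and the rule assumes fixed windows where a later resetAt means a newer window.

```typescript
interface Quota { remaining: number; limit: number; resetAt: number; cooldownUntil: number; }

// Combine the local record with one received from another tab.
// toleranceMs absorbs jitter when resetAt was computed from a seconds-left header.
function mergeQuota(local: Quota, incoming: Quota, toleranceMs = 2000): Quota {
  // A cooldown seen by any tab applies to all of them.
  const cooldownUntil = Math.max(local.cooldownUntil, incoming.cooldownUntil);
  const delta = incoming.resetAt - local.resetAt;

  if (delta > toleranceMs) return { ...incoming, cooldownUntil }; // peer is in a newer window
  if (delta < -toleranceMs) return { ...local, cooldownUntil };   // peer message is stale

  // Same window: the lower remaining count is the more pessimistic server view.
  return {
    ...incoming,
    resetAt: Math.max(local.resetAt, incoming.resetAt),
    remaining: Math.min(local.remaining, incoming.remaining),
    cooldownUntil,
  };
}
```

Taking the lower count within a window errs toward under-sending, which matches the goal of not collectively exceeding the budget.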
Failure modes & mitigations
- Optimistic over-send across tabs. Without sharing, N tabs each spend the full budget. Mirror to storage and broadcast updates.
- Stale record. A record from an old window over-restricts. Treat anything older than staleMs, or past resetAt, as unknown and probe.
- Header absent. Not every endpoint emits RateLimit-*. Degrade to reactive handling and the Retry-After path from Retry-After parsing.
- Clock skew on epoch reset. An absolute -Reset differenced against a wrong client clock mis-times the gate; the seconds-remaining form avoids it.
- Reserve too small. Concurrent in-flight requests can each pass the gate before any response lands. A reserve of 1–2 absorbs that race.
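The stale-record mitigation can be sketched as a filter applied when a stored record is read back. The `updatedAt` field, the `StoredQuota` type, and the `usable` function are hypothetical additions for illustration; keeping a still-running cooldown even when the counters are discarded is likewise an assumption.

```typescript
interface Quota { remaining: number; limit: number; resetAt: number; cooldownUntil: number; }

// Hypothetical: the stored record also carries the time it was last written.
interface StoredQuota extends Quota { updatedAt: number; }

// "Unknown" means no restriction from the counters, so the next request acts as a probe.
const UNKNOWN: Quota = { remaining: Infinity, limit: Infinity, resetAt: 0, cooldownUntil: 0 };

// Return the record to gate against at time `now` (epoch ms).
function usable(stored: StoredQuota | null, now: number, staleMs = 60000): Quota {
  if (!stored) return { ...UNKNOWN };
  const tooOld = now - stored.updatedAt > staleMs;
  const windowOver = now >= stored.resetAt;
  if (!tooOld && !windowOver) return stored;
  // Counters are discarded, but a cooldown deadline is absolute and is kept (assumption).
  return { ...UNKNOWN, cooldownUntil: stored.cooldownUntil };
}
```

A record that fails either check gates nothing on its own, so the next response repopulates the counters from the server.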
Child topics
- Persisting Rate-Limit State Across Tabs — sharing a cooldown and remaining quota across tabs via BroadcastChannel, localStorage, and storage events, with the SSR/iframe and race caveats.
Related
- Frontend Resilience & UX Handling — the parent area for client-side rate-limit behavior.
- Retry-After Parsing — the reactive complement when you do hit a 429.
- Exponential Backoff & UX — disable/countdown UX shared with the gate.
- Handling 429 HTTP Responses — the broader response-handling guide.