Handling 429 Too Many Requests in React

HTTP 429 Too Many Requests signals explicit backend throttling. The task here is wiring a React app to respect that signal through hooks, interceptors, and UI state, so that naive handling does not degrade into cascading failures, wasted compute, and lost user trust. This guide sits under Handling 429 HTTP Responses, which establishes the protocol-level contract these React patterns implement. Treating a 429 as a generic network error, or polling at a fixed interval, ignores what the status code means. Retry logic should instead follow the server-provided rate-limit signals, waiting out the stated cooldown rather than probing a throttled endpoint.

Consider a realistic shape: a dashboard that fans out 6–8 parallel queries on mount against an API capped at 60 requests/minute per key. A single navigation can exhaust the window in under a second, and every component that retries blindly turns one 429 into a sustained throttling loop. The patterns below serialize that burst, honor the server’s cooldown, and keep the interface responsive throughout.

Header Parsing & Rate Limit Signal Extraction

Effective 429 handling begins with precise extraction and normalization of rate-limit headers. Backend systems typically emit Retry-After (defined in RFC 9110; RFC 6585, which defines the 429 status, permits it on the response), the de facto X-RateLimit-Remaining and X-RateLimit-Reset headers, and the unprefixed RateLimit fields from the IETF draft. React data-fetching layers must parse these values deterministically before scheduling retries.

Retry-After: Accepts either an integer (seconds) or an HTTP-date string. Must be normalized to milliseconds relative to Date.now().
RateLimit-Remaining: Indicates available quota in the current window. Values <= 1 should trigger proactive queue suspension.
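X-RateLimit-Reset / RateLimit-Reset: Marks the window reset, but the format varies by API. Some backends send a Unix epoch timestamp in seconds, others (including early versions of the IETF draft's RateLimit-Reset) send the number of seconds remaining. Confirm which form your backend uses. Epoch values are timezone-agnostic and need only a subtraction from the current time to yield the cooldown duration.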

Normalization logic should be centralized to prevent drift across parallel queries. Parsing failures or missing headers must default to safe exponential backoff rather than immediate retry.
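
One way to centralize that normalization is a single pure function that every retry path calls. The sketch below is illustrative rather than a reference implementation: parseCooldownMs, HeaderReader, the 120-second ceiling, and the heuristic for telling epoch seconds from seconds-remaining are all assumptions introduced here, not part of any library.

```typescript
// Minimal structural type so this works with fetch's Headers or any test double.
interface HeaderReader {
  get(name: string): string | null;
}

// Hypothetical helper: returns a cooldown in milliseconds, or null when no
// usable rate-limit header is present (callers should then use backoff).
function parseCooldownMs(
  headers: HeaderReader,
  now: number = Date.now(),
  maxMs: number = 120_000, // assumed ceiling so a bad header cannot stall the client
): number | null {
  const clamp = (ms: number): number => Math.min(Math.max(0, Math.round(ms)), maxMs);

  const retryAfter = headers.get('retry-after')?.trim();
  if (retryAfter) {
    // Form 1: delay-seconds, a non-negative integer.
    if (/^\d+$/.test(retryAfter)) return clamp(Number(retryAfter) * 1000);
    // Form 2: HTTP-date, converted to a delta from `now`.
    const dateMs = Date.parse(retryAfter);
    if (!Number.isNaN(dateMs)) return clamp(dateMs - now);
  }

  const reset = (headers.get('x-ratelimit-reset') ?? headers.get('ratelimit-reset'))?.trim();
  if (reset && /^\d+$/.test(reset)) {
    const value = Number(reset);
    // Assumption: values above ~1e9 are epoch seconds, smaller ones are
    // seconds remaining. Vendors differ, so confirm against your API.
    return value > 1_000_000_000 ? clamp(value * 1000 - now) : clamp(value * 1000);
  }

  return null; // nothing parseable: fall back to exponential backoff
}
```

Returning null instead of a default delay keeps the "no header" case explicit, so the caller decides which backoff policy applies.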

Exponential Backoff with Jitter Implementation

When explicit headers are absent or unparseable, exponential backoff with randomized jitter prevents synchronized retry storms (the thundering herd effect). The following TypeScript hook wraps fetch, honors Retry-After when present, falls back to jittered backoff otherwise, and invalidates the matching query cache once a request succeeds.

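// useRateLimitedFetch.ts
import { useState, useCallback, useMemo } from 'react';
import { useQueryClient } from '@tanstack/react-query';

export interface RateLimitConfig {
  maxRetries: number;
  baseDelayMs: number;
  maxDelayMs: number;
  jitterFactor: number; // 0.0 to 1.0
}

const DEFAULT_CONFIG: RateLimitConfig = {
  maxRetries: 3,
  baseDelayMs: 1000,
  maxDelayMs: 30000,
  jitterFactor: 0.25,
};

export function useRateLimitedFetch(config: Partial<RateLimitConfig> = {}) {
  const [isRateLimited, setIsRateLimited] = useState(false);
  const queryClient = useQueryClient();
  const { maxRetries, baseDelayMs, maxDelayMs, jitterFactor } = { ...DEFAULT_CONFIG, ...config };
  // Memoize on the primitive values so executeWithRetry keeps a stable identity
  // even when callers pass a fresh config object on every render.
  const cfg = useMemo(
    () => ({ maxRetries, baseDelayMs, maxDelayMs, jitterFactor }),
    [maxRetries, baseDelayMs, maxDelayMs, jitterFactor],
  );

  const calculateDelay = useCallback((attempt: number, retryAfterHeader?: string): number => {
    if (retryAfterHeader) {
      // Retry-After is either delay-seconds or an HTTP-date.
      const value = retryAfterHeader.trim();
      const ms = /^\d+$/.test(value) ? Number(value) * 1000 : Date.parse(value) - Date.now();
      // Honor the server's cooldown; fall through to backoff if it is unparseable.
      if (!Number.isNaN(ms)) return Math.max(0, ms);
    }
    const exponential = Math.min(cfg.baseDelayMs * Math.pow(2, attempt), cfg.maxDelayMs);
    const jitter = exponential * cfg.jitterFactor * (Math.random() * 2 - 1);
    return Math.max(100, Math.round(exponential + jitter));
  }, [cfg]);

  const executeWithRetry = useCallback(async (url: string, options?: RequestInit) => {
    let attempt = 0;
    while (attempt <= cfg.maxRetries) {
      // Network errors propagate to the caller; only 429 is retried here.
      const response = await fetch(url, options);
      if (response.status === 429) {
        setIsRateLimited(true);
        // Do not sleep after the final attempt; fail immediately instead.
        if (attempt >= cfg.maxRetries) break;
        const retryAfter = response.headers.get('Retry-After') || undefined;
        const delay = calculateDelay(attempt, retryAfter);
        await new Promise(resolve => setTimeout(resolve, delay));
        attempt++;
        continue;
      }
      setIsRateLimited(false);
      // Invalidate only after a successful response so errors do not trigger refetches.
      if (response.ok) queryClient.invalidateQueries({ queryKey: [url] });
      return response;
    }
    // Clear the flag so the UI can offer a manual retry instead of staying locked.
    setIsRateLimited(false);
    throw new Error('Max retries exceeded for rate-limited request');
  }, [cfg, calculateDelay, queryClient]);

  return { executeWithRetry, isRateLimited };
}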
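
To see what the backoff formula produces, it can help to pull the delay calculation out into a pure function with an injectable random source. This is a sketch under that assumption; backoffDelayMs and BackoffConfig are names introduced here for illustration, using the same defaults as the hook's configuration.

```typescript
interface BackoffConfig {
  baseDelayMs: number;
  maxDelayMs: number;
  jitterFactor: number; // 0.0 to 1.0
}

// `random` is injectable so tests can pin the jitter; it defaults to Math.random.
function backoffDelayMs(
  attempt: number,
  cfg: BackoffConfig,
  random: () => number = Math.random,
): number {
  const exponential = Math.min(cfg.baseDelayMs * 2 ** attempt, cfg.maxDelayMs);
  // Map random() in [0, 1) to a symmetric offset in [-jitterFactor, +jitterFactor).
  const jitter = exponential * cfg.jitterFactor * (random() * 2 - 1);
  return Math.max(100, Math.round(exponential + jitter));
}

const backoffCfg: BackoffConfig = { baseDelayMs: 1000, maxDelayMs: 30_000, jitterFactor: 0.25 };

// With jitter pinned to its midpoint (random() === 0.5) the schedule is
// 1000, 2000, 4000, 8000, 16000, then capped at 30000.
const schedule = [0, 1, 2, 3, 4, 5].map((n) => backoffDelayMs(n, backoffCfg, () => 0.5));

// With random() === 0 each delay sits 25% below the midpoint: 750 for attempt 0.
const lowest = backoffDelayMs(0, backoffCfg, () => 0);
```

Because jitter is applied after the cap, a capped delay can land up to 25% above maxDelayMs. If the ceiling must be strict, clamp once more after adding jitter.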

For Axios-based architectures, a response interceptor provides transparent retry orchestration without modifying individual call sites.

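// axios-rate-limit-interceptor.ts
import type { AxiosInstance } from 'axios';

const FALLBACK_COOLDOWN_MS = 1000;

export function setupRateLimitInterceptor(instance: AxiosInstance) {
  instance.interceptors.response.use(
    (response) => response,
    async (error) => {
      const originalRequest = error.config;
      // The _retry flag replays each request at most once, so a persistent 429 cannot loop.
      if (error.response?.status === 429 && !originalRequest._retry) {
        originalRequest._retry = true;
        const retryAfter = error.response.headers['retry-after'];
        const rateLimitReset = error.response.headers['x-ratelimit-reset'];
        const now = Date.now();
        let cooldownMs = FALLBACK_COOLDOWN_MS;

        if (retryAfter) {
          // Retry-After is either delay-seconds or an HTTP-date.
          cooldownMs = /^\d+$/.test(retryAfter)
            ? parseInt(retryAfter, 10) * 1000
            : new Date(retryAfter).getTime() - now;
        } else if (rateLimitReset) {
          // Assumes epoch seconds; some APIs send seconds-until-reset instead.
          cooldownMs = parseInt(rateLimitReset, 10) * 1000 - now;
        }
        // Unparseable values fall back to the default; past dates clamp to zero.
        cooldownMs = Number.isFinite(cooldownMs) ? Math.max(0, cooldownMs) : FALLBACK_COOLDOWN_MS;

        // Small random offset so parallel requests do not replay in lockstep.
        const jitter = Math.random() * 500;
        await new Promise(resolve => setTimeout(resolve, cooldownMs + jitter));
        return instance(originalRequest);
      }
      return Promise.reject(error);
    }
  );
}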

State Synchronization & UI Feedback Patterns

429 responses must be mapped to explicit React state machines rather than generic error boundaries. During cooldown windows, interactive components should be disabled, pending mutations queued, and non-blocking progress indicators rendered. Integrating broader Frontend Resilience & UX Handling principles ensures that API degradation translates to graceful UI adaptation rather than perceived application failure.
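
A reducer is one lightweight way to express such a state machine. The sketch below is an assumption about how the states might be modeled, not a prescribed design: the state names, event names, and rateLimitReducer itself are hypothetical and introduced only for illustration.

```typescript
type RateLimitState =
  | { status: 'idle' }
  | { status: 'coolingDown'; retryAt: number; queued: number }
  | { status: 'retrying'; queued: number };

type RateLimitEvent =
  | { type: 'THROTTLED'; retryAt: number } // a 429 arrived; retryAt is epoch ms
  | { type: 'MUTATION_QUEUED' }            // user action deferred during cooldown
  | { type: 'COOLDOWN_ELAPSED' }           // timer fired, replay is starting
  | { type: 'SUCCEEDED' };                 // replay returned a non-429 response

function rateLimitReducer(state: RateLimitState, event: RateLimitEvent): RateLimitState {
  switch (event.type) {
    case 'THROTTLED':
      // Entering (or re-entering) cooldown keeps any already-queued mutations.
      return {
        status: 'coolingDown',
        retryAt: event.retryAt,
        queued: state.status === 'idle' ? 0 : state.queued,
      };
    case 'MUTATION_QUEUED':
      // Mutations are only queued while a cooldown is active.
      return state.status === 'coolingDown' ? { ...state, queued: state.queued + 1 } : state;
    case 'COOLDOWN_ELAPSED':
      return state.status === 'coolingDown'
        ? { status: 'retrying', queued: state.queued }
        : state;
    case 'SUCCEEDED':
      return { status: 'idle' };
  }
}

// Interactive controls stay disabled in every state except idle.
const controlsDisabled = (state: RateLimitState): boolean => state.status !== 'idle';
```

In a component this would typically be driven by useReducer(rateLimitReducer, { status: 'idle' }), with retryAt feeding the countdown and controlsDisabled feeding the disabled prop.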

A centralized request queue manager serializes outbound calls, preventing concurrent bursts from exhausting remaining quota.

// request-queue-manager.ts
export interface QueuedRequest {
  executor: () => Promise<unknown>;
  resolve: (value: unknown) => void;
  reject: (reason?: unknown) => void;
}

export class RequestQueue {
  private queue: QueuedRequest[] = [];
  private isProcessing = false;
  private concurrencyLimit: number;
  private isPaused = false;
  private resumeTimer: ReturnType<typeof setTimeout> | null = null;

  constructor(concurrencyLimit = 1) {
    // A limit below 1 would produce empty batches and never drain the queue.
    this.concurrencyLimit = Math.max(1, concurrencyLimit);
  }

  enqueue<T>(executor: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      this.queue.push({ executor, resolve: resolve as (value: unknown) => void, reject });
      if (!this.isPaused) this.processQueue();
    });
  }

  // Pause when quota is nearly exhausted; resetTimestamp is in epoch seconds.
  evaluateRateLimit(remaining: number, resetTimestamp: number): void {
    if (remaining <= 1) {
      this.isPaused = true;
      const waitMs = Math.max(0, resetTimestamp * 1000 - Date.now());
      // Replace any pending timer so a stale, earlier reset cannot resume the queue early.
      if (this.resumeTimer) clearTimeout(this.resumeTimer);
      this.resumeTimer = setTimeout(() => this.resume(), waitMs);
    }
  }

  resume() {
    if (this.resumeTimer) clearTimeout(this.resumeTimer);
    this.resumeTimer = null;
    this.isPaused = false;
    this.processQueue();
  }

  private async processQueue() {
    if (this.isProcessing || this.isPaused || this.queue.length === 0) return;
    this.isProcessing = true;

    // Take up to concurrencyLimit requests and wait for the whole batch to settle.
    const batch = this.queue.splice(0, this.concurrencyLimit);
    await Promise.allSettled(
      batch.map(async ({ executor, resolve, reject }) => {
        try { resolve(await executor()); }
        catch (err) { reject(err); }
      })
    );

    this.isProcessing = false;
    this.processQueue();
  }
}

export const requestQueue = new RequestQueue(1);
export const enqueue = <T>(executor: () => Promise<T>) => requestQueue.enqueue(executor);
// Resumes the shared queue after a pause.
export const processQueue = () => requestQueue.resume();

Circuit Breakers & Graceful Degradation

When consecutive 429s exceed a defined threshold, the client must trip a circuit breaker to halt outbound traffic entirely. This prevents resource exhaustion and shifts the application into a degraded-but-functional state.

Implementation strategy:

State Tracking: Maintain a sliding window or counter for consecutive 429s per endpoint.
Trip Condition: After 3 consecutive 429s within a 60-second window, transition to OPEN state.
Fallback Routing: Serve cached/stale data via stale-while-revalidate patterns. Disable mutation triggers (POST/PUT/DELETE).
User Messaging: Render explicit, non-alarming UI components indicating temporary service constraints, optionally including a countdown to the next retry window.
Half-Open Transition: After the calculated reset window expires, allow a single probe request. Success transitions to CLOSED; failure re-trips the breaker.
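
The transitions above can be sketched as a small class with an injectable clock. This is an illustrative outline under stated assumptions, not a drop-in implementation: RateLimitBreaker and its method names are hypothetical, and the defaults of 3 failures within 60 seconds simply mirror the trip condition listed above.

```typescript
type BreakerState = 'CLOSED' | 'OPEN' | 'HALF_OPEN';

class RateLimitBreaker {
  private state: BreakerState = 'CLOSED';
  private recent429s: number[] = []; // timestamps (ms) of consecutive 429s
  private openUntil = 0;
  private threshold: number;
  private windowMs: number;
  private now: () => number;

  constructor(threshold = 3, windowMs = 60_000, now: () => number = Date.now) {
    this.threshold = threshold;
    this.windowMs = windowMs;
    this.now = now; // injectable clock, so tests need no real timers
  }

  // Call before sending a request; false means traffic must stay halted.
  canRequest(): boolean {
    if (this.state === 'OPEN' && this.now() >= this.openUntil) {
      this.state = 'HALF_OPEN'; // let exactly one probe through
      return true;
    }
    return this.state === 'CLOSED';
  }

  // Any non-429 response closes the breaker and clears the streak.
  recordSuccess(): void {
    this.recent429s = [];
    this.state = 'CLOSED';
  }

  // cooldownMs is the server-provided or computed wait for this 429.
  record429(cooldownMs: number): void {
    const t = this.now();
    this.recent429s = this.recent429s.filter((ts) => t - ts < this.windowMs);
    this.recent429s.push(t);
    // A failed probe re-trips immediately; otherwise trip at the threshold.
    if (this.state === 'HALF_OPEN' || this.recent429s.length >= this.threshold) {
      this.state = 'OPEN';
      this.openUntil = t + cooldownMs;
    }
  }

  get current(): BreakerState {
    return this.state;
  }
}
```

A caller would keep one breaker per endpoint, check canRequest() before each send, and serve cached data while it returns false.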

Failure Mode Analysis

Scenario	Impact	Mitigation
Missing or Malformed Retry-After Header	Client cannot calculate exact cooldown, leading to premature retries and secondary 429 cascades.	Implement a safe fallback exponential backoff (base: 1000ms, max: 30000ms) with ±25% randomized jitter. Log header absence for backend telemetry.
Concurrent Request Burst on Mount	Multiple `useEffect` or parallel queries exhaust the rate limit window simultaneously, triggering immediate throttling.	Implement request deduplication and a centralized queue manager. Serialize outbound calls when remaining quota drops below 2.
Excessive Cooldown Periods (>60s)	UI appears frozen or broken, causing user abandonment or manual refresh loops.	Activate circuit breaker state. Serve cached/stale data, display a countdown timer to next retry, and disable mutation triggers until the window resets.
CDN/WAF Layer Throttling vs Application Layer	Application-level headers are stripped or overridden by edge proxies, breaking client-side parsing logic.	Detect edge-specific headers (names vary by provider; `CF-RateLimit` and `X-Edge-Status` here are illustrative placeholders, so check your CDN's documentation). Implement a dual-parsing strategy that prioritizes application headers but falls back to edge directives when available.

Verification & Testing

Drive a synthetic burst against a mocked endpoint and assert that the hook serializes calls and honors the cooldown rather than hammering the server.

Mock the endpoint to return `429` with `Retry-After: 2` for the first two calls, then `200`.
Assert the component renders a disabled control with a visible countdown during the cooldown.
Assert no more than one in-flight request per endpoint while `isRateLimited` is true.
Assert `queryClient.invalidateQueries` fires exactly once on eventual success.
Confirm an absent `Retry-After` falls back to capped exponential backoff, not immediate retry.

# Drive 8 parallel mount-time requests and confirm the accepted count
# tracks the 60 rpm window rather than spiking to 8 immediate hits.
hey -n 8 -c 8 -H "X-API-Key: acct_42" http://localhost:5173/api/v1/dashboard \
  | grep -E "Status code distribution" -A 4

Frequently Asked Questions

Should I handle 429 in a hook or an Axios interceptor?

Use an interceptor for transport-level concerns — retry scheduling, Retry-After parsing, and replay — so every call site inherits the behavior without changes. Use a hook for component-level UX state like isRateLimited and countdowns. They are complementary: the interceptor decides when to retry, the hook decides what the user sees.

Does React Query's retry option handle 429 correctly?

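Not by default. React Query retries failed queries up to three times using its own exponential backoff (doubling from 1000 ms, capped at 30 seconds) and never reads Retry-After. Override retry to only retry on error.status === 429 within a cap, and override retryDelay to read the server's Retry-After before falling back to exponential backoff. Otherwise you re-trigger the limit and stack 429s. Both callbacks only see what your queryFn throws, so the thrown error must carry the status and the parsed header.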
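
As a rough sketch, the two overrides can be written as plain functions matching the (failureCount, error) callback shape that the retry and retryDelay options accept. The HttpErrorLike shape, the retryAfterMs field, and the two constants are assumptions introduced here; your queryFn would need to throw an error carrying them.

```typescript
// Hypothetical error shape: the queryFn is assumed to throw an object with
// the HTTP status and, when available, the parsed Retry-After in ms.
interface HttpErrorLike {
  status: number;
  retryAfterMs?: number | null;
}

function isHttpError(e: unknown): e is HttpErrorLike {
  return typeof e === 'object' && e !== null
    && typeof (e as { status?: unknown }).status === 'number';
}

const MAX_429_RETRIES = 3;   // assumed cap
const MAX_DELAY_MS = 30_000; // assumed ceiling

// For the `retry` option: retry only 429s, and only within the cap.
function retryOn429(failureCount: number, error: unknown): boolean {
  return isHttpError(error) && error.status === 429 && failureCount < MAX_429_RETRIES;
}

// For the `retryDelay` option: prefer the server's cooldown, else back off.
function retryDelayFor429(failureCount: number, error: unknown): number {
  if (isHttpError(error) && typeof error.retryAfterMs === 'number') {
    return Math.min(Math.max(0, error.retryAfterMs), MAX_DELAY_MS);
  }
  return Math.min(1000 * 2 ** failureCount, MAX_DELAY_MS);
}

// Usage (not executed here):
// useQuery({ queryKey: ['dashboard'], queryFn, retry: retryOn429, retryDelay: retryDelayFor429 });
```

Returning false for non-429 errors is a deliberate narrowing for this example; an application that also wants to retry transient 5xx responses would extend retryOn429 accordingly.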

How do I stop concurrent queries from exhausting the window on mount?

Route outbound calls through a single shared queue with a concurrency limit of 1–2 and pause it when X-RateLimit-Remaining drops to 1. Deduplicate identical in-flight requests by a method+url+payload key so a re-render does not double-fire the same call.
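
The deduplication half of that answer might look like the following sketch, which shares one promise among identical in-flight requests. The names requestKey and dedupe are hypothetical, and building the key with JSON.stringify assumes payloads serialize with a stable property order.

```typescript
// Promises for requests that have started but not yet settled, by key.
const inFlight = new Map<string, Promise<unknown>>();

// Hypothetical key builder: method + url + serialized payload.
function requestKey(method: string, url: string, payload?: unknown): string {
  const body = payload === undefined ? '' : JSON.stringify(payload);
  return `${method.toUpperCase()} ${url} ${body}`;
}

// Returns the existing promise when an identical request is already in flight,
// otherwise starts the request and tracks it until it settles.
function dedupe<T>(key: string, start: () => Promise<T>): Promise<T> {
  const existing = inFlight.get(key);
  if (existing) return existing as Promise<T>;
  const promise = start().finally(() => {
    inFlight.delete(key); // allow a fresh request once this one settles
  });
  inFlight.set(key, promise);
  return promise;
}
```

A call site would wrap its executor, for example dedupe(requestKey('GET', url), () => enqueue(() => fetch(url))), so a re-render reuses the pending promise instead of adding a second entry to the queue.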

What should the UI show during a long cooldown?

Render a non-alarming, accessible status with a live countdown — a disabled submit button labeled with the remaining seconds and aria-live="polite" — and serve cached or stale data where possible. Above roughly 60 seconds, trip a circuit breaker and surface an explicit retry affordance instead of an indefinitely frozen control.

Handling 429 HTTP Responses — the parent topic and protocol-level handling contract.
Frontend Resilience & UX Handling — the full client-side resilience overview.
Retry Queues in Axios Interceptors — the transport-layer queue these hooks pair with.
Exponential Backoff UX — turning the cooldown into honest interface feedback.
Retry-After Parsing — the seconds-vs-HTTP-date normalization the hook depends on.