When to Use Sliding Log Over Token Bucket

Selecting a rate-limiting algorithm requires balancing mathematical precision against infrastructure constraints. For platform teams and backend engineers, the decision between a sliding log and a token bucket hinges on whether exact request accounting or burst-tolerant throughput is the primary architectural driver.

Architectural Decision Matrix: Precision vs. Throughput

Before selecting a throttling strategy, engineers must evaluate whether strict compliance or high-throughput performance is the primary constraint. Foundational trade-offs are detailed in Core Rate Limiting Algorithms & Theory, which outlines why fixed-window approximations fail under strict SLA requirements.

The sliding log algorithm maintains a precise, timestamped record of every request within a rolling window. Unlike token buckets, which approximate consumption through a leaky bucket model and permit controlled bursts, sliding logs enforce strict, continuous boundaries. This architectural choice is mandated when:

Exact request counting is required for billing or compliance: Financial reconciliation and regulatory audit trails demand deterministic, non-approximate request tallies.
Rolling window boundaries must eliminate edge-case spikes: Token buckets and fixed windows inherently allow boundary doubling (e.g., 2x requests at window edges). Sliding logs mathematically eliminate this by evaluating a continuous time horizon.
Infrastructure budget supports O(N) memory scaling: Each request consumes a discrete entry in the backing store. If your platform can absorb linear memory growth in exchange for absolute accuracy, the sliding log is the correct primitive.

Exact Configuration & Implementation Patterns

Production-grade sliding log counters require atomic execution to prevent race conditions and ensure accurate eviction. The following Redis/Lua implementation provides a deterministic, single-pass evaluation pattern.

Storage & Key Architecture

Storage Backend: Redis Sorted Set (ZADD/ZREMRANGEBYSCORE)
Key Schema: rl:sliding:{tenant_id}:{route_hash}
Member Uniqueness: timestamp:random_suffix (prevents collision on concurrent requests)
Cleanup Strategy: Lazy eviction on read (ZREMRANGEBYSCORE executed during quota check)

-- Atomic sliding log evaluation
local limit = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local now = tonumber(ARGV[3])

-- Evict expired entries outside the rolling window
redis.call('ZREMRANGEBYSCORE', KEYS[1], 0, now - window)

-- Count active requests in the current window
local count = redis.call('ZCARD', KEYS[1])

-- Enforce limit
if count >= limit then
 return 0
end

-- Log new request with unique member
redis.call('ZADD', KEYS[1], now, now .. ':' .. math.random(100000))
return 1

Configuration Parameters:

Parameter	Value	Purpose
`window_ms`	`60000`	Rolling evaluation window (1 minute)
`max_requests`	`100`	Hard quota threshold per window
`member_uniqueness`	`timestamp:random_suffix`	Guarantees sorted set cardinality matches request count
`cleanup_strategy`	`lazy_eviction_on_read`	Defers memory reclamation to quota evaluation cycles

The sorted set approach guarantees exact cardinality within the rolling window, though memory overhead scales linearly. Implementation benchmarks and memory optimization patterns are documented in Sliding Log Counters.

Failure Mode Analysis & Mitigation

Failure Mode	Trigger	Impact	Mitigation
Memory Exhaustion	High-traffic endpoints with extended rolling windows	Redis OOM eviction bypasses rate limiting entirely	Enforce strict `maxmemory-policy`, implement shard-by-tenant routing, or cap sorted set cardinality with hard TTL fallback
Clock Skew & Distributed Desync	Multi-region deployments without synchronized NTP	Inaccurate window boundaries cause false 429s or quota leaks	Always derive timestamps via Redis `TIME` command; never trust application server clocks
P99 Latency Amplification	Synchronous `ZCARD` + `ZADD` execution per request	Increased tail latency under burst traffic	Pipeline commands, implement async batch logging, or deploy read-replicas for quota checks

Validation & Production Rollout Protocol

Migrating from token bucket to sliding log requires rigorous shadow testing to verify quota accuracy without disrupting live traffic. Follow this phased rollout methodology:

Deploy in Passive Mode: Run the sliding log counter alongside the existing token bucket implementation. Do not enforce 429 responses; only log evaluation results.
Log Discrepancies: Compare exact sliding log counts against token bucket approximations. Track divergence rates during peak traffic and boundary transitions.
Validate Latency Thresholds: Ensure the lua_script_execution_time_p99 delta remains under 5ms. If latency exceeds thresholds, optimize the Lua script or scale Redis horizontally.
Enable Active Enforcement: Switch to blocking mode only after a 7-day stability window confirms zero false positives and acceptable memory growth curves.

Critical Monitoring Metrics

sorted_set_cardinality_growth_rate: Tracks linear memory consumption per tenant
lua_script_execution_time_p99: Ensures atomic evaluation does not degrade API response times
false_positive_429_ratio: Validates exact quota enforcement accuracy
redis_memory_fragmentation_ratio: Detects underlying allocator inefficiencies during high-churn eviction cycles

Key Takeaways

Sliding log is strictly superior for compliance, billing, and exact quota enforcement. It eliminates boundary artifacts and provides deterministic request accounting.
Token bucket remains optimal for burst-tolerant, memory-constrained edge proxies. Use it when throughput and low-latency approximation outweigh strict counting requirements.
Always implement circuit breakers to degrade gracefully to token bucket on Redis failure. Maintain fallback routing to prevent complete API unavailability during storage outages.