When to Use Sliding Log Over Token Bucket
Selecting a rate-limiting algorithm requires balancing mathematical precision against infrastructure constraints. This decision guide sits under the Sliding Log Counters parent topic and answers one question: whether exact request accounting or burst-tolerant throughput is the primary architectural driver for your endpoint.
The decision tree below routes a workload to a token bucket when bursts are acceptable and memory is tight, and to a sliding log when billing, compliance, or hard quotas demand exact counts.
Architectural Decision Matrix: Precision vs. Throughput
Before selecting a throttling strategy, engineers must evaluate whether strict compliance or high-throughput performance is the primary constraint. Foundational trade-offs are detailed in Core Rate Limiting Algorithms & Theory, which outlines why fixed-window approximations fail under strict SLA requirements.
The sliding log algorithm maintains a precise, timestamped record of every request within a rolling window. Unlike token buckets, which approximate consumption through a leaky bucket model and permit controlled bursts, sliding logs enforce strict, continuous boundaries. This architectural choice is mandated when:
- Exact request counting is required for billing or compliance: Financial reconciliation and regulatory audit trails demand deterministic, non-approximate request tallies.
- Rolling window boundaries must eliminate edge-case spikes: Token buckets and fixed windows inherently allow boundary doubling (e.g., 2x requests at window edges). Sliding logs mathematically eliminate this by evaluating a continuous time horizon.
- Infrastructure budget supports O(N) memory scaling: Each request consumes a discrete entry in the backing store. If your platform can absorb linear memory growth in exchange for absolute accuracy, the sliding log is the correct primitive.
Exact Configuration & Implementation Patterns
Production-grade sliding log counters require atomic execution to prevent race conditions and ensure accurate eviction. The following Redis/Lua implementation provides a deterministic, single-pass evaluation pattern.
Storage & Key Architecture
- Storage Backend: Redis Sorted Set (
ZADD/ZREMRANGEBYSCORE) - Key Schema:
rl:sliding:{tenant_id}:{route_hash} - Member Uniqueness:
timestamp:random_suffix(prevents collision on concurrent requests) - Cleanup Strategy: Lazy eviction on read (
ZREMRANGEBYSCOREexecuted during quota check)
-- Atomic sliding log evaluation
local limit = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
-- Evict expired entries outside the rolling window
redis.call('ZREMRANGEBYSCORE', KEYS[1], 0, now - window)
-- Count active requests in the current window
local count = redis.call('ZCARD', KEYS[1])
-- Enforce limit
if count >= limit then
return 0
end
-- Log new request with unique member
redis.call('ZADD', KEYS[1], now, now .. ':' .. math.random(100000))
return 1
Configuration Parameters:
| Parameter | Value | Purpose |
|---|---|---|
window_ms |
60000 |
Rolling evaluation window (1 minute) |
max_requests |
100 |
Hard quota threshold per window |
member_uniqueness |
timestamp:random_suffix |
Guarantees sorted set cardinality matches request count |
cleanup_strategy |
lazy_eviction_on_read |
Defers memory reclamation to quota evaluation cycles |
The sorted set approach guarantees exact cardinality within the rolling window, though memory overhead scales linearly. Implementation benchmarks and memory optimization patterns are documented in Sliding Log Counters.
Failure Mode Analysis & Mitigation
| Failure Mode | Trigger | Impact | Mitigation |
|---|---|---|---|
| Memory Exhaustion | High-traffic endpoints with extended rolling windows | Redis OOM eviction bypasses rate limiting entirely | Enforce strict maxmemory-policy, implement shard-by-tenant routing, or cap sorted set cardinality with hard TTL fallback |
| Clock Skew & Distributed Desync | Multi-region deployments without synchronized NTP | Inaccurate window boundaries cause false 429s or quota leaks | Always derive timestamps via Redis TIME command; never trust application server clocks |
| P99 Latency Amplification | Synchronous ZCARD + ZADD execution per request |
Increased tail latency under burst traffic | Pipeline commands, implement async batch logging, or deploy read-replicas for quota checks |
Validation & Production Rollout Protocol
Migrating from token bucket to sliding log requires rigorous shadow testing to verify quota accuracy without disrupting live traffic. Work the rollout checklist in order:
- Deploy in passive mode: run the sliding log counter alongside the existing token bucket. Do not enforce
429 - Log discrepancies:
- Validate latency thresholds: ensure the
lua_script_execution_time_p99 - Confirm memory growth: verify
sorted_set_cardinality_growth_rateis bounded per tenant and thatmaxmemory-policy - Enable active enforcement:
Critical Monitoring Metrics
sorted_set_cardinality_growth_rate: Tracks linear memory consumption per tenantlua_script_execution_time_p99: Ensures atomic evaluation does not degrade API response timesfalse_positive_429_ratio: Validates exact quota enforcement accuracyredis_memory_fragmentation_ratio: Detects underlying allocator inefficiencies during high-churn eviction cycles
Key Takeaways
- Sliding log is strictly superior for compliance, billing, and exact quota enforcement. It eliminates boundary artifacts and provides deterministic request accounting.
- Token bucket remains optimal for burst-tolerant, memory-constrained edge proxies. Use it when throughput and low-latency approximation outweigh strict counting requirements.
- Always implement circuit breakers to degrade gracefully to token bucket on Redis failure. Maintain fallback routing to prevent complete API unavailability during storage outages.
Frequently Asked Questions
Can a token bucket ever match the accuracy of a sliding log?
Not for exact accounting. A token bucket tracks a single fill level, not individual events, so it deliberately permits bursts up to its capacity and cannot reconstruct how many requests landed in any precise sub-window. If finance or compliance needs to reconcile request-by-request, only a log of timestamps gives you that.
How much more memory does a sliding log actually cost?
It scales with concurrent requests in the window, not with the number of clients alone. A 100-request limit means up to 100 sorted-set members per active key; at tens of thousands of active keys that is meaningful Redis memory, where a token bucket needs only two small fields per key. Budget for it or cap the log length.
Should I derive timestamps from the app server or from Redis?
From Redis, using the TIME command inside the Lua script. App-server clocks drift relative to each other, so across a multi-node fleet they produce inconsistent window boundaries and either false 429s or quota leaks. A single Redis clock keeps every node's evaluation deterministic.
Is a hybrid of both algorithms reasonable?
Yes, and it is common. Use a cheap token bucket or local pre-filter at the edge to shed obvious floods, and reserve the authoritative sliding log for the billing-critical decision. The log only sees traffic that already passed the coarse filter, which keeps its memory growth in check.
Related
- Sliding Log Counters — the parent topic with the full implementation and Redis patterns.
- Sliding Log Memory Footprint Tuning — sizing and capping the log when O(N) bites.
- Token Bucket Implementation — the burst-tolerant alternative this guide weighs against.
- Fixed Window vs Sliding Window — the O(1) approximations between these two extremes.