Billing-Critical Sliding-Log Usage

When API usage drives an invoice, every counted request is money — an overcount overbills the customer and triggers a refund and a support ticket; an undercount is revenue you never collect. This guide explains why metered, billing-critical usage demands an exact sliding log rather than the approximate counters that are perfectly fine for ordinary throttling, and it builds the idempotency, audit, and reconciliation layer around it. It is the billing-grade companion to Tiered Access & Quota Enforcement, where the monthly quota was gated with a cheap INCR; here the same usage must be exact because it is billed, not merely capped.

The problem in concrete numbers

A customer is billed $0.002 per metered call and makes 8,000,000 calls/month — a $16,000 invoice. A sliding-window approximation with weighted interpolation typically carries ±0.5–1% error. At 1% that is ±$160 per customer per month: either you overbill by $160 (refunds, churn, possible regulatory exposure for usage-based contracts) or you undercharge by $160 (pure lost revenue, multiplied across every account). The error is not a rounding nuisance; it is a systematic financial leak. An exact sliding log counts every billable event once and exactly once, so the number on the invoice is the number of calls that happened.
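The arithmetic above can be checked in a few lines. This is a minimal sketch using the figures from this section; the function name `billing_error_band` is hypothetical, and `Decimal` is used only to keep money values free of binary floating-point rounding.

```python
from decimal import Decimal


def billing_error_band(calls: int, price_per_call: str, error_rate: str) -> tuple[Decimal, Decimal]:
    """Return (invoice_total, plus_or_minus_error), both in dollars."""
    invoice = Decimal(calls) * Decimal(price_per_call)
    return invoice, invoice * Decimal(error_rate)


# Figures from the section: 8,000,000 calls at $0.002, with 1% and 0.5% error.
invoice, band_at_1pct = billing_error_band(8_000_000, "0.002", "0.01")
_, band_at_half_pct = billing_error_band(8_000_000, "0.002", "0.005")
print(invoice, band_at_1pct, band_at_half_pct)
```

The three values are numerically 16000, 160, and 80 dollars: even the low end of the error range is an $80 swing per account per month.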

Comparison: approximate vs exact counters for billing

Property	Approximate (sliding-window / fixed-window)	Exact sliding log
Count accuracy	±0.5–1% (interpolated)	Exact — one entry per event
Memory per key	O(1), a few counters	O(n) — one entry per request in window
Replay/double-count safe	No native idempotency	Idempotency key dedupes per event
Audit trail	None — only an aggregate number	Append-only log of every billable event
Reconcilable with billing	No per-event record to reconcile	Yes — event-level join to invoices
Right for	Throttling, soft quotas	Metered billing, usage-based contracts
Cost	Cheap	Storage + write amplification

The trade is explicit: the exact log costs O(n) memory and more writes, and you pay it precisely because the output is billed. For a non-billing rate limit, the approximation is the correct, cheaper choice — exactness there buys nothing.

Why approximate counters are unacceptable here

Interpolation invents a number. Sliding-window approximation estimates the current count by weighting the previous window; the result is a plausible figure, not a count of real events. You cannot put an estimate on an invoice.
No per-event record means no audit. When a customer disputes a charge, you must show which calls you billed and when. An aggregate counter has nothing to show; an append-only log does.
Retries double-count. A counter that INCRs on every received request charges retried requests twice. Billing requires that one logical operation counts once — that is idempotency, and an aggregate counter has no place to store the idempotency key.
No reconciliation path. Finance needs to join metered events to invoice line items. Without an event-level record there is nothing to reconcile against, so drift between the meter and the bill is invisible until a customer finds it.
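To make the first point concrete, the sketch below compares a weighted sliding-window estimate with an exact count over the same events. The function names and the trace are hypothetical, and the trace is deliberately bursty to exaggerate the gap; typical traffic is smoother and lands closer to the ±0.5–1% range quoted earlier.

```python
def approx_sliding_count(prev_count: int, curr_count: int, elapsed_fraction: float) -> float:
    """Weighted sliding-window estimate.

    Assumes the previous fixed window's events were spread evenly, so only
    (1 - elapsed_fraction) of them still fall inside the sliding window.
    """
    return prev_count * (1.0 - elapsed_fraction) + curr_count


def exact_sliding_count(timestamps: list[float], now: float, window: float) -> int:
    """Count events whose timestamp lies in (now - window, now]."""
    return sum(1 for t in timestamps if now - window < t <= now)


WINDOW = 60.0
# Hypothetical trace: 100 calls in the first 10 s, then one call per second
# from t=60 to t=79.
prev_events = [i * 0.1 for i in range(100)]
curr_events = [60.0 + i for i in range(20)]
now = 90.0  # halfway through the current fixed window [60, 120)

estimate = approx_sliding_count(len(prev_events), len(curr_events), (now - 60.0) / WINDOW)
exact = exact_sliding_count(prev_events + curr_events, now, WINDOW)
print(estimate, exact)
```

The estimate still credits half of the old burst to the current window, while the exact log knows those events have aged out. Only the second number corresponds to calls that actually happened inside the window.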

Step-by-step: an idempotent, auditable sliding log

Require an idempotency key on every billable call (client-supplied, or derived from a stable request id).
On each call, atomically dedupe by idempotency key and append (timestamp, event_id) to the account's sliding log.
Mirror every appended event to an append-only audit store (the billing source of truth).
Expose the windowed exact count for quota gating and for the invoice subtotal.
Reconcile the meter against the billing system on a schedule and alert on any drift.

-- Exact, idempotent billable-usage append. One atomic Redis call.
-- KEYS[1] = sliding log (sorted set)   KEYS[2] = idempotency set
-- ARGV: now_ms, window_ms, idempotency_key, event_id
-- Returns: {status, exact_count_in_window}
--   status = 'NEW' (counted) | 'DUP' (already counted, not recounted)
local now    = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local idem   = ARGV[3]
local event  = ARGV[4]

-- Dedupe: SADD returns 0 if the idempotency key was already seen.
if redis.call('SADD', KEYS[2], idem) == 0 then
  redis.call('ZREMRANGEBYSCORE', KEYS[1], 0, now - window)   -- trim window
  return { 'DUP', redis.call('ZCARD', KEYS[1]) }
end
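-- EXPIRE applies to the whole idempotency set, not to one member: every new key
-- pushes the set's TTL out to window + 1 day (86400 s of retry grace).
redis.call('EXPIRE', KEYS[2], math.ceil(window / 1000) + 86400)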

-- Append the exact event (score = timestamp), trim, return exact count.
redis.call('ZADD', KEYS[1], now, event)
redis.call('ZREMRANGEBYSCORE', KEYS[1], 0, now - window)
return { 'NEW', redis.call('ZCARD', KEYS[1]) }
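
For unit tests that should not depend on a Redis server, the script's dedupe, append, trim, and count steps can be mirrored in memory. This is a sketch of a test double, not part of the production path: the class name `InMemoryMeter` is hypothetical, and it does not model the TTL on the idempotency set.

```python
class InMemoryMeter:
    """Test double mirroring the script: dedupe, append, trim, count."""

    def __init__(self) -> None:
        self.log: dict[str, int] = {}  # event_id -> timestamp in ms (stands in for the sorted set)
        self.seen: set[str] = set()    # idempotency keys (stands in for the set)

    def _trim(self, now_ms: int, window_ms: int) -> None:
        # Same bound as ZREMRANGEBYSCORE 0 (now - window): drop scores <= cutoff.
        cutoff = now_ms - window_ms
        self.log = {e: ts for e, ts in self.log.items() if ts > cutoff}

    def meter(self, now_ms: int, window_ms: int, idem: str, event_id: str) -> tuple[str, int]:
        if idem in self.seen:
            self._trim(now_ms, window_ms)
            return "DUP", len(self.log)
        self.seen.add(idem)
        self.log[event_id] = now_ms
        self._trim(now_ms, window_ms)
        return "NEW", len(self.log)


meter = InMemoryMeter()
first = meter.meter(1_000, 60_000, "op-abc-123", "evt-1")   # new operation
retry = meter.meter(2_000, 60_000, "op-abc-123", "evt-2")   # retry of the same operation
other = meter.meter(3_000, 60_000, "op-def-456", "evt-3")   # a second operation
later = meter.meter(62_500, 60_000, "op-ghi-789", "evt-4")  # evt-1 has aged out by now
print(first, retry, other, later)
```

The retry reports DUP and leaves the count unchanged, and the last call shows the window trim removing the oldest event while keeping the others.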

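# Append a billable event exactly once, then mirror to the audit store.
# LUA_SOURCE holds the Lua script above as a string; audit_append is the
# durable audit-store writer, defined elsewhere.
import time
import uuid

import redis

r = redis.Redis(decode_responses=True)
METER = r.register_script(LUA_SOURCE)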

def meter_call(account: str, idempotency_key: str, window_days: int = 31) -> dict:
    event_id = str(uuid.uuid4())
    now_ms = int(time.time() * 1000)
    window_ms = window_days * 86_400_000
    status, count = METER(
        keys=[f"usage:log:{account}", f"usage:idem:{account}"],
        args=[now_ms, window_ms, idempotency_key, event_id],
    )
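    if status == "NEW":
        # Append-only audit row = billing source of truth (durable, never mutated).
        # If this write fails after Redis returned NEW, a retry with the same key
        # returns DUP and skips it, so the scheduled reconciliation must catch the gap.
        audit_append(account=account, event_id=event_id, idem=idempotency_key,
                     ts_ms=now_ms)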
    return {"status": status, "billable_count": count}

The Redis sorted set serves the hot path (real-time count for quota gating); the append-only audit store (a partitioned table, an object-store log, or an event stream) is the durable record finance reconciles against. Redis can evict or fail; the audit log cannot.
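
One possible shape for the audit store is sketched below. SQLite stands in for whichever durable store is actually used, the `usage_audit` columns are assumptions based on the fields passed to `audit_append` above, and the extra `conn` parameter and the `open_audit_store` and `billable_count` helpers are hypothetical additions for the sake of a self-contained example.

```python
import sqlite3


def open_audit_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the store and create the append-only table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS usage_audit (
            account  TEXT    NOT NULL,
            event_id TEXT    NOT NULL,
            idem     TEXT    NOT NULL,
            ts_ms    INTEGER NOT NULL,
            PRIMARY KEY (account, idem)  -- one row per logical operation per account
        )
        """
    )
    return conn


def audit_append(conn: sqlite3.Connection, account: str, event_id: str, idem: str, ts_ms: int) -> bool:
    """Insert one audit row; return False if (account, idem) was already recorded."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO usage_audit (account, event_id, idem, ts_ms) VALUES (?, ?, ?, ?)",
        (account, event_id, idem, ts_ms),
    )
    conn.commit()
    return cur.rowcount == 1


def billable_count(conn: sqlite3.Connection, account: str, from_ms: int, to_ms: int) -> int:
    """Exact number of billable events in the half-open period [from_ms, to_ms)."""
    row = conn.execute(
        "SELECT count(*) FROM usage_audit WHERE account = ? AND ts_ms >= ? AND ts_ms < ?",
        (account, from_ms, to_ms),
    ).fetchone()
    return row[0]


conn = open_audit_store()
inserted = audit_append(conn, "acct_42", "evt-1", "op-abc-123", 1_000)
repeated = audit_append(conn, "acct_42", "evt-2", "op-abc-123", 2_000)
total = billable_count(conn, "acct_42", 0, 10_000)
```

The primary key on (account, idem) makes the audit write idempotent on its own, so a row can never be double-counted even if the Redis dedupe state is lost. Rows are only ever inserted, never updated.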

Gotchas & edge cases

Idempotency key scope and TTL. The key must be unique per logical billable operation and kept at least as long as the billing window plus a retry grace period, or a late retry re-counts. Scope it per account so two accounts can reuse the same client-side id.
Redis is the cache, not the ledger. Never treat the sorted set as the system of record — it has a TTL and can be evicted under memory pressure. The append-only audit store is authoritative; rebuild Redis from it if needed.
Window trimming vs billing window. The sliding log is trimmed to the gating window; the billing total comes from the audit store over the billing period. Don’t bill from the trimmed Redis set.
At-least-once delivery upstream. If events arrive over a queue with at-least-once semantics, the idempotency dedupe is what makes the meter exact — without it the queue’s redeliveries overbill.
Clock source for ordering. Use a single clock (Redis TIME or an event-time field) so events order deterministically across nodes; see distributed algorithm sync.
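
Rebuilding the Redis view from the audit store, as the second point suggests, might look roughly like this. The helper name `rebuild_log_mapping` and the `(event_id, ts_ms)` row shape are assumptions; the function only prepares the member-to-score mapping, so it can be tested without a server.

```python
from typing import Iterable


def rebuild_log_mapping(
    audit_rows: Iterable[tuple[str, int]], now_ms: int, window_ms: int
) -> dict[str, int]:
    """Build the {member: score} mapping for the sliding-log sorted set.

    audit_rows yields (event_id, ts_ms) pairs read from the audit store.
    Only rows with ts_ms > now_ms - window_ms are kept, matching the
    bound the Lua script uses when it trims the window.
    """
    cutoff = now_ms - window_ms
    return {event_id: ts_ms for event_id, ts_ms in audit_rows if ts_ms > cutoff}


rows = [("evt-1", 1_000), ("evt-2", 50_000), ("evt-3", 70_000)]
mapping = rebuild_log_mapping(rows, now_ms=70_000, window_ms=60_000)
print(mapping)

# With redis-py, a non-empty mapping can then be loaded in one call:
#   r.zadd(f"usage:log:{account}", mapping)
```

The idempotency set needs the same treatment, repopulated from the audit store's idempotency keys; otherwise a retry that arrives after the rebuild would be counted as new.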

Verification & testing

# Idempotency: send the SAME idempotency key 5 times -> count increments ONCE.
for i in $(seq 1 5); do
  curl -s -H "X-API-Key: meter_demo" -H "Idempotency-Key: op-abc-123" \
    -X POST https://api.example.com/v1/meter | jq -c '{status,billable_count}'
done
# Expect: first NEW, then four DUP; billable_count stays flat after the first.

# Reconciliation check: exact log count must equal the audit store row count.
# Both sides use the same lower bound; ZCARD would also count entries that
# have aged out of the window but have not yet been trimmed by a write.
FROM_MS=$(( ($(date +%s) - 31 * 86400) * 1000 ))
LOG=$(redis-cli ZCOUNT usage:log:acct_42 "($FROM_MS" +inf)
AUDIT=$(psql -tAc "SELECT count(*) FROM usage_audit WHERE account='acct_42' \
  AND ts_ms > $FROM_MS")
test "$LOG" = "$AUDIT" && echo "reconciled" || echo "DRIFT: log=$LOG audit=$AUDIT"

Alert on any non-zero drift between the meter and the audit store, and between the audit store and the billing system’s invoiced quantity; see alerting on 429 error rates for the alerting plumbing.
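
The two comparisons named above can be reduced to a small pure function that a scheduled job calls with the three counts. This is a sketch; `DriftReport` and `check_drift` are hypothetical names, and how the counts are fetched and how the alert is delivered are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriftReport:
    meter_vs_audit: int    # Redis sliding-log count minus audit-store row count
    audit_vs_invoice: int  # audit-store row count minus invoiced quantity

    @property
    def ok(self) -> bool:
        """True only when both comparisons show zero drift."""
        return self.meter_vs_audit == 0 and self.audit_vs_invoice == 0


def check_drift(meter_count: int, audit_count: int, invoiced_quantity: int) -> DriftReport:
    """Compare the three counts for one account over the same period."""
    return DriftReport(
        meter_vs_audit=meter_count - audit_count,
        audit_vs_invoice=audit_count - invoiced_quantity,
    )


clean = check_drift(100, 100, 100)
drifted = check_drift(101, 100, 98)
print(clean.ok, drifted)
```

Keeping the signed differences, rather than a single boolean, tells the on-call engineer which hop lost or duplicated events and in which direction.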

Frequently Asked Questions

Why not just use a fast approximate counter for billing?

Because an approximate counter carries ±0.5–1% error, which on an 8M-call account is ±$160/month of over- or under-billing, and it keeps no per-event record to audit or reconcile. Approximation is correct for throttling, where the exact number doesn't leave the system; it is wrong for anything that prints on an invoice.

Where does the idempotency key come from?

Ideally the client supplies a stable Idempotency-Key per logical operation (the same value on retries). If clients can't, derive one from a stable request fingerprint. Store it per account with a TTL longer than the billing window plus a retry grace period so a late retry can't re-count.
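
A derived key could be built along these lines. Which fields make up a stable fingerprint is an assumption here, not something this guide specifies: the sketch hashes the account, method, path, body, and a hypothetical `request_id` that stays the same across retries (for example, one assigned at the gateway). Without a field like that, two genuinely separate calls with identical payloads would collapse into one billable event.

```python
import hashlib


def derive_idempotency_key(
    account: str, method: str, path: str, body: bytes, request_id: str
) -> str:
    """Hash the request fingerprint into a fixed-length hex key.

    Each part is length-prefixed so that different field boundaries
    cannot produce the same byte stream.
    """
    digest = hashlib.sha256()
    parts = (
        account.encode("utf-8"),
        method.upper().encode("utf-8"),
        path.encode("utf-8"),
        request_id.encode("utf-8"),
        body,
    )
    for part in parts:
        digest.update(len(part).to_bytes(8, "big"))
        digest.update(part)
    return digest.hexdigest()


key_first = derive_idempotency_key("acct_42", "POST", "/v1/meter", b'{"units":1}', "req-001")
key_retry = derive_idempotency_key("acct_42", "post", "/v1/meter", b'{"units":1}', "req-001")
key_other = derive_idempotency_key("acct_42", "POST", "/v1/meter", b'{"units":1}', "req-002")
```

The retry produces the same key as the first attempt, and a different request id produces a different key even though the payload is identical.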

Is Redis the billing source of truth?

No. The Redis sorted set is the hot path for real-time gating and can be evicted or lost. The append-only audit store — a durable, never-mutated event log — is authoritative, and you rebuild the Redis view from it if needed. Bill from the audit store, not from Redis.

How do I reconcile the meter with the billing system?

Join metered events to invoice line items on event id over the billing period and compare quantities. Run it on a schedule and alert on any drift between the exact log, the audit store, and the invoiced total. Drift means a counting bug you must fix before the invoice goes out.
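
At the event level, the join amounts to a set comparison on event ids. The sketch below assumes that the ids behind each invoice line item are available as a set; `reconcile_events` and the result keys are hypothetical names.

```python
def reconcile_events(metered_ids: set[str], invoiced_ids: set[str]) -> dict[str, set[str]]:
    """Return the event ids present on one side of the join but not the other."""
    return {
        "metered_not_invoiced": metered_ids - invoiced_ids,  # revenue not collected
        "invoiced_not_metered": invoiced_ids - metered_ids,  # charges with no audit record
    }


result = reconcile_events({"evt-1", "evt-2", "evt-3"}, {"evt-2", "evt-3", "evt-9"})
print(result)
```

Both sets should be empty before an invoice goes out; a non-empty set names the exact events to investigate.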

Tiered Access & Quota Enforcement — the parent topic on per-plan limits and quota gating.
Sliding Log Counters — the exact timestamp-ledger algorithm this builds on.
Per-Tier Quota Enforcement With Redis — the cheaper INCR-gated quota for non-billing limits.
API Key Scoping & Rate Limits — scoping the metered axis correctly.
Distributed Algorithm Sync — clock sources and deterministic event ordering.