Redis Schema

Architecture documentation — Redis key schema, cache-aside pattern, sliding window rate limiting, write throttling, and data durability.

The caching layer uses Redis for four purposes: configuration caching, rate limiting, write throttling, and usage buffering/flush coordination. All keys have a TTL — nothing persists indefinitely.

Quick Reader Guide

If you are not familiar with this stack, read the tables with this mental model:

  • Key prefix: groups related Redis keys (similar to a namespace).
  • TTL: auto-expiry time; Redis deletes the key after this window.
  • Source of truth: where durable/canonical data lives (usually Postgres).
  • Control flag/lock key: coordination-only state; losing it usually affects timing, not correctness.

For the user-facing rate limiting guide (behavior, configuration, tuning), see Rate Limiting.


Key Schema Overview

All Redis keys use a structured prefix convention. Every key has a TTL — no key persists indefinitely.

| Prefix | Purpose | TTL | Example |
| --- | --- | --- | --- |
| cache:project:id: | Project config cache (by ID) | 60s | cache:project:id:uuid-123 |
| cache:project:slug: | Project config cache (by slug) | 60s | cache:project:slug:my-blog |
| cache:project:team-slug: | Team + project config cache | 60s | cache:project:team-slug:acme/my-blog |
| cache:apikey:pk: | API key config cache | 60s | cache:apikey:pk:pk_abc123 |
| ratelimit:ipx:minute: | Per-minute rate limit counter | ~60s | ratelimit:ipx:minute:pk_abc123 |
| ratelimit:ipx:day: | Per-day rate limit counter | ~24h | ratelimit:ipx:day:pk_abc123 |
| usage:apikey: | API key write throttle lock | 30s | usage:apikey:uuid-123 |
| usage:project: | Project write throttle lock | 30s | usage:project:uuid-456 |
| usage:buffer:minute: | Buffered usage counters (minute bucket hash) | 14 days | usage:buffer:minute:202603071230 |
| usage:buffer:flush:lock | Flush worker lock | ~55s | usage:buffer:flush:lock |

Configuration Cache (Cache-Aside)

File: src/server/lib/config-cache.ts
Pattern: Cache-Aside (Lazy Population)

Cached Data

Every image request requires two database lookups before processing:

  1. ProjectConfig — project ID, slug, team ID, allowed Referer domains (HTTP Referer header), allowed source domains
  2. ApiKeyConfig — key ID, secret key (stored encrypted in cache, decrypted on read), expiration, revocation status, rate limits

Read Flow
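
The read path can be sketched as a small cache-aside helper. This is a minimal sketch, not the real config-cache.ts code: the Client shape and fetchFromDb callback stand in for the Redis client and the Postgres query.

```typescript
// Minimal cache-aside sketch. Key names and TTLs follow the schema in this
// document; everything else (Client, fetchFromDb) is an illustrative stand-in.
type Client = {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
};

const NOT_FOUND = "__NOT_FOUND__";       // negative-cache sentinel
const CACHE_TTL_SECONDS = 60;            // positive cache TTL
const NEGATIVE_CACHE_TTL_SECONDS = 10;   // negative cache TTL

async function cacheAside<T>(
  client: Client,
  key: string,
  fetchFromDb: () => Promise<T | null>,
): Promise<T | null> {
  const hit = await client.get(key);
  if (hit === NOT_FOUND) return null;            // negative cache hit: skip DB
  if (hit !== null) return JSON.parse(hit) as T; // positive cache hit
  const fresh = await fetchFromDb();             // miss: fall through to Postgres
  if (fresh === null) {
    await client.set(key, NOT_FOUND, NEGATIVE_CACHE_TTL_SECONDS);
  } else {
    await client.set(key, JSON.stringify(fresh), CACHE_TTL_SECONDS);
  }
  return fresh;
}
```

On a miss the value is populated lazily, which is what "cache-aside (lazy population)" means: the cache never loads data on its own; the read path fills it.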

TTL Values

Positive cache:  60 seconds (CACHE_TTL_SECONDS)
Negative cache:  10 seconds (NEGATIVE_CACHE_TTL_SECONDS)

A shorter TTL gives fresher data but more database queries. A longer TTL gives better hit rates but delays propagation of changes. 60 seconds balances both — admin operations bypass this via active invalidation, so the TTL only matters for the rare case where invalidation fails.

Cache Key Formats

| Data | Key Format | Example |
| --- | --- | --- |
| Project (by ID) | cache:project:id:{projectId} | cache:project:id:uuid-123 |
| Project (by slug) | cache:project:slug:{slug} | cache:project:slug:my-blog |
| Project (by team + slug) | cache:project:team-slug:{teamSlug}/{projectSlug} | cache:project:team-slug:acme/my-blog |
| API Key (by public key) | cache:apikey:pk:{publicKey} | cache:apikey:pk:pk_abc123 |

Negative Caching

When a lookup returns no result (project not found / key not found), a sentinel value (__NOT_FOUND__) is cached with a shorter TTL of 10 seconds. This prevents repeated requests for non-existent slugs or public keys (e.g. probing attacks) from hitting the database on every request.

The shorter TTL ensures that when a resource is later created, it becomes visible within 10 seconds — much faster than the full 60-second positive cache TTL. Active invalidation also clears negative cache entries immediately.

Invalidation

Cache entries are removed in two ways:

  1. Automatic — Redis evicts the key after TTL expires.
  2. Manual (immediate) — Dashboard mutations call invalidation functions:

| Dashboard Action | Invalidation Function | Effect |
| --- | --- | --- |
| Update project settings | invalidateProjectCache(slug, projectId?) | Deletes all cache entries for that project (slug, ID, team+slug) |
| Delete project | invalidateProjectCache(slug, projectId?) + invalidate all project API keys | Clears project and all associated key caches |
| Revoke / update API key | invalidateApiKeyCache(publicKey) | Deletes the cache entry for that public key |
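
In sketch form, project invalidation just deletes every key variant that can point at the resource. The real invalidateProjectCache(slug, projectId?) resolves the team slug internally; in this illustrative version all identifiers are passed in explicitly, and del stands in for the Redis DEL command.

```typescript
// Hypothetical sketch of multi-key invalidation; not the real helper.
type Del = (key: string) => Promise<void>;

async function invalidateProjectKeys(
  del: Del,
  slug: string,
  projectId?: string,
  teamSlug?: string,
): Promise<void> {
  // Delete every cache entry variant for this project.
  const keys = [`cache:project:slug:${slug}`];
  if (projectId) keys.push(`cache:project:id:${projectId}`);
  if (teamSlug) keys.push(`cache:project:team-slug:${teamSlug}/${slug}`);
  await Promise.all(keys.map(del));
}
```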

Serialization and Secret Handling

ApiKeyConfig requires special handling for Redis storage:

  • Secret keys are stored in their encrypted form (encryptedSecretKey) in Redis — never as plaintext — and are only decrypted via decryptApiKey() after retrieval from cache. This prevents leaking secrets to the Redis layer.
  • Date fields (expiresAt, revokedAt) are stored as ISO 8601 strings since JSON has no native Date type, and converted back to Date objects on read.

A dedicated CachedApiKeyConfig type enforces both transformations at compile time.
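
A sketch of what that type pair could look like: only the fields named in this section come from the source; the rest of the shape is assumed for illustration.

```typescript
// In-process shape: decrypted secret, real Date objects.
interface ApiKeyConfig {
  id: string;
  secretKey: string;         // decrypted, only ever held in process memory
  expiresAt: Date | null;
  revokedAt: Date | null;
  rateLimitPerMinute: number;
  rateLimitPerDay: number;
}

// What actually goes into Redis: encrypted secret, ISO 8601 strings for dates.
interface CachedApiKeyConfig {
  id: string;
  encryptedSecretKey: string;
  expiresAt: string | null;  // ISO 8601
  revokedAt: string | null;  // ISO 8601
  rateLimitPerMinute: number;
  rateLimitPerDay: number;
}

// Rehydrate after a cache read; decrypt stands in for decryptApiKey().
function fromCache(
  cached: CachedApiKeyConfig,
  decrypt: (s: string) => string,
): ApiKeyConfig {
  return {
    id: cached.id,
    secretKey: decrypt(cached.encryptedSecretKey),
    expiresAt: cached.expiresAt ? new Date(cached.expiresAt) : null,
    revokedAt: cached.revokedAt ? new Date(cached.revokedAt) : null,
    rateLimitPerMinute: cached.rateLimitPerMinute,
    rateLimitPerDay: cached.rateLimitPerDay,
  };
}
```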

Why Revoked Keys Are Cached

The database query in getApiKeyConfig deliberately does not filter by revokedAt IS NULL. This ensures that when the cache refreshes from the database, the revokedAt timestamp is present in the cached entry. The route handler then performs a defense-in-depth check: if apiKey.revokedAt is set, the request is rejected with 401.

Without this, the filter would cause revoked keys to return null from the database — indistinguishable from a non-existent key — making the route handler's revocation check dead code.

Active invalidation via invalidateApiKeyCache remains the primary revocation mechanism; caching revokedAt acts as a safety net for stale entries.


Rate Limiting (Sliding Window)

File: src/server/lib/rate-limiter.ts
Pattern: Sliding Window Counter (via @upstash/ratelimit)

Implementation Flow

Redis Key Prefixes

| Layer | Redis Key Prefix | Window |
| --- | --- | --- |
| Per-day | ratelimit:ipx:day: | 24 hours |
| Per-minute | ratelimit:ipx:minute: | 1 minute |

Check Order Rationale

The per-day limit is checked first. This is deliberate: Upstash's .limit() is a consume-and-check operation — it decrements the counter atomically before returning the result. If the minute limit were checked first, a successful minute check would consume a minute token; a subsequent day rejection would block the request, but the minute token is already spent. Checking the day limit first inverts the problem: if day fails, no minute token is consumed.

Note: Upstash also provides a non-consuming getRemaining() method that can query remaining tokens without decrementing. This could be used as a pre-check to avoid any token waste, at the cost of an extra Redis round trip per request. The current order-swap approach avoids this overhead while eliminating the most impactful waste scenario.

Default Fallback Values

Rate limits are per API key and stored in the database. When null, fallback values are applied in config-cache.ts:

rateLimitPerMinute: apiKey.rateLimitPerMinute ?? 60,    // default fallback
rateLimitPerDay: apiKey.rateLimitPerDay ?? 10000,       // default fallback

Instance Lifecycle

Ratelimit instances are lightweight configuration objects — they hold a reference to the shared Redis client, the window algorithm, and a key prefix. They are stateless: all counters live in Redis.

In the current implementation, instances are cached in-memory by limit value:

  • minuteLimiterCache: Map<number, Ratelimit>
  • dayLimiterCache: Map<number, Ratelimit>

This avoids repeated object construction when many requests use the same effective limits.
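
That memoization can be sketched as a small factory cache; makeLimiter stands in for constructing a Ratelimit instance with the given window size.

```typescript
// Per-limit memoization sketch: one cached instance per distinct limit value.
function createLimiterCache<T>(makeLimiter: (limit: number) => T) {
  const cache = new Map<number, T>();
  return (limit: number): T => {
    let limiter = cache.get(limit);
    if (!limiter) {
      limiter = makeLimiter(limit); // construct once per distinct limit
      cache.set(limit, limiter);
    }
    return limiter;
  };
}
```

Because the instances are stateless configuration objects, reusing one across requests is safe: all counter state lives in Redis, keyed by prefix and identifier.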

Resetting Counters

For testing or administrative purposes, use the resetRateLimit function:

import { resetRateLimit } from "@/server/lib/rate-limiter";

await resetRateLimit("pk_abc123");

This clears the sliding window counters in Redis for the specified public key, across both the minute and day windows.

Analytics

Rate limit instances are created with analytics: true, which sends usage metrics to the Upstash dashboard. You can view rate limit hit/miss statistics at console.upstash.com.


Usage Activity Throttling (SET NX / "set if absent")

File: src/server/lib/usage-tracker.ts
Pattern: Distributed Lock via SET NX (Set-If-Not-Exists)

SET ... NX EX 30 means "create this key only if it does not already exist, and auto-expire it after 30 seconds."

Implementation Flow

Atomicity Guarantee

SET key value NX EX 30 is atomic in Redis. Even if 10 serverless instances execute it simultaneously for the same key, exactly one succeeds:

T=0:
  Instance A: SET usage:apikey:abc "1" NX EX 30 → "OK"    (writes DB)
  Instance B: SET usage:apikey:abc "1" NX EX 30 → null     (skips)
  Instance C: SET usage:apikey:abc "1" NX EX 30 → null     (skips)

T=30 (key expires):
  Instance D: SET usage:apikey:abc "1" NX EX 30 → "OK"    (writes DB)
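
The trace above can be reproduced with an in-memory stand-in for SET NX. Expiry is omitted for brevity, so this sketch models a single 30-second window; shouldWriteUsage is an illustrative name, not the real export.

```typescript
// In-memory stand-in for `SET key "1" NX EX ttl`: returns true only if the
// key did not already exist (TTL expiry omitted for brevity).
function makeSetNx() {
  const held = new Set<string>();
  return async (key: string, _ttlSeconds: number): Promise<boolean> => {
    if (held.has(key)) return false; // another instance already holds it
    held.add(key);
    return true;
  };
}

// Only the winner of SET NX performs the DB metadata write for this window.
async function shouldWriteUsage(
  setNx: (key: string, ttlSeconds: number) => Promise<boolean>,
  apiKeyId: string,
): Promise<boolean> {
  return setNx(`usage:apikey:${apiKeyId}`, 30);
}
```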

Why SET NX Instead of In-Memory Batching

A previous design used an in-memory Map with setTimeout to batch writes every 5 seconds. This fails in serverless because:

  1. setTimeout callbacks never fire if the container is recycled before the timer elapses
  2. Pending updates in the in-memory Map are permanently lost when the container is recycled
  3. Multiple instances each maintain separate Maps, limiting deduplication effectiveness

SET NX EX solves all three problems: it is a single atomic Redis command with no dependency on timers, container lifecycle, or instance-local state.

Execution Model

Write throttling runs as fire-and-forget — it does not await the result and does not block the image response. Errors are caught and logged to console.error but never propagated to the caller.
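
A sketch of that execution model; trackUsageInBackground is an illustrative name, not the real export.

```typescript
// Fire-and-forget: the promise is intentionally not awaited, and errors are
// swallowed after logging so they never reach the image response path.
function trackUsageInBackground(track: () => Promise<void>): void {
  void track().catch((err) => {
    console.error("usage tracking failed:", err);
  });
}
```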


Usage Metering Buffer + Flush

Files: src/server/lib/usage-tracker.ts, src/app/api/cron/flush-usage-buffer/route.ts
Pattern: Request-path Redis buffering + cron batch upsert to Postgres

Purpose

Avoid per-request database upserts on the image-serving hot path while still aggregating usage totals into usage_record.

Write Path

On successful image responses:

  • Request path records usage increments into Redis hash buckets by minute.
  • Bucket key format: usage:buffer:minute:{YYYYMMDDHHmm} (UTC minute).
  • Hash fields:
    • {projectId}|{apiKeyId}|req
    • {projectId}|{apiKeyId}|bytes
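
The write path can be sketched as one helper that derives the UTC minute bucket key and increments both hash fields. The redis parameter mimics the HINCRBY and EXPIRE commands; the helper itself is illustrative, not the real usage-tracker code.

```typescript
// Derive the UTC minute bucket key, e.g. usage:buffer:minute:202603071230.
function minuteBucketKey(now: Date): string {
  const p = (n: number) => String(n).padStart(2, "0");
  const stamp =
    `${now.getUTCFullYear()}${p(now.getUTCMonth() + 1)}${p(now.getUTCDate())}` +
    `${p(now.getUTCHours())}${p(now.getUTCMinutes())}`;
  return `usage:buffer:minute:${stamp}`;
}

// Record one successful image response into the current minute bucket.
async function bufferUsage(
  redis: {
    hincrby(key: string, field: string, by: number): Promise<number>;
    expire(key: string, seconds: number): Promise<void>;
  },
  projectId: string,
  apiKeyId: string,
  bytes: number,
  now = new Date(),
): Promise<void> {
  const key = minuteBucketKey(now);
  await redis.hincrby(key, `${projectId}|${apiKeyId}|req`, 1);
  await redis.hincrby(key, `${projectId}|${apiKeyId}|bytes`, bytes);
  await redis.expire(key, 14 * 24 * 60 * 60); // 2-week safety TTL from the schema
}
```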

Flush Path

Default free-tier mode runs flush as part of /api/cron/daily-maintenance (once per day).
When plan limits allow, you can additionally schedule /api/cron/flush-usage-buffer at higher frequency.

Flush flow:

  1. Acquires lock key usage:buffer:flush:lock (SET NX EX).
  2. Scans usage:buffer:minute:*.
  3. Skips too-recent buckets (safety lag).
  4. Aggregates each bucket into (projectId, apiKeyId, date) rows.
  5. Writes a durable per-bucket flush ledger row in Postgres, then applies additive upserts into usage_record only when the bucket has not been flushed before.
  6. Deletes the Redis bucket after the transaction commits (safe to retry if cleanup is interrupted).
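
Step 3's safety lag reduces to a pure check on the bucket key's age. The 2-minute lag used here is an assumed value for illustration, not the real configured lag.

```typescript
const SAFETY_LAG_MINUTES = 2; // assumed value; the real lag may differ

// A bucket is flushable once it is old enough that no request path instance
// is still writing into it.
function isBucketFlushable(bucketKey: string, now: Date): boolean {
  // bucketKey format: usage:buffer:minute:YYYYMMDDHHmm (UTC)
  const stamp = bucketKey.slice("usage:buffer:minute:".length);
  const bucketMs = Date.UTC(
    Number(stamp.slice(0, 4)),
    Number(stamp.slice(4, 6)) - 1, // month is zero-based in Date.UTC
    Number(stamp.slice(6, 8)),
    Number(stamp.slice(8, 10)),
    Number(stamp.slice(10, 12)),
  );
  const ageMinutes = (now.getTime() - bucketMs) / 60_000;
  return ageMinutes >= SAFETY_LAG_MINUTES;
}
```

Together with the flush ledger in step 5, this makes the flush idempotent: a bucket is applied to usage_record at most once even if the worker crashes after committing and retries the same bucket.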

Consistency

This design is eventually consistent by intent:

  • Totals in usage_record can lag recent traffic.
  • The lag is bounded by cron interval + flush safety lag (daily on free tier, lower when high-frequency flush is enabled).

Why This Instead of Direct DB Writes

  • Reduces hot-row contention on usage_record unique keys.
  • Keeps image response path non-blocking.
  • Makes write amplification and DB load tunable via cron cadence.

Data Durability

Redis contains zero persistent data. Every key falls into one of five categories:

| Category | Source of Truth | If Redis Is Wiped |
| --- | --- | --- |
| Config cache | PostgreSQL | Next request refills from database (higher latency for one request) |
| Rate limit counters | None (ephemeral by nature) | Counters reset to zero (briefly allows over-limit requests) |
| Activity throttle locks | None (control flags) | Triggers one extra metadata UPDATE per key (harmless) |
| Flush lock | None (coordination flag, derived from application state/DB) | May trigger extra flush attempt; buffered increments retained |
| Usage buffer buckets | usage_record after flush | Any not-yet-flushed increments are lost |

When Redis is unavailable, rate limiting is designed to fail open (allowed: true) to preserve image-serving availability.
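
A sketch of that fail-open behavior; checkWithFailOpen is an illustrative wrapper, not the real export.

```typescript
// Fail open: if the Redis-backed check itself throws (connection refused,
// timeout), allow the request so image serving stays available.
async function checkWithFailOpen(
  check: () => Promise<{ allowed: boolean }>,
): Promise<{ allowed: boolean }> {
  try {
    return await check();
  } catch (err) {
    console.error("rate limiter unavailable, failing open:", err);
    return { allowed: true };
  }
}
```

The trade-off is deliberate: a brief window of unthrottled traffic is considered cheaper than rejecting every image request whenever Redis blips.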


| Document | Description |
| --- | --- |
| Architecture Overview | Recommended architecture reading path |
| Request Lifecycle | Runtime request path that consumes these Redis keys |
| Rate Limiting | User-facing rate limiting guide |
| Domain Whitelisting | Domain whitelist configuration |
| Security Best Practices | Security recommendations |
