Redis Schema
Architecture documentation — Redis key schema, cache-aside pattern, sliding window rate limiting, write throttling, and data durability.
The caching layer uses Redis for four purposes: configuration caching, rate limiting, write throttling, and usage buffering/flush coordination. All keys have a TTL — nothing persists indefinitely.
Quick Reader Guide
If you are not familiar with this stack, read the tables with this mental model:
- Key prefix: groups related Redis keys (similar to a namespace).
- TTL: auto-expiry time; Redis deletes the key after this window.
- Source of truth: where durable/canonical data lives (usually Postgres).
- Control flag/lock key: coordination-only state; losing it usually affects timing, not correctness.
For the user-facing rate limiting guide (behavior, configuration, tuning), see Rate Limiting.
Key Schema Overview
All Redis keys use a structured prefix convention. Every key has a TTL — no key persists indefinitely.
| Prefix | Purpose | TTL | Example |
|---|---|---|---|
| cache:project:id: | Project config cache (by ID) | 60s | cache:project:id:uuid-123 |
| cache:project:slug: | Project config cache (by slug) | 60s | cache:project:slug:my-blog |
| cache:project:team-slug: | Team+Project config cache | 60s | cache:project:team-slug:acme/my-blog |
| cache:apikey:pk: | API key config cache | 60s | cache:apikey:pk:pk_abc123 |
| ratelimit:ipx:minute: | Per-minute rate limit counter | ~60s | ratelimit:ipx:minute:pk_abc123 |
| ratelimit:ipx:day: | Per-day rate limit counter | ~24h | ratelimit:ipx:day:pk_abc123 |
| usage:apikey: | API key write throttle lock | 30s | usage:apikey:uuid-123 |
| usage:project: | Project write throttle lock | 30s | usage:project:uuid-456 |
| usage:buffer:minute: | Buffered usage counters (minute bucket hash) | 14 days | usage:buffer:minute:202603071230 |
| usage:buffer:flush:lock | Flush worker lock | ~55s | usage:buffer:flush:lock |
Configuration Cache (Cache-Aside)
File: src/server/lib/config-cache.ts
Pattern: Cache-Aside (Lazy Population)
Cached Data
Every image request requires two database lookups before processing:
- ProjectConfig — project ID, slug, team ID, allowed referrer domains (HTTP Referer header), allowed source domains
- ApiKeyConfig — key ID, secret key (stored encrypted in cache, decrypted on read), expiration, revocation status, rate limits
Read Flow
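The cache-aside read can be sketched as follows. This is a minimal sketch, not the actual config-cache.ts code: `readThrough` and `FakeRedis` are illustrative names, and an in-memory store stands in for Redis so the flow is self-contained. The sentinel and TTL constants mirror the values described in this document.

```typescript
const NOT_FOUND = "__NOT_FOUND__"; // sentinel for negative caching
const CACHE_TTL_SECONDS = 60;
const NEGATIVE_CACHE_TTL_SECONDS = 10;

// In-memory stand-in for a Redis client with GET / SET EX semantics.
class FakeRedis {
  private store = new Map<string, { value: string; expiresAt: number }>();
  get(key: string): string | null {
    const entry = this.store.get(key);
    if (!entry || entry.expiresAt <= Date.now()) return null; // expired = miss
    return entry.value;
  }
  setEx(key: string, ttlSeconds: number, value: string): void {
    this.store.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
  }
}

// Cache-aside: try Redis first, fall back to the database on a miss,
// then repopulate the cache. A miss on a non-existent row caches the
// NOT_FOUND sentinel with the shorter negative TTL.
function readThrough<T>(
  redis: FakeRedis,
  key: string,
  loadFromDb: () => T | null,
): T | null {
  const hit = redis.get(key);
  if (hit === NOT_FOUND) return null; // negative-cache hit, skip the DB
  if (hit !== null) return JSON.parse(hit) as T; // positive-cache hit
  const row = loadFromDb(); // cache miss: go to the database
  if (row === null) {
    redis.setEx(key, NEGATIVE_CACHE_TTL_SECONDS, NOT_FOUND);
    return null;
  }
  redis.setEx(key, CACHE_TTL_SECONDS, JSON.stringify(row));
  return row;
}
```

A second read of the same key within the TTL is served entirely from the cache, so the database sees at most one lookup per key per TTL window.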
TTL Values
- Positive cache: 60 seconds (CACHE_TTL_SECONDS)
- Negative cache: 10 seconds (NEGATIVE_CACHE_TTL_SECONDS)

A shorter TTL gives fresher data but more database queries. A longer TTL gives better hit rates but delays propagation of changes. 60 seconds balances both — admin operations bypass this via active invalidation, so the TTL only matters for the rare case where invalidation fails.
Cache Key Formats
| Data | Key Format | Example |
|---|---|---|
| Project (by ID) | cache:project:id:{projectId} | cache:project:id:uuid-123 |
| Project (by slug) | cache:project:slug:{slug} | cache:project:slug:my-blog |
| Project (by team + slug) | cache:project:team-slug:{teamSlug}/{projectSlug} | cache:project:team-slug:acme/my-blog |
| API Key (by public key) | cache:apikey:pk:{publicKey} | cache:apikey:pk:pk_abc123 |
Negative Caching
When a lookup returns no result (project not found / key not found), a sentinel value (__NOT_FOUND__) is cached with a shorter TTL of 10 seconds. This prevents repeated requests for non-existent slugs or public keys (e.g. probing attacks) from hitting the database on every request.
The shorter TTL ensures that when a resource is later created, it becomes visible within 10 seconds — much faster than the full 60-second positive cache TTL. Active invalidation also clears negative cache entries immediately.
Invalidation
Cache entries are removed in two ways:
- Automatic — Redis evicts the key after TTL expires.
- Manual (immediate) — Dashboard mutations call invalidation functions:
| Dashboard Action | Invalidation Function | Effect |
|---|---|---|
| Update project settings | invalidateProjectCache(slug, projectId?) | Deletes all cache entries for that project (slug, ID, team+slug) |
| Delete project | invalidateProjectCache(slug, projectId?) + invalidate all project API keys | Clears project and all associated key caches |
| Revoke / update API key | invalidateApiKeyCache(publicKey) | Deletes the cache entry for that public key |
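The key set that project invalidation must delete can be sketched as a pure helper. This is illustrative only — the real invalidateProjectCache in config-cache.ts talks to Redis directly, and `projectCacheKeys` is a hypothetical name:

```typescript
// Derive every cache key that could hold this project's config,
// matching the three key formats in the table above.
function projectCacheKeys(
  slug: string,
  projectId?: string,
  teamSlug?: string,
): string[] {
  const keys = [`cache:project:slug:${slug}`];
  if (projectId) keys.push(`cache:project:id:${projectId}`);
  if (teamSlug) keys.push(`cache:project:team-slug:${teamSlug}/${slug}`);
  return keys;
}
```

Deleting all three shapes in one call is what makes a settings update visible immediately, regardless of which lookup path the next request uses.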
Serialization and Secret Handling
ApiKeyConfig requires special handling for Redis storage:
- Secret keys are stored in their encrypted form (encryptedSecretKey) in Redis — never as plaintext — and are only decrypted via decryptApiKey() after retrieval from cache. This prevents leaking secrets to the Redis layer.
- Date fields (expiresAt, revokedAt) are stored as ISO 8601 strings since JSON has no native Date type, and converted back to Date objects on read.
A dedicated CachedApiKeyConfig type enforces both transformations at compile time.
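The two transformations can be sketched with simplified types. The interfaces below only carry the fields this section discusses (the real ApiKeyConfig has more), and the function names are illustrative:

```typescript
// Simplified shapes: the in-process type uses Date objects,
// the Redis-cached type uses ISO 8601 strings.
interface ApiKeyConfigLike {
  encryptedSecretKey: string; // stays encrypted in Redis
  expiresAt: Date | null;
  revokedAt: Date | null;
}
interface CachedApiKeyConfigLike {
  encryptedSecretKey: string;
  expiresAt: string | null; // ISO 8601
  revokedAt: string | null; // ISO 8601
}

// Serialize for Redis: dates become ISO strings, the secret stays encrypted.
function toCached(c: ApiKeyConfigLike): CachedApiKeyConfigLike {
  return {
    encryptedSecretKey: c.encryptedSecretKey,
    expiresAt: c.expiresAt ? c.expiresAt.toISOString() : null,
    revokedAt: c.revokedAt ? c.revokedAt.toISOString() : null,
  };
}

// Revive on read: ISO strings become Date objects again.
function fromCached(c: CachedApiKeyConfigLike): ApiKeyConfigLike {
  return {
    encryptedSecretKey: c.encryptedSecretKey,
    expiresAt: c.expiresAt ? new Date(c.expiresAt) : null,
    revokedAt: c.revokedAt ? new Date(c.revokedAt) : null,
  };
}
```

Because the cached type declares string fields, forgetting either conversion fails at compile time rather than producing a stringly-typed Date at runtime.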
Why Revoked Keys Are Cached
The database query in getApiKeyConfig deliberately does not filter by revokedAt IS NULL. This ensures that when the cache refreshes from the database, the revokedAt timestamp is present in the cached entry. The route handler then performs a defense-in-depth check: if apiKey.revokedAt is set, the request is rejected with 401.
Without this, the filter would cause revoked keys to return null from the database — indistinguishable from a non-existent key — making the route handler's revocation check dead code.
Active invalidation via invalidateApiKeyCache remains the primary revocation mechanism; caching revokedAt acts as a safety net for stale entries.
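The defense-in-depth check described above can be sketched as a small guard. This is an illustrative reduction of the route handler logic, not its actual code; the expiry branch is an assumption about how an expiresAt field would naturally be handled alongside revocation:

```typescript
// Returns the HTTP status the handler would use for this key.
// A revoked key is rejected even when it was served from a stale cache entry,
// because revokedAt is deliberately kept in the cached row.
function authorize(
  apiKey: { revokedAt: Date | null; expiresAt: Date | null } | null,
  now: Date = new Date(),
): number {
  if (apiKey === null) return 401; // unknown (or negatively cached) key
  if (apiKey.revokedAt) return 401; // revoked: defense-in-depth check
  if (apiKey.expiresAt && apiKey.expiresAt <= now) return 401; // expired
  return 200;
}
```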
Rate Limiting (Sliding Window)
File: src/server/lib/rate-limiter.ts
Pattern: Sliding Window Counter (via @upstash/ratelimit)
Implementation Flow
Redis Key Prefixes
| Layer | Redis Key Prefix | Window |
|---|---|---|
| Per-day | ratelimit:ipx:day: | 24 hours |
| Per-minute | ratelimit:ipx:minute: | 1 minute |
Check Order Rationale
The per-day limit is checked first. This is deliberate: Upstash's .limit() is a consume-and-check operation — it decrements the counter atomically before returning the result. If the minute limit were checked first, a successful minute check would consume a minute token; a subsequent day rejection would block the request, but the minute token is already spent. Checking the day limit first inverts the problem: if day fails, no minute token is consumed.
Note: Upstash also provides a non-consuming getRemaining() method that can query remaining tokens without decrementing. This could be used as a pre-check to avoid any token waste, at the cost of an extra Redis round trip per request. The current order-swap approach avoids this overhead while eliminating the most impactful waste scenario.
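The ordering argument can be made concrete with a toy limiter. Like Upstash's `.limit()`, `consume()` spends a token before reporting the result; everything else (fixed counters, no Redis, no sliding window) is a deliberate simplification:

```typescript
// Toy consume-and-check limiter: consuming and checking are one operation.
class ToyLimiter {
  constructor(public remaining: number) {}
  consume(): boolean {
    if (this.remaining <= 0) return false;
    this.remaining--;
    return true;
  }
}

// Day-first ordering (the current design): a day rejection short-circuits
// before any minute token is spent.
function checkDayFirst(day: ToyLimiter, minute: ToyLimiter): boolean {
  if (!day.consume()) return false;
  return minute.consume();
}

// Minute-first ordering (the problem case): the minute token is already
// spent by the time the day limit rejects the request.
function checkMinuteFirst(day: ToyLimiter, minute: ToyLimiter): boolean {
  if (!minute.consume()) return false;
  return day.consume();
}
```

With an exhausted day limit, day-first leaves the minute counter untouched, while minute-first burns one minute token per rejected request.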
Default Fallback Values
Rate limits are per API key and stored in the database. When null, fallback values are applied in config-cache.ts:
```ts
rateLimitPerMinute: apiKey.rateLimitPerMinute ?? 60,    // default fallback
rateLimitPerDay: apiKey.rateLimitPerDay ?? 10000,       // default fallback
```

Instance Lifecycle
Ratelimit instances are lightweight configuration objects — they hold a reference to the shared Redis client, the window algorithm, and a key prefix. They are stateless: all counters live in Redis.
In the current implementation, instances are cached in-memory by limit value:
- minuteLimiterCache: Map<number, Ratelimit>
- dayLimiterCache: Map<number, Ratelimit>
This avoids repeated object construction when many requests use the same effective limits.
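The lazy-construction pattern behind both maps can be sketched generically. `createLimiterCache` and `makeLimiter` are illustrative names; `makeLimiter` stands in for the Ratelimit constructor:

```typescript
// One limiter instance per distinct limit value, constructed on first use.
// Safe precisely because instances are stateless configuration objects:
// all counters live in Redis, so sharing an instance shares no state.
function createLimiterCache<T>(makeLimiter: (limit: number) => T) {
  const cache = new Map<number, T>();
  return (limit: number): T => {
    let limiter = cache.get(limit);
    if (limiter === undefined) {
      limiter = makeLimiter(limit);
      cache.set(limit, limiter);
    }
    return limiter;
  };
}
```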
Resetting Counters
For testing or administrative purposes, use the resetRateLimit function:
```ts
import { resetRateLimit } from "@/server/lib/rate-limiter";

await resetRateLimit("pk_abc123");
```

This clears the sliding window counters in Redis for the specified key prefix.
Analytics
Rate limit instances are created with analytics: true, which sends usage metrics to the Upstash dashboard. You can view rate limit hit/miss statistics at console.upstash.com.
Usage Activity Throttling (SET NX / "set if absent")
File: src/server/lib/usage-tracker.ts
Pattern: Distributed Lock via SET NX (Set-If-Not-Exists)
SET ... NX EX 30 means "create this key only if it does not already exist, and auto-expire it after 30 seconds."
Implementation Flow
Atomicity Guarantee
SET key value NX EX 30 is atomic in Redis. Even if 10 serverless instances execute it simultaneously for the same key, exactly one succeeds:
```
T=0:
  Instance A: SET usage:apikey:abc "1" NX EX 30 → "OK"  (writes DB)
  Instance B: SET usage:apikey:abc "1" NX EX 30 → null  (skips)
  Instance C: SET usage:apikey:abc "1" NX EX 30 → null  (skips)

T=30 (key expires):
  Instance D: SET usage:apikey:abc "1" NX EX 30 → "OK"  (writes DB)
```

Why SET NX Instead of In-Memory Batching
A previous design used an in-memory Map with setTimeout to batch writes every 5 seconds. This fails in serverless because:
- setTimeout does not fire if the container is recycled before the timer
- Pending updates in the Map are permanently lost on container recycle
- Multiple instances maintain separate Maps, limiting deduplication effectiveness
SET NX EX solves all three problems: it is a single atomic Redis command with no dependency on timers, container lifecycle, or instance-local state.
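The lock semantics can be modeled in memory to make the timeline above executable. `NxStore` is a toy with a manual clock; the real code issues a single atomic Redis command instead:

```typescript
// In-memory model of SET key value NX EX ttl: the set succeeds only if the
// key is absent or expired, and the key auto-expires after ttlSeconds.
class NxStore {
  private store = new Map<string, number>(); // key -> expiry time (seconds)
  private now = 0; // manual clock, in seconds

  tick(seconds: number): void {
    this.now += seconds;
  }

  setNxEx(key: string, ttlSeconds: number): boolean {
    const expiresAt = this.store.get(key);
    if (expiresAt !== undefined && expiresAt > this.now) {
      return false; // key still alive: another instance holds the throttle
    }
    this.store.set(key, this.now + ttlSeconds); // acquire (or re-acquire)
    return true;
  }
}
```

Only the caller that receives `true` performs the database write; everyone else skips it until the 30-second window elapses.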
Execution Model
Write throttling runs as fire-and-forget — it does not await the result and does not block the image response. Errors are caught and logged to console.error but never propagated to the caller.
Usage Metering Buffer + Flush
Files: src/server/lib/usage-tracker.ts, src/app/api/cron/flush-usage-buffer/route.ts
Pattern: Request-path Redis buffering + cron batch upsert to Postgres
Purpose
Avoid per-request database upserts on the image-serving hot path while still aggregating usage totals into usage_record.
Write Path
On successful image responses:
- Request path records usage increments into Redis hash buckets by minute.
- Bucket key format: usage:buffer:minute:{YYYYMMDDHHmm} (UTC minute).
- Hash fields: {projectId}|{apiKeyId}|req and {projectId}|{apiKeyId}|bytes
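The key and field formats can be sketched as pure helpers. The function names are illustrative; the real write path increments these hash fields with HINCRBY:

```typescript
// Build the UTC minute-bucket key, e.g. usage:buffer:minute:202603071230.
function minuteBucketKey(d: Date): string {
  const pad = (n: number) => String(n).padStart(2, "0");
  return (
    "usage:buffer:minute:" +
    `${d.getUTCFullYear()}${pad(d.getUTCMonth() + 1)}${pad(d.getUTCDate())}` +
    `${pad(d.getUTCHours())}${pad(d.getUTCMinutes())}`
  );
}

// Build a hash field name inside a bucket, e.g. "uuid-123|uuid-456|req".
function hashField(
  projectId: string,
  apiKeyId: string,
  metric: "req" | "bytes",
): string {
  return `${projectId}|${apiKeyId}|${metric}`;
}
```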
Flush Path
Default free-tier mode runs flush as part of /api/cron/daily-maintenance (once per day).
When plan limits allow, you can additionally schedule /api/cron/flush-usage-buffer at higher frequency.
Flush flow:
- Acquires lock key usage:buffer:flush:lock (SET NX EX).
- Scans usage:buffer:minute:*.
- Skips too-recent buckets (safety lag).
- Aggregates each bucket into (projectId, apiKeyId, date) rows.
- Writes a durable per-bucket flush ledger row in Postgres, then applies additive upserts into usage_record only when the bucket has not been flushed before.
- Deletes the Redis bucket after the transaction commits (safe to retry if cleanup is interrupted).
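The aggregation step can be sketched as a pure fold over one bucket's hash contents. Names are illustrative; the date component of each row would come from the bucket key, and the ledger write and upsert are omitted:

```typescript
interface UsageRow {
  projectId: string;
  apiKeyId: string;
  req: number;
  bytes: number;
}

// Fold the hash fields of one minute bucket into per-(projectId, apiKeyId)
// rows. Field names follow the "{projectId}|{apiKeyId}|{metric}" convention.
function aggregateBucket(fields: Record<string, number>): UsageRow[] {
  const rows = new Map<string, UsageRow>();
  for (const [field, value] of Object.entries(fields)) {
    const [projectId, apiKeyId, metric] = field.split("|");
    const id = `${projectId}|${apiKeyId}`;
    const row = rows.get(id) ?? { projectId, apiKeyId, req: 0, bytes: 0 };
    if (metric === "req") row.req += value;
    else if (metric === "bytes") row.bytes += value;
    rows.set(id, row);
  }
  return [...rows.values()];
}
```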
Consistency
This design is eventually consistent by intent:
- Totals in usage_record can lag recent traffic.
- The lag is bounded by cron interval + flush safety lag (daily on free tier, lower when high-frequency flush is enabled).
Why This Instead of Direct DB Writes
- Reduces hot-row contention on usage_record unique keys.
- Keeps the image response path non-blocking.
- Makes write amplification and DB load tunable via cron cadence.
Data Durability
Redis contains zero persistent data. Every key falls into one of five categories:
| Category | Source of Truth | If Redis Is Wiped |
|---|---|---|
| Config cache | PostgreSQL | Next request refills from database (higher latency for one request) |
| Rate limit counters | None (ephemeral by nature) | Counters reset to zero (briefly allows over-limit requests) |
| Activity throttle locks | None (control flags) | Triggers one extra metadata UPDATE per key (harmless) |
| Flush lock | None (coordination flag, derived from application state/DB) | May trigger extra flush attempt; buffered increments retained. |
| Usage buffer buckets | usage_record after flush | Any not-yet-flushed increments are lost |
When Redis is unavailable, rate limiting is designed to fail open (allowed: true) to preserve image-serving availability.
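The fail-open behavior can be sketched as a wrapper around any limiter call. This is a hedged sketch, not the rate-limiter.ts code: `limitFailOpen` and the `degraded` flag are illustrative names:

```typescript
// Run a limiter check, but never let a Redis outage block image serving:
// any error degrades to "allowed" and is logged rather than propagated.
async function limitFailOpen(
  check: () => Promise<{ allowed: boolean }>,
): Promise<{ allowed: boolean; degraded: boolean }> {
  try {
    const result = await check();
    return { allowed: result.allowed, degraded: false };
  } catch (err) {
    console.error("rate limiter unavailable, failing open", err);
    return { allowed: true, degraded: true };
  }
}
```

The trade-off is explicit: during an outage, over-limit requests are briefly served rather than all requests being rejected.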
Related Documentation
| Document | Description |
|---|---|
| Architecture Overview | Recommended architecture reading path |
| Request Lifecycle | Runtime request path that consumes these Redis keys |
| Rate Limiting | User-facing rate limiting guide |
| Domain Whitelisting | Domain whitelist configuration |
| Security Best Practices | Security recommendations |