Introduction

Freemium AI wrappers usually gate generation behind credits. Serverless handlers are stateless: in-memory counters reset every cold start. A Postgres row per decrement works but adds latency and connection-pool complexity at high fan-out.

Upstash Redis exposes Redis over HTTPS with regional endpoints — a strong fit for atomic INCRBY / DECRBY or Lua scripts that check balance and deduct in one round trip.

Threat model

Attackers replay API routes with stolen cookies or leaked JWTs. The credit check must be server-side only, after session resolution, and must use keys bound to the authenticated user id, not client-supplied strings without verification.

Use a namespace prefix per product, for example:

credits:{userId} → integer balance
rate:{userId}:{minuteBucket} → request count for soft throttles

Choose TTLs for ephemeral rate keys; balance keys persist until subscription changes.

Operation sequence

  1. Authenticate the request (session JWT, Better Auth, etc.).
  2. WATCH/MULTI/EXEC or a Lua script: if balance ≥ cost, subtract cost and return OK; else return 402-style error to the client.
  3. Only after a successful debit, call the fal.ai (or other) upstream with the server’s API key.
  4. On upstream failure, implement a refund path (increment back) or mark a compensating transaction — product decision.

Relation to rate limiting

The same Redis database can back global rate limits (IP + route) separate from entitlement balances. Keep keyspaces distinct to avoid accidental FLUSHDB during debugging.

Observability

Log deduction events with user id hashes, never raw emails. Dashboards should show burn rate per SKU so coordinators can spot a misconfigured credit cost.

Disclosure

[PREPRINT] — run AIPeerReviewPublication/_review_agent/review_pipeline_main.py (or GitHub Issue submission flow) before treating Gemini review panels as authoritative.