Introduction
Freemium AI wrappers usually gate generation behind credits. Serverless handlers are stateless: an in-memory counter resets on every cold start and is not shared across concurrent instances. Updating a Postgres row on every decrement works but adds latency and connection-pool complexity at high fan-out.
Upstash Redis exposes Redis over HTTPS with regional endpoints — a strong fit for atomic INCRBY / DECRBY or Lua scripts that check balance and deduct in one round trip.
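A minimal sketch of a check-and-deduct script, assuming the balance is stored as an integer string at credits:{userId} and that the script is sent as an EVAL command. The script text, the helper names (buildDebitCommand, parseDebitReply), and the reply convention (new balance on success, -1 when the balance is too low) are illustrative assumptions, not part of any Upstash API. How the command array is transmitted depends on the client in use.

```typescript
// Lua source that Redis runs atomically: read the balance, compare it with
// the cost, and deduct only when the balance covers it.
// Reply convention (an assumption of this sketch): new balance, or -1 if short.
const DEBIT_SCRIPT = `
local balance = tonumber(redis.call('GET', KEYS[1]) or '0')
local cost = tonumber(ARGV[1])
if balance >= cost then
  return redis.call('DECRBY', KEYS[1], ARGV[1])
end
return -1
`;

// Build the EVAL command as an array: script, number of keys, key, argument.
function buildDebitCommand(userId: string, cost: number): string[] {
  if (!Number.isInteger(cost) || cost <= 0) {
    throw new Error("cost must be a positive integer");
  }
  return ["EVAL", DEBIT_SCRIPT, "1", `credits:${userId}`, String(cost)];
}

type DebitResult = { ok: true; balance: number } | { ok: false };

// Translate the script's integer reply into a result the handler can branch on.
function parseDebitReply(reply: number): DebitResult {
  return reply >= 0 ? { ok: true, balance: reply } : { ok: false };
}
```

Because the comparison and the DECRBY run inside one script, two concurrent requests cannot both pass the check against the same balance.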
Threat model
Attackers replay API routes with stolen cookies or leaked JWTs. The credit check must run server-side only, after session resolution, and must use keys derived from the authenticated user id, never from unverified client-supplied strings.
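One way to enforce the key-binding rule is to make the key builder accept only a verified session. The sketch below assumes the session has already been validated elsewhere; the VerifiedSession and ClientBody types and the creditKeyForRequest function are hypothetical names used for illustration.

```typescript
// Shape of a session the server has already verified (signature, expiry).
// The type and its field name are hypothetical.
interface VerifiedSession {
  userId: string;
}

// Whatever the client sent in the request body; never trusted for identity.
interface ClientBody {
  userId?: string;
  prompt: string;
}

// Derive the balance key from the verified session only. A userId in the
// body is ignored, so a caller cannot point the debit at another account.
function creditKeyForRequest(
  session: VerifiedSession | null,
  _body: ClientBody,
): string {
  if (session === null) {
    throw new Error("unauthenticated");
  }
  return `credits:${session.userId}`;
}
```

This does not stop a replay with a valid stolen session; it only guarantees that the debit lands on the account the session belongs to.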
Recommended key layout
Use a namespace prefix per product, for example:
credits:{userId} → integer balance
rate:{userId}:{minuteBucket} → request count for soft throttles
Choose TTLs for ephemeral rate keys; balance keys persist until subscription changes.
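The layout above can be sketched as a few key builders. The 60-second window and the two-window TTL are assumptions chosen for illustration, as are the function and constant names.

```typescript
// Length of one rate-limit window, in seconds (assumed value).
const RATE_WINDOW_SECONDS = 60;

// TTL for rate keys: two windows, so a key outlives its own bucket briefly
// and then expires on its own. Balance keys get no TTL.
const RATE_KEY_TTL_SECONDS = 2 * RATE_WINDOW_SECONDS;

// Persistent balance key.
function creditsKey(userId: string): string {
  return `credits:${userId}`;
}

// Index of the window that contains the given timestamp (milliseconds).
function minuteBucket(nowMs: number): number {
  return Math.floor(nowMs / (RATE_WINDOW_SECONDS * 1000));
}

// Ephemeral per-user, per-minute counter key.
function rateKey(userId: string, nowMs: number): string {
  return `rate:${userId}:${minuteBucket(nowMs)}`;
}
```

For example, a request at 120000 ms falls in bucket 2, so rateKey("u1", 120000) yields rate:u1:2.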
Operation sequence
- Authenticate the request (session JWT, Better Auth, etc.).
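- Check and debit atomically, preferably in a Lua script: if balance ≥ cost, subtract cost and return OK; otherwise return a 402-style error to the client. WATCH/MULTI/EXEC can do the same over a persistent connection, but WATCH depends on connection state, so it is a poor match for a stateless HTTPS API.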
- Only after a successful debit, call the fal.ai (or other) upstream with the server’s API key.
- On upstream failure, either refund the debit (increment the balance back) or record a compensating transaction; which one is a product decision.
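The sequence above can be sketched as a single handler. The CreditStore interface, the handleGenerate function, the status codes other than 402, and the in-memory store are assumptions made for illustration; a real store would run the atomic debit in Redis, and the refund branch shown here is only one of the two options named in the last step.

```typescript
// Minimal store interface; a real implementation would call Redis.
interface CreditStore {
  // Atomically deducts cost if the balance covers it. Resolves to the new
  // balance, or -1 when the balance is too low.
  debit(userId: string, cost: number): Promise<number>;
  // Adds credits back after a failed upstream call.
  refund(userId: string, cost: number): Promise<number>;
}

type GenerateResult =
  | { status: 200; output: string }
  | { status: 401 | 402 | 502; error: string };

async function handleGenerate(
  userId: string | null, // null when authentication failed
  cost: number,
  store: CreditStore,
  callUpstream: () => Promise<string>,
): Promise<GenerateResult> {
  // Step 1: authenticate.
  if (userId === null) return { status: 401, error: "unauthenticated" };
  // Step 2: atomic check and debit.
  const balance = await store.debit(userId, cost);
  if (balance < 0) return { status: 402, error: "insufficient credits" };
  // Step 3: call upstream only after a successful debit.
  try {
    const output = await callUpstream();
    return { status: 200, output };
  } catch {
    // Step 4: refund path.
    await store.refund(userId, cost);
    return { status: 502, error: "upstream failed; credits refunded" };
  }
}

// In-memory stand-in used only to exercise the flow. It is not shared
// across instances, so it is unsuitable for production.
function makeMemoryStore(balances: Record<string, number>): CreditStore {
  return {
    async debit(userId, cost) {
      const current = balances[userId] ?? 0;
      if (current < cost) return -1;
      balances[userId] = current - cost;
      return current - cost;
    },
    async refund(userId, cost) {
      const next = (balances[userId] ?? 0) + cost;
      balances[userId] = next;
      return next;
    },
  };
}
```

Passing the upstream call in as a function keeps the server's API key out of the credit logic and makes the refund branch easy to exercise with a function that throws.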
Relation to rate limiting
The same Redis database can back global rate limits (IP + route) separately from entitlement balances. Give the two keyspaces distinct prefixes so rate-limit keys can be cleared by pattern during debugging; a FLUSHDB would wipe the balances too.
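A fixed-window limiter keyed on IP and route might look like the sketch below. The "rl" prefix, the one-minute window, and the function names are assumptions; the incr parameter stands in for a Redis INCR call, which returns the counter value after incrementing.

```typescript
// Prefix kept distinct from "credits:" so rate-limit keys can be inspected
// or deleted by pattern without touching balances.
const RATE_LIMIT_PREFIX = "rl";

// One key per IP, route, and one-minute window.
function rateLimitKey(ip: string, route: string, nowMs: number): string {
  const bucket = Math.floor(nowMs / 60000);
  return `${RATE_LIMIT_PREFIX}:${ip}:${route}:${bucket}`;
}

// Fixed-window check: increment the window's counter and compare it with
// the limit. `incr` stands in for Redis INCR.
function allowRequest(
  incr: (key: string) => number,
  ip: string,
  route: string,
  nowMs: number,
  limit: number,
): boolean {
  return incr(rateLimitKey(ip, route, nowMs)) <= limit;
}
```

In a real deployment the counter key would also need an expiry so that old windows do not accumulate.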
Observability
Log deduction events with user id hashes, never raw emails. Dashboards should show burn rate per SKU so coordinators can spot a misconfigured credit cost.
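As a rough illustration of the dashboard figure, the sketch below computes credits burned per SKU per hour from logged deduction events. The DeductionEvent shape and the burnRatePerSku function are hypothetical; the userIdHash field is assumed to be hashed before the event is written, so no raw identifier reaches the log.

```typescript
// One logged deduction. Field names are assumptions of this sketch.
interface DeductionEvent {
  userIdHash: string; // hashed upstream; never a raw email or user id
  sku: string;
  cost: number; // credits deducted
  atMs: number; // event time, milliseconds since epoch
}

// Credits burned per SKU per hour over the window [fromMs, toMs).
function burnRatePerSku(
  events: DeductionEvent[],
  fromMs: number,
  toMs: number,
): Record<string, number> {
  const hours = (toMs - fromMs) / 3600000;
  if (hours <= 0) throw new Error("window must have positive length");

  // Sum the cost of in-window events for each SKU.
  const totals: Record<string, number> = {};
  for (const e of events) {
    if (e.atMs >= fromMs && e.atMs < toMs) {
      totals[e.sku] = (totals[e.sku] ?? 0) + e.cost;
    }
  }

  // Convert totals to an hourly rate.
  const rates: Record<string, number> = {};
  for (const sku of Object.keys(totals)) {
    rates[sku] = (totals[sku] ?? 0) / hours;
  }
  return rates;
}
```

A SKU whose rate jumps after a pricing change, with no matching rise in request volume, is a likely sign of a misconfigured credit cost.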
Disclosure
[PREPRINT] — run AIPeerReviewPublication/_review_agent/review_pipeline_main.py (or GitHub Issue submission flow) before treating Gemini review panels as authoritative.