- Per-minute rate limit — sliding window, counted per API key
- Monthly request cap — counted per org, resets on the 1st of each calendar month (UTC)
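
If you want to show users when their monthly quota comes back, the reset date can be computed locally. This is a minimal sketch: the function name `next_monthly_reset` is hypothetical, and it assumes the reset happens at 00:00 UTC on the 1st, which the text above does not state explicitly.

```python
from datetime import datetime, timezone


def next_monthly_reset(now: datetime) -> datetime:
    """Return the next 1st-of-month boundary strictly after `now`.

    Assumption: the cap resets at 00:00 UTC. `now` must be timezone-aware.
    """
    now_utc = now.astimezone(timezone.utc)
    year, month = now_utc.year, now_utc.month
    # Roll over to January of the following year after December.
    if month == 12:
        year, month = year + 1, 1
    else:
        month += 1
    return datetime(year, month, 1, tzinfo=timezone.utc)
```

For example, calling it with any timestamp in December 2024 returns 1 January 2025, 00:00 UTC.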
Plan matrix
| Plan | Monthly cap | Per-minute limit | Typical use |
|---|---|---|---|
| Free | 100 requests | 10 / min | Build and test your integration |
| Starter | 10,000 | 60 / min | Small production app |
| Pro | 100,000 | 300 / min | Real production traffic |
| Enterprise | Custom | Custom | Email support@medresolve |
When you hit the limit
Both caps return HTTP 429 with a machine-readable detail. The Retry-After header gives the number of
seconds until the oldest in-window request ages out. Respect it.
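
One way to respect Retry-After on the client side is a small retry wrapper. The sketch below is transport-agnostic: `send` is a hypothetical callable you supply that performs the request and returns `(status, headers)`. The helper names, the default wait of 1 second, and the attempt limit are all assumptions. It does not inspect the detail body, so it cannot tell a per-minute rejection from a monthly-cap rejection; retrying will not help with the latter until the cap resets.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, response headers)


def retry_after_seconds(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Parse Retry-After as a number of seconds; fall back to `default`."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(0.0, float(value))
            except ValueError:
                return default
    return default


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send`, waiting out HTTP 429 responses for up to `max_attempts` tries."""
    response = send()
    for _ in range(max_attempts - 1):
        status, headers = response
        if status != 429:
            break
        # Wait for as long as the server asked before trying again.
        sleep(retry_after_seconds(headers))
        response = send()
    return response
```

Passing `sleep` as a parameter keeps the wrapper easy to test without real delays.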
How enforcement works
- Rate limit is enforced per container (we run on Cloud Run). Each container keeps its own counter, so the effective global limit can reach your plan limit × number of concurrent containers. For stricter guarantees, reach out: we can move to a Redis-backed limiter.
- Monthly cap is enforced per org and cached 60s. You may slightly overshoot during that cache window, but only by a handful of requests. We honor the cap strictly once the cache refreshes.
- Only 2xx responses count against your cap. Failed auth, rate-limit rejects, and validation errors don’t consume quota.
- Demo/anonymous requests don’t count (when key auth is disabled in dev — not applicable to production).
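
To make the per-container behavior concrete, here is a minimal sliding-window sketch for a single API key. It is an illustration under stated assumptions, not the service's actual limiter: the class `SlidingWindowLimiter` and the helper `effective_global_limit` are hypothetical names, and the in-memory timestamp log is just one common way to build a sliding window.

```python
from collections import deque
from typing import Deque, Optional


class SlidingWindowLimiter:
    """In-memory sliding-window log: at most `limit` requests per `window` seconds."""

    def __init__(self, limit: int, window: float = 60.0) -> None:
        self.limit = limit
        self.window = window
        self.timestamps: Deque[float] = deque()

    def check(self, now: float) -> Optional[float]:
        """Return None if a request at `now` is allowed, else seconds to wait."""
        # Drop requests that have aged out of the window.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return None
        # Rejected: the wait is the time until the oldest in-window request ages out.
        return self.timestamps[0] + self.window - now


def effective_global_limit(plan_limit: int, containers: int) -> int:
    """Worst-case requests per minute when every container counts independently."""
    return plan_limit * containers
```

Because each container would hold its own instance of such a limiter, a Free-plan key (10 / min) served by 3 concurrent containers could see up to `effective_global_limit(10, 3)`, that is 30 requests per minute, in the worst case.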
Monitoring your usage
- Portal dashboard: real-time “X of Y used this month” progress bar at Dashboard → Usage
- Admin API: `GET /admin/plan` and `GET /admin/usage` are Clerk-authenticated endpoints consumed by the portal; you can hit them from your own infra if you need to script usage reporting
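
If you script usage reporting, a sketch along these lines may be a starting point. Several things here are assumptions rather than documented behavior: the base URL is a placeholder, the Clerk token is assumed to be accepted as a Bearer token in the Authorization header, and the response is assumed to be JSON. The helper names are hypothetical, and no response fields are assumed.

```python
import json
import urllib.request


def build_admin_request(base_url: str, path: str, token: str) -> urllib.request.Request:
    """Build a GET request for an admin endpoint such as /admin/usage.

    Assumption: the Clerk-issued token is sent as a Bearer token.
    """
    url = base_url.rstrip("/") + path
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def fetch_json(request: urllib.request.Request, timeout: float = 10.0):
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

A reporting job could call `fetch_json(build_admin_request(base_url, "/admin/usage", token))` on a schedule and log the raw result until you have confirmed the response shape.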