Rate Limits
MCP request limits and rate limit headers
Varolio MCP enforces rate limits to ensure fair usage and protect service stability.
Limits
| Scope | Limit | Window |
|---|---|---|
| Per workspace | 600 requests | 1 minute (sliding window) |
| Per user | 60 requests | 1 minute (sliding window) |
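Both limits apply simultaneously: a request must fit within both the workspace limit and the user limit, and the server rejects it as soon as either one is exhausted.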
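To illustrate how two sliding-window limits can combine, here is a minimal sketch. It is not Varolio's actual implementation. The class name `SlidingWindowLimiter`, the key layout, and the choice to leave rejected requests uncounted are all assumptions made for the example; only the numbers (600 and 60 requests per 1-minute window) come from the table above.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60.0   # 1-minute sliding window (from the table above)
WORKSPACE_LIMIT = 600   # requests per workspace per window
USER_LIMIT = 60         # requests per user per window


class SlidingWindowLimiter:
    """Hypothetical limiter for illustration; not the server's real code."""

    def __init__(self):
        # Maps a scope key to the timestamps of its accepted requests.
        self._hits = defaultdict(deque)

    def _count(self, key, now):
        hits = self._hits[key]
        # Discard timestamps that have slid out of the window.
        while hits and now - hits[0] >= WINDOW_SECONDS:
            hits.popleft()
        return len(hits)

    def allow(self, workspace, user, now):
        """Return True if the request fits within both limits at time `now`."""
        ws_key = ("workspace", workspace)
        user_key = ("user", user)  # assumption: users are tracked globally
        if self._count(ws_key, now) >= WORKSPACE_LIMIT:
            return False
        if self._count(user_key, now) >= USER_LIMIT:
            return False
        # Assumption: only accepted requests count toward the window.
        self._hits[ws_key].append(now)
        self._hits[user_key].append(now)
        return True
```

In this sketch, a single user who sends 60 requests in a few seconds is blocked until the oldest of those requests is more than a minute old, even though the workspace as a whole is far below 600.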
Rate limit response
When you exceed the limit, the server returns HTTP 429 Too Many Requests with the following headers:
| Header | Description |
|---|---|
| `X-RateLimit-Limit` | Maximum requests allowed in the window |
| `X-RateLimit-Remaining` | Requests remaining in the current window |
| `X-RateLimit-Reset` | Unix timestamp (ms) when the window resets |
| `Retry-After` | Seconds to wait before retrying |
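The headers above use two different units: `Retry-After` is in seconds, while `X-RateLimit-Reset` is a timestamp in milliseconds. The following sketch shows one way a client might turn them into a wait time. The function name `seconds_until_retry`, the preference for `Retry-After` over `X-RateLimit-Reset`, and the 1-second fallback are assumptions for illustration, not documented behavior.

```python
import time


def seconds_until_retry(headers, now=None):
    """Return how long to wait, in seconds, based on a 429 response's headers.

    `headers` is any mapping of header names to string values.
    `now` is the current Unix time in seconds (defaults to time.time()).
    """
    now = time.time() if now is None else now
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    retry_after = lowered.get("retry-after")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))  # already in seconds
        except ValueError:
            pass  # unparseable value: fall through to the reset timestamp

    reset_ms = lowered.get("x-ratelimit-reset")
    if reset_ms is not None:
        try:
            # Convert the millisecond timestamp to seconds before subtracting.
            return max(0.0, int(reset_ms) / 1000.0 - now)
        except ValueError:
            pass

    return 1.0  # assumed fallback when neither header is usable
```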
How MCP clients handle rate limits
Most MCP clients handle 429 responses automatically by waiting and retrying. If your client doesn't retry automatically, wait for the duration specified in the Retry-After header before sending the next request.
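For a client that does not retry on its own, the wait-and-retry behavior described above could look like the sketch below. The helper `send_with_retry`, its `send` callback (which performs one request and returns a status code, headers, and body), the attempt cap, and the 1-second fallback delay are all hypothetical; adapt them to whatever HTTP or MCP library you use.

```python
import time


def send_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it returns a non-429 status or attempts run out.

    `send` is a hypothetical callable returning (status, headers, body).
    `sleep` is injectable so the delay can be stubbed out in tests.
    """
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_attempts:
            break  # do not sleep after the final failed attempt

        # Look up Retry-After case-insensitively; its value is in seconds.
        retry_after = next(
            (v for k, v in headers.items() if k.lower() == "retry-after"),
            None,
        )
        try:
            delay = max(0.0, float(retry_after))
        except (TypeError, ValueError):
            delay = 1.0  # assumed fallback if the header is missing or malformed
        sleep(delay)

    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Capping the number of attempts keeps a client from looping forever if it is persistently over the limit.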
Tips
- Batch related questions into a single prompt so the AI makes fewer tool calls
- Use specific filters (e.g., `status: "open"`) to reduce the number of follow-up queries
- The per-user limit (60/min) is the one you're most likely to hit during normal usage