Rate Limits & Quotas

Limits are enforced per API key. Three dimensions apply: a request rate (per minute), a monthly quota (total calls), and the maximum top_k you can request per call.

How limits apply

Limits are tracked per API key, not per IP — so your throughput is your own, independent of where requests originate. Each call to POST /v1/retrieve counts against both your rate limit and your monthly quota.

Your key also has a maximum top_k you can request. If you send a top_k above your plan maximum — but at or below the absolute ceiling of 50 — the API clamps it down rather than rejecting the call. A top_k above 50 is rejected with 400 Bad Request.
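The clamping rule above can be sketched as a small client-side check. This is only an illustration of the described behavior, not the server's implementation: `resolveTopK`, `TopKOutcome`, and `ABSOLUTE_TOP_K_CEILING` are hypothetical names, and the sketch does not cover lower bounds (zero or negative values), which the text does not specify.

```typescript
// Absolute ceiling stated in the text; anything above it is rejected with 400.
const ABSOLUTE_TOP_K_CEILING = 50;

type TopKOutcome =
  | { kind: "accepted"; effectiveTopK: number }
  | { kind: "clamped"; effectiveTopK: number }
  | { kind: "rejected"; status: 400 };

// Hypothetical helper: predicts how a requested top_k would be treated
// for a key whose plan maximum is planMaxTopK.
function resolveTopK(requested: number, planMaxTopK: number): TopKOutcome {
  if (requested > ABSOLUTE_TOP_K_CEILING) {
    return { kind: "rejected", status: 400 };
  }
  if (requested > planMaxTopK) {
    // Above the plan maximum but within the ceiling: clamped, not rejected.
    return { kind: "clamped", effectiveTopK: planMaxTopK };
  }
  return { kind: "accepted", effectiveTopK: requested };
}
```

For a key with a plan maximum of 25, `resolveTopK(30, 25)` reports a clamp to 25, while `resolveTopK(51, 25)` reports a rejection. Running a check like this before sending lets you notice when fewer results will come back than you asked for.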

Plan limits

API access is paid; a key is provisioned by our team on one of the plans below. If your workload needs more, tell us your expected volume and we’ll raise the limits on your key.

Plan        Rate / min  Monthly quota  Max top_k
Free        20          1,000          10
Pro         120         100,000        25
Enterprise  600         2,000,000      50
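Because every call counts against both limits, it can help to see how the two interact. The sketch below takes the plan figures as read from the table above and computes how long sustained traffic at the full per-minute rate would take to use up a month's quota. `PlanLimits`, `PLANS`, and `minutesToExhaustQuota` are illustrative names, not part of the API.

```typescript
interface PlanLimits {
  ratePerMin: number;
  monthlyQuota: number;
  maxTopK: number;
}

// Values as read from the plan table; check them against your own key.
const PLANS = {
  Free: { ratePerMin: 20, monthlyQuota: 1_000, maxTopK: 10 },
  Pro: { ratePerMin: 120, monthlyQuota: 100_000, maxTopK: 25 },
  Enterprise: { ratePerMin: 600, monthlyQuota: 2_000_000, maxTopK: 50 },
};

// Minutes of sustained full-rate traffic before the monthly quota is used up.
function minutesToExhaustQuota(plan: PlanLimits): number {
  return plan.monthlyQuota / plan.ratePerMin;
}
```

At full rate, Free runs out in 50 minutes, Pro in about 833 minutes (roughly 14 hours), and Enterprise in about 3,333 minutes (roughly 56 hours). For steady workloads, the monthly quota is therefore usually the tighter constraint, not the per-minute rate.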

Paid access

The API is paid and there is no anonymous tier — a key is provisioned by our team. Contact us to get a key, and let us know your expected volume if you need limits raised.

When you hit a limit

Exceeding your per-minute rate or your monthly quota returns 429 Too Many Requests in the standard error envelope. (A top_k above your plan maximum is clamped down instead, as long as it’s at or below the absolute ceiling of 50; a top_k above 50 is rejected with 400 Bad Request.)

429.json

{
  "success": false,
  "error": {
    "statusCode": 429,
    "message": "Rate limit exceeded",
    "error": "Too Many Requests",
    "path": "/v1/retrieve",
    "timestamp": "2026-05-30T12:00:00.000Z"
  }
}

The message field distinguishes the two limits: a per-minute breach reads "Rate limit exceeded", while exceeding your monthly quota reads "Monthly quota exceeded". Both are 429.
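Since both cases share the same status code, a client has to inspect the message to tell them apart. A minimal sketch, assuming the envelope shape shown in the example above and the two message strings quoted in the text; `ErrorEnvelope`, `LimitKind`, and `classify429` are hypothetical names:

```typescript
// Shape of the error envelope, inferred from the 429 example above.
interface ErrorEnvelope {
  success: false;
  error: {
    statusCode: number;
    message: string;
    error: string;
    path: string;
    timestamp: string;
  };
}

type LimitKind = "rate" | "quota" | "unknown";

// Maps a parsed error envelope to the limit that was hit.
function classify429(envelope: ErrorEnvelope): LimitKind {
  if (envelope.error.statusCode !== 429) return "unknown";
  switch (envelope.error.message) {
    case "Rate limit exceeded":
      return "rate";
    case "Monthly quota exceeded":
      return "quota";
    default:
      // Unrecognized wording: let the caller decide how to proceed.
      return "unknown";
  }
}
```

The distinction matters for retries: a per-minute breach should clear after a short wait, whereas retrying a quota breach a few seconds later is unlikely to succeed.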

Handling 429s

On a 429, retry with exponential backoff and jitter. A simple approach:

retry.ts

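// Retries POST /v1/retrieve on 429, using exponential backoff with full jitter.
// Note: this retries every 429, including "Monthly quota exceeded".
async function retrieveWithRetry(
  body: unknown,
  { maxRetries = 4 }: { maxRetries?: number } = {},
): Promise<Response> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const res = await fetch("https://api.focusalpha.ai/v1/retrieve", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${process.env.FOCUSALPHA_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    });

    // Anything other than 429 (success or a different error) goes back to the caller.
    if (res.status !== 429) return res;

    // Out of retries: fail now instead of sleeping once more first.
    if (attempt === maxRetries) break;

    // Full jitter: random delay in [0, min(1s * 2^attempt, 15s)).
    const delay = Math.random() * Math.min(1000 * 2 ** attempt, 15000);
    await new Promise((r) => setTimeout(r, delay));
  }
  throw new Error("Rate limit: retries exhausted");
}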

Staying under the limit

Batch related questions where you can, cache results you’ll reuse, and scope queries with filters so each call does more useful work. If you consistently need more headroom, talk to us about a higher plan.
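As one way to act on the caching advice, here is a minimal in-memory cache with a time-to-live, keyed on the serialized request body. It is a sketch under stated assumptions, not a recommended production design: `TtlCache` and `cachedCall` are hypothetical names, the right TTL depends on how fresh your results need to be, and the `call` parameter stands in for whatever function you use to reach the API.

```typescript
// Hypothetical in-memory cache; entries expire ttlMs after they are stored.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (hit === undefined) return undefined;
    if (this.now() >= hit.expiresAt) {
      // Expired: drop the entry so the map does not keep stale results.
      this.entries.delete(key);
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Returns a cached result for an identical body, otherwise makes the call
// and stores its result. Only successful (non-throwing) calls are cached.
async function cachedCall<B, R>(
  cache: TtlCache<R>,
  body: B,
  call: (body: B) => Promise<R>,
): Promise<R> {
  const key = JSON.stringify(body);
  const cached = cache.get(key);
  if (cached !== undefined) return cached;
  const result = await call(body);
  cache.set(key, result);
  return result;
}
```

`JSON.stringify` is sensitive to property order, so two bodies with the same fields in a different order produce different keys. Build request bodies consistently, or normalize them before caching. Each cache hit is one call that counts against neither the per-minute rate nor the monthly quota.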
