API Reference

Retrieve

Retrieve the most relevant cited passages for a query. This is the core endpoint — it runs semantic search over the corpus, reranks, and returns source-attributed chunks ready to feed to your model.

POST /v1/retrieve

Send a natural-language query with optional filters and retrieval options. The response is structured JSON — never a generated answer.

Request body

query (string, required)

The natural-language search query, e.g. “gross margin guidance for next quarter”. Must be 1–2000 characters; an empty or longer query is rejected with 400.

filters (object, optional)

Optional scoping by company and period. See filters fields below.

top_k (number, optional)

Number of chunks to return, an integer 1–50. Values outside 1–50 (or non-integers) are rejected with 400. Within that range, the value is additionally clamped down to your plan’s maximum (Free 10, Pro 25, Enterprise 50) — e.g. a Free-plan request for 40 returns 10. Defaults to 10.

rerank (boolean, optional)

Controls relevance reranking. When true (default), results are semantically reranked over a wider candidate pool and recency-boosted so the most relevant and recent passages rank first. Set false to skip reranking entirely and return chunks in raw vector-similarity order — faster, and useful when you want to apply your own ranking.

include_segments (boolean, optional)

When true, each chunk’s source includes a segments array — the underlying document segments with character offsets — so you can map a citation back to an exact span for highlighting. Defaults to false.
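The top_k rules above (reject values outside 1–50, then clamp to the plan maximum) can be mirrored client-side so you know how many chunks to expect. This is a minimal sketch, not the server's implementation: the function name and the lowercase plan keys are hypothetical, and only the limits come from the description above.

```python
# Plan maximums and the default, taken from the top_k description.
# The lowercase plan keys are an illustrative assumption.
PLAN_MAX_TOP_K = {"free": 10, "pro": 25, "enterprise": 50}
DEFAULT_TOP_K = 10


def effective_top_k(requested, plan):
    """Return the number of chunks the API is expected to return.

    Raises ValueError for values the API would reject with 400.
    """
    if requested is None:
        requested = DEFAULT_TOP_K
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(requested, bool) or not isinstance(requested, int):
        raise ValueError("top_k must be an integer")
    if not 1 <= requested <= 50:
        raise ValueError("top_k must be between 1 and 50")
    # Within the valid range, the value is clamped down to the plan maximum.
    return min(requested, PLAN_MAX_TOP_K[plan])
```

For example, effective_top_k(40, "free") gives 10, matching the Free-plan case described above.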

filters

tickers (string[], optional)

Restrict to one or more US tickers, e.g. ["AAPL"] or ["AMD", "NVDA"]. At most 25 tickers, each up to 10 characters.

year (number, optional)

Fiscal year, e.g. 2025. Must be 2000–2100.

quarter ("Q1" | "Q2" | "Q3" | "Q4", optional)

Fiscal quarter. Combine with year for a precise period.
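If you want to catch a bad filters object before it costs a round trip, the limits above can be checked locally. The sketch below is illustrative only: validate_filters is a hypothetical helper, and treating year as an integer is an assumption (the field is documented as a number).

```python
VALID_QUARTERS = ("Q1", "Q2", "Q3", "Q4")


def validate_filters(filters):
    """Return a list of problems found in a filters object (empty if none)."""
    problems = []

    tickers = filters.get("tickers")
    if tickers is not None:
        if not isinstance(tickers, list) or len(tickers) > 25:
            problems.append("tickers must be a list of at most 25 entries")
        elif any(not isinstance(t, str) or len(t) > 10 for t in tickers):
            problems.append("each ticker must be a string of up to 10 characters")

    year = filters.get("year")
    if year is not None:
        # Assumption: year is an integer; bool is excluded because it is an int subclass.
        is_int = isinstance(year, int) and not isinstance(year, bool)
        if not is_int or not 2000 <= year <= 2100:
            problems.append("year must be between 2000 and 2100")

    quarter = filters.get("quarter")
    if quarter is not None and quarter not in VALID_QUARTERS:
        problems.append("quarter must be one of Q1, Q2, Q3, Q4")

    return problems
```

Returning a list rather than raising on the first problem lets you report every issue at once.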

Example request

request.json

{
  "query": "What did management say about data center capex?",
  "filters": {
    "tickers": ["NVDA"],
    "year": 2025,
    "quarter": "Q1"
  },
  "top_k": 8,
  "rerank": true,
  "include_segments": false
}
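A request like the one above could be assembled with Python's standard library as follows. This is a sketch under stated assumptions: the base URL is a placeholder and the bearer-token Authorization header is assumed, since neither is specified in this section; check the authentication docs for the real scheme.

```python
import json
import urllib.request


def build_retrieve_request(base_url, api_key, body):
    """Build (but do not send) a POST /v1/retrieve request."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/retrieve",
        data=data,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; not confirmed by this page.
            "Authorization": f"Bearer {api_key}",
        },
    )


body = {
    "query": "What did management say about data center capex?",
    "filters": {"tickers": ["NVDA"], "year": 2025, "quarter": "Q1"},
    "top_k": 8,
}
# "https://api.example.com" is a placeholder, not the real host.
request = build_retrieve_request("https://api.example.com", "YOUR_API_KEY", body)
```

To send it, pass the request to urllib.request.urlopen and decode the JSON response body.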

Response

Returns a chunks array and a meta object. Each chunk is a passage with provenance you can cite directly.

chunk

id (string, required)

Stable identifier within this response, e.g. chunk_01. Useful for citation markers.

text (string, required)

The full passage text. Feed this to your model.

score (number, optional)

Relevance score (cosine distance — lower is more similar).

evidenceText (string, optional)

The one-to-three sentence span most relevant to the query, for highlighting.

source (object, required)

Document provenance — see fields below.

source

documentId (string | null, optional)

Stable identifier for the source document, when available. Use it with the Documents API to fetch the full document or its segments.

documentTitle (string, required)

Human-readable document title.

documentType (string, required)

Currently "earnings_call".

ticker (string | null, optional)

Company ticker.

year (number | null, optional)

Fiscal year of the document.

quarter (string | null, optional)

Fiscal quarter, e.g. "Q1".

filingType (string | null, optional)

Reserved for future document types; null for transcripts.

sourceUrl (string | null, optional)

Link to the primary source document.

pageNumbers (number[], optional)

Page numbers the chunk spans, for paginated source types. Omitted for earnings-call transcripts (which have no page structure).

segments (object[], optional)

Present only when include_segments is true. The document segments behind this chunk, each with id, sequence, content, and charStart/charEnd offsets for highlighting. See below.

source.segments

Returned per chunk only when you pass include_segments: true. Each entry is a contiguous segment of the source document.

id (string, required)

Segment identifier within the document.

sequence (number, required)

Zero-based position of the segment within the document, in reading order.

content (string, required)

The segment text.

charStart (number, required)

Start character offset of this segment within the reconstructed document text.

charEnd (number, required)

End character offset (exclusive) of this segment.

Example response

response.json

{
  "chunks": [
    {
      "id": "chunk_01",
      "text": "Data center revenue grew 25% sequentially to a record $22.6 billion, driven by Hopper demand...",
      "score": 0.18,
      "evidenceText": "Data center revenue grew 25% sequentially to a record $22.6 billion.",
      "source": {
        "documentId": "doc_8f2a1c",
        "documentTitle": "NVIDIA Q1 2025 Earnings Call",
        "documentType": "earnings_call",
        "ticker": "NVDA",
        "year": 2025,
        "quarter": "Q1",
        "filingType": null,
        "sourceUrl": null
      }
    }
  ],
  "meta": {
    "total": 8,
    "periodMismatch": null,
    "requestId": "req_a1b2c3"
  }
}
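Because the response carries passages rather than an answer, a typical next step is to turn the chunks into prompt context with citation markers. The helper below is a hypothetical sketch of one way to do that; the marker format is an illustrative choice, not something the API prescribes.

```python
def format_context(response):
    """Render chunks as cite-able context blocks for a model prompt."""
    blocks = []
    for chunk in response["chunks"]:
        source = chunk["source"]
        # Use the chunk id as the citation marker, followed by the document title.
        header = f"[{chunk['id']}] {source['documentTitle']}"
        blocks.append(header + "\n" + chunk["text"])
    # Separate blocks with a blank line so each passage stands alone.
    return "\n\n".join(blocks)
```

The model can then cite a passage by its marker, such as [chunk_01], and you can map the marker back to the chunk's source fields.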

Highlighting with segments

Pass include_segments: true to get the underlying document segments on each chunk’s source, with character offsets you can use to highlight the exact span inside the full document.

source.segments (JSON)

"segments": [
  {
    "id": "seg_0",
    "sequence": 0,
    "content": "Thanks, and good afternoon, everyone.",
    "charStart": 0,
    "charEnd": 37
  },
  {
    "id": "seg_1",
    "sequence": 1,
    "content": "Data center revenue grew 25% sequentially to a record $22.6 billion.",
    "charStart": 39,
    "charEnd": 107
  }
]

How offsets are computed

charStart and charEnd are computed at read time by concatenating the document’s segments in sequence order, joined with a \n\n separator. They are positions into that reconstructed text — not stored columns — so use the same join when you reassemble the document, and treat charEnd as exclusive.
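The join rule can be checked against the example segments. The sketch below rebuilds the text and confirms that each segment's offsets select its own content. It assumes the list holds every segment of the document starting at sequence 0 (a chunk's segments array may cover only part of a document), and that offsets count characters the way Python string indices do; the helper names are hypothetical.

```python
SEPARATOR = "\n\n"  # the join described above


def reconstruct(segments):
    """Rebuild document text by joining segments in sequence order."""
    ordered = sorted(segments, key=lambda s: s["sequence"])
    return SEPARATOR.join(s["content"] for s in ordered)


def offsets_match(segments):
    """Check that [charStart, charEnd) selects each segment's content."""
    text = reconstruct(segments)
    return all(
        text[s["charStart"]:s["charEnd"]] == s["content"] for s in segments
    )


segments = [
    {"id": "seg_0", "sequence": 0,
     "content": "Thanks, and good afternoon, everyone.",
     "charStart": 0, "charEnd": 37},
    {"id": "seg_1", "sequence": 1,
     "content": "Data center revenue grew 25% sequentially to a record $22.6 billion.",
     "charStart": 39, "charEnd": 107},
]

document_text = reconstruct(segments)
# charEnd is exclusive, so a plain slice yields the highlighted span.
highlight = document_text[39:107]
```

The two-character gap between charEnd 37 and charStart 39 is the separator itself.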

Period fallback

If you request a specific year and quarter that isn’t in the corpus, the API does not return an empty result. Instead, it serves the nearest prior period and flags the substitution in meta.periodMismatch, because forward-looking guidance for, say, Q4 often lives in the Q3 call.

meta.periodMismatch (JSON)

"periodMismatch": {
  "requested": "Q4 2025",
  "served": ["AMD Q3 2025"],
  "message": "No document for Q4 2025 exists in the corpus. The returned chunks come from the nearest prior period(s): AMD Q3 2025..."
}

Honor the notice

When periodMismatch is present, the chunks are from a different period than you asked for. Surface this to your users and instruct your model not to present prior-period evidence as the requested period.
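One way to honor the notice is to turn periodMismatch into a line you can both show to users and prepend to the model's instructions. This is a minimal sketch: period_notice is a hypothetical helper and the wording of the notice is illustrative.

```python
def period_notice(meta):
    """Return a warning string when chunks come from a fallback period, else None."""
    mismatch = meta.get("periodMismatch")
    if not mismatch:
        # null (None) means the requested period was served as asked.
        return None
    served = ", ".join(mismatch["served"])
    return (
        f"Requested period {mismatch['requested']} is not in the corpus; "
        f"the evidence below comes from {served}. "
        "Do not present it as the requested period."
    )
```

Call it on every response and include the result in your prompt whenever it is not None.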

Errors

Errors use the standard envelope ({ success: false, error: {...} }). Common statuses: 400 (invalid request body), 401 (bad API key), 429 (rate limited — covers both per-minute rate limits and monthly-quota exhaustion), and 500 (internal/infra fault). See Errors.
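A client might use these statuses to decide whether a retry is worthwhile. The sketch below is an illustrative policy, not a documented recommendation: the helper names, attempt limit, and backoff values are assumptions. Note that a 429 caused by monthly-quota exhaustion will not clear on retry, and this sketch cannot tell the two 429 cases apart because the error object's fields are not shown here.

```python
# 400 and 401 indicate a problem with the request itself, so retrying
# the same call is not expected to help.
RETRYABLE_STATUSES = {429, 500}


def should_retry(status, attempts_made, max_attempts=3):
    """Decide whether to retry after a failed call."""
    return status in RETRYABLE_STATUSES and attempts_made < max_attempts


def backoff_seconds(attempts_made, base=1.0, cap=30.0):
    """Exponential backoff delay: base, 2*base, 4*base, ... up to cap."""
    return min(cap, base * 2 ** attempts_made)
```

Keeping the retry decision and the delay calculation as separate functions makes each easy to adjust once you know your plan's rate limits.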
