Retrieve
Retrieve the most relevant cited passages for a query. This is the core endpoint — it runs semantic search over the corpus, reranks, and returns source-attributed chunks ready to feed to your model.
/v1/retrieve
Send a natural-language query with optional filters and retrieval options. The response is structured JSON, never a generated answer.
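As a rough sketch of what a call to this endpoint might look like from Python, the helper below builds (but does not send) the request using only the standard library. The base URL, the POST method, and the Bearer authorization header are assumptions made for illustration; this page only gives the path /v1/retrieve, so check your account documentation for the real host and auth scheme.

```python
import json
import urllib.request

# Hypothetical host: this page documents only the path /v1/retrieve.
BASE_URL = "https://api.example.com"


def build_retrieve_request(api_key, query, filters=None, top_k=10,
                           rerank=True, include_segments=False):
    """Build (but do not send) a request for /v1/retrieve."""
    if not query:
        raise ValueError("query is required")
    if not 1 <= top_k <= 50:
        raise ValueError("top_k must be between 1 and 50")
    body = {
        "query": query,
        "top_k": top_k,
        "rerank": rerank,
        "include_segments": include_segments,
    }
    if filters:
        body["filters"] = filters
    return urllib.request.Request(
        BASE_URL + "/v1/retrieve",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme, not confirmed by this page.
            "Authorization": "Bearer " + api_key,
        },
        method="POST",  # assumed, since the endpoint takes a JSON body
    )
```

To send it, something like `with urllib.request.urlopen(req) as resp: data = json.load(resp)` should work, after which `data["chunks"]` and `data["meta"]` hold the response described on this page.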
Request body#
query (string, required)
filters (object, optional): See the filters fields below.
top_k (number, optional): 1–50. Defaults to 10, and is clamped to your plan’s maximum (Free 10, Pro 25, Enterprise 50); a higher value is silently capped, never rejected.
rerank (boolean, optional): When true (default), results are semantically reranked over a wider candidate pool and recency-boosted so the most relevant and recent passages rank first. Set false to skip reranking entirely and return chunks in raw vector-similarity order, which is faster and useful when you want to apply your own ranking.
include_segments (boolean, optional): When true, each chunk’s source includes a segments array (the underlying document segments with character offsets) so you can map a citation back to an exact span for highlighting. Defaults to false.
filters#
tickers (string[], optional)
year (number, optional)
quarter ("Q1" | "Q2" | "Q3" | "Q4", optional)
source_types (string[], optional)
Example request#
{
  "query": "What did management say about data center capex?",
  "filters": {
    "tickers": ["NVDA"],
    "year": 2025,
    "quarter": "Q1",
    "source_types": ["earnings_call"]
  },
  "top_k": 8,
  "rerank": true,
  "include_segments": false
}
Response#
Returns a chunks array and a meta object. Each chunk is a passage with provenance you can cite directly.
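One way to use that provenance is to render each chunk as a labelled passage before handing it to a model, so the model can cite by chunk id. The sketch below is only an illustration: the function name format_context and the output layout are made up here, and it relies only on the id, text, and source fields listed in this section.

```python
def format_context(chunks):
    """Render chunks as id-labelled, source-attributed passages for a prompt."""
    passages = []
    for chunk in chunks:
        source = chunk["source"]
        label = source["documentTitle"]
        # pageNumbers is optional, so only mention pages when present.
        pages = source.get("pageNumbers")
        if pages:
            label += ", p. " + ", ".join(str(p) for p in pages)
        passages.append(
            "[{}] {}\n(Source: {})".format(chunk["id"], chunk["text"], label)
        )
    return "\n\n".join(passages)
```

Keeping the chunk id in the rendered text makes it straightforward to map a citation marker in the model's answer back to the chunk it came from.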
chunk#
id (string, required): For example, chunk_01. Useful for citation markers.
text (string, required)
score (number, optional)
evidenceText (string, optional)
source (object, required)
source#
documentId (string, required)
documentTitle (string, required)
documentType (string, required)
ticker (string | null, optional)
year (number | null, optional)
quarter (string | null, optional)
filingType (string | null, optional)
sourceUrl (string | null, optional)
pageNumbers (number[], optional)
segments (object[], optional): Present only when include_segments is true. The document segments behind this chunk, each with id, sequence, content, and charStart/charEnd offsets for highlighting. See below.
source.segments#
Returned per chunk only when you pass include_segments: true. Each entry is a contiguous segment of the source document.
id (string, required)
sequence (number, required)
content (string, required)
charStart (number, required)
charEnd (number, required)
Example response#
{
  "chunks": [
    {
      "id": "chunk_01",
      "text": "Data center revenue grew 25% sequentially to a record $22.6 billion, driven by Hopper demand...",
      "score": 0.18,
      "evidenceText": "Data center revenue grew 25% sequentially to a record $22.6 billion.",
      "source": {
        "documentId": "doc_8f2a1c",
        "documentTitle": "NVIDIA Q1 2025 Earnings Call",
        "documentType": "earnings_call",
        "ticker": "NVDA",
        "year": 2025,
        "quarter": "Q1",
        "filingType": null,
        "sourceUrl": null
      }
    }
  ],
  "meta": {
    "total": 8,
    "periodMismatch": null,
    "requestId": "req_a1b2c3"
  }
}
Highlighting with segments#
Pass include_segments: true to get the underlying document segments on each chunk’s source, with character offsets you can use to highlight the exact span inside the full document.
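A minimal sketch of using those offsets follows. It rebuilds the document text by joining segments in sequence order with a blank-line separator ("\n\n") and treats charEnd as exclusive, as this section describes. Two assumptions are not confirmed by this page: that the segments you hold cover the document from its first segment onward, and that offsets count characters the way Python string indices do. The helper names and the [[ ]] markers are illustrative.

```python
SEPARATOR = "\n\n"  # join used when offsets are computed, per this section


def rebuild_document(segments):
    """Concatenate segments in sequence order to recover the offset space."""
    ordered = sorted(segments, key=lambda s: s["sequence"])
    return SEPARATOR.join(s["content"] for s in ordered)


def highlight(document, segment, open_mark="[[", close_mark="]]"):
    """Wrap the span [charStart, charEnd) of one segment in marker strings."""
    start, end = segment["charStart"], segment["charEnd"]
    # Guard against a mismatched join or a partial segment list.
    if document[start:end] != segment["content"]:
        raise ValueError("offsets do not match the reconstructed text")
    return document[:start] + open_mark + document[start:end] + close_mark + document[end:]
```

The equality check is a deliberate design choice: if the document was reassembled with a different separator, or from an incomplete segment list, it fails loudly instead of highlighting the wrong span.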
"segments": [
  {
    "id": "seg_0",
    "sequence": 0,
    "content": "Thanks, and good afternoon, everyone.",
    "charStart": 0,
    "charEnd": 37
  },
  {
    "id": "seg_1",
    "sequence": 1,
    "content": "Data center revenue grew 25% sequentially to a record $22.6 billion.",
    "charStart": 39,
    "charEnd": 107
  }
]
charStart and charEnd are computed at read time by concatenating the document’s segments in sequence order, joined with a \n\n separator. They are positions into that reconstructed text, not stored columns, so use the same join when you reassemble the document, and treat charEnd as exclusive.
Period fallback#
If you request a specific year and quarter that isn’t in the corpus, the API does not return an empty result. It serves the nearest prior period and flags it in meta.periodMismatch, because forward-looking guidance for, say, Q4 often lives in the Q3 call.
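A caller might check for this fallback before using the chunks. The sketch below reads the requested and served fields of meta.periodMismatch and turns them into a warning string; the function name period_notice and the wording of the warning are illustrative, not part of the API.

```python
def period_notice(meta):
    """Return a warning string if a fallback period was served, else None."""
    mismatch = meta.get("periodMismatch")
    if not mismatch:
        return None
    requested = mismatch["requested"]
    served = ", ".join(mismatch["served"])
    return (
        "Requested {} is not in the corpus; the evidence comes from {}. "
        "Do not present it as {} evidence."
    ).format(requested, served, requested)
```

The same string can be shown to users and prepended to the model prompt, so both see that the evidence is from an earlier period.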
"periodMismatch": {
"requested": "Q4 2025",
"served": ["AMD Q3 2025"],
"message": "No document for Q4 2025 exists in the corpus. The returned chunks come from the nearest prior period(s): AMD Q3 2025..."
}
When periodMismatch is present, the chunks are from a different period than you asked for. Surface this to your users and instruct your model not to present prior-period evidence as the requested period.
Errors#
Errors use the standard envelope ({ success: false, error: {...} }). Common statuses: 400 (invalid request body), 401 (bad API key), 429 (rate limited). See Errors.
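As a hedged sketch of client-side handling, the helper below raises on the error envelope and passes successful payloads through. It assumes only what this section states: an error response carries success: false and an error object. The contents of that object are not described here, so it is kept opaque. The class and function names are illustrative, and treating 429 as retryable is a design choice, not something the API prescribes.

```python
class RetrieveError(Exception):
    """Raised when /v1/retrieve returns the error envelope."""

    def __init__(self, status, error):
        super().__init__("retrieve failed with HTTP {}: {}".format(status, error))
        self.status = status
        self.error = error  # opaque: its fields are not documented on this page
        self.retryable = status == 429  # rate limited, so a later retry may succeed


def check_response(status, payload):
    """Return the payload unchanged, or raise RetrieveError on failure."""
    if status >= 400 or payload.get("success") is False:
        raise RetrieveError(status, payload.get("error"))
    return payload
```

A caller could catch RetrieveError, back off and retry when retryable is true, and surface 400 or 401 failures immediately, since retrying those without changing the request or key will not help.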