Guide

What is deterministic retrieval?

Deterministic retrieval returns the same ranked results for the same query — every time. No embeddings, no approximate search, no drift. Here's what that means, how it differs from vector search, and why it matters for AI agents.

Deterministic vs probabilistic

Most modern retrieval is probabilistic. A query is embedded by a neural model, then matched against a vector index with approximate nearest-neighbor search and often a reranker. Each of those steps can shift between model versions or re-indexes, so the “same” query can return different results on different days. We call that the hot state.

Deterministic retrieval is the cold state: identical query in, identical ranking out — byte for byte. You can cache it, snapshot it, and diff it in CI, because nothing about it drifts between runs.

How it works

Instead of embedding text into vectors, ColdState indexes documents once and scores relevance from whole-word tokenization — a 2.8M-word vocabulary — plus transparent, concept-cluster ranking. There is no embedding model and no GPU in the query path, so results are computed the same way every time and the score can be explained token by token rather than inferred from a cosine distance.

Why agents need it

Reproducibility is what makes an agent debuggable. With deterministic retrieval, an AI agent can pin a knowledge snapshot for a whole run, cite a fact by its content hash, and later verify that the source hasn’t changed. Evals stop flaking, failures replay exactly, and every ranking decision can be audited after the fact.

Deterministic isn’t keyword-only

Deterministic doesn’t mean naïve string matching. ColdState models relevance and cross-domain structure through its cluster graph — it just does so without a stochastic embedding step at query time. If you’re weighing it against a RAG pipeline, see alternatives to RAG.


From concept to API

Search that can't drift

ColdState's deterministic search API gives you reproducible, explainable retrieval — free to start, no embeddings, no per-query inference cost.

Deterministic — same query, same ranked results, with a token-level reason.
Probabilistic — embeddings + approximate search that shift between runs.