Glossary of Terms

This glossary defines technical terms used throughout the Knowledge Assistant documentation. Terms are listed alphabetically.

BM25

A keyword-based ranking function used for text retrieval. Unlike semantic search, BM25 matches exact terms and is excellent for finding documents with specific words like product names or tickers.
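As a sketch, the standard BM25 formula can be computed directly; the parameter values k1=1.5 and b=0.75 below are common defaults, not values taken from this system:

```python
import math

def bm25_score(query_terms, doc_terms, corpus, k1=1.5, b=0.75):
    """Score one tokenized document against a query with BM25."""
    n = len(corpus)
    avgdl = sum(len(d) for d in corpus) / n  # average document length
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)       # document frequency
        idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
        tf = doc_terms.count(term)                     # term frequency
        norm = 1 - b + b * len(doc_terms) / avgdl      # length normalization
        score += idf * (tf * (k1 + 1)) / (tf + k1 * norm)
    return score
```

Because matching is on exact tokens, a document that never contains the query term scores zero, which is why BM25 excels at ticker- and name-style lookups.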

Chunk

A segment of a document that has been processed and indexed for search. Documents are split into chunks to enable precise retrieval and fit within LLM context limits.
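A minimal character-based chunker with overlap illustrates the idea; real systems often chunk by tokens or sentences, and the sizes here are illustrative only:

```python
def chunk_text(text, size=500, overlap=50):
    """Split text into overlapping fixed-size character chunks."""
    chunks = []
    step = size - overlap  # each chunk repeats `overlap` chars of the last
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break
    return chunks
```

The overlap ensures a sentence straddling a chunk boundary still appears whole in at least one chunk.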

Circuit Breaker

A fault tolerance pattern that prevents cascading failures. When an external API fails repeatedly, the circuit "opens" to stop making requests, protecting system stability.
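A minimal sketch of the pattern (thresholds and timeouts here are assumptions, not this system's configuration):

```python
import time

class CircuitBreaker:
    """Opens after `max_failures` consecutive failures; rejects calls
    until `reset_timeout` seconds pass, then allows a trial request."""

    def __init__(self, max_failures=3, reset_timeout=30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: request rejected")
            self.opened_at = None  # half-open: permit one trial call
            self.failures = 0
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0  # any success resets the failure count
        return result
```

Rejecting calls while open lets the failing dependency recover instead of being hammered by retries.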

Cross-Encoder

A reranking model that processes query-document pairs together (rather than separately) for more accurate relevance scoring. Slower but more precise than bi-encoders.

Embedding

A numerical vector representation of text that captures its semantic meaning. Similar texts have similar embeddings, enabling semantic search.
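"Similar embeddings" is typically measured with cosine similarity, which can be sketched without any ML library:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors (1.0 = identical
    direction, 0.0 = orthogonal/unrelated)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm
```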

Entity

A recognized named item in a query, such as a product ticker (BITB), crypto asset (Bitcoin), or index name (BITWISE10). Entity recognition enables targeted retrieval.
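A toy illustration of ticker extraction; the regex below is an assumption for demonstration, since production entity recognition would typically use a curated dictionary or an NER model:

```python
import re

# Illustrative pattern: 3-5 consecutive uppercase letters on word
# boundaries, a common shape for fund tickers like BITB.
TICKER = re.compile(r"\b[A-Z]{3,5}\b")

def extract_tickers(query):
    """Return candidate ticker strings found in a query."""
    return TICKER.findall(query)
```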

Faithfulness

A quality metric measuring whether a generated response is supported by the retrieved context. High faithfulness means the answer is grounded in source material, not hallucinated.

HyDE (Hypothetical Document Embedding)

A technique where the LLM generates a hypothetical answer to the query, which is then embedded and used for retrieval. Helps find relevant documents for vague queries.

Intent

The classified purpose of a query: factual (seeking specific facts), comparison (comparing items), temporal (time-based), or exploratory (general learning).

MCP (Model Context Protocol)

A protocol for providing LLMs with external data and tools. The Knowledge Assistant uses MCP to fetch live market data from the Bitwise API.

MRR (Mean Reciprocal Rank)

A retrieval quality metric: the average, across queries, of 1/rank of the first relevant result. An MRR of 1.0 means the first relevant result is always ranked first; lower values mean relevant content appears further down the list.
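The computation is a few lines; here each query's results are represented as a list of booleans marking relevance:

```python
def mean_reciprocal_rank(ranked_relevance):
    """Average of 1/rank of the first relevant result per query.
    ranked_relevance: list (per query) of lists of booleans."""
    total = 0.0
    for results in ranked_relevance:
        for rank, relevant in enumerate(results, start=1):
            if relevant:
                total += 1.0 / rank
                break  # only the first relevant result counts
    return total / len(ranked_relevance)
```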

Precision@K

The percentage of the top K retrieved results that are actually relevant. High precision means less noise in results.
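Expressed as a fraction, the metric is:

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved IDs that are in the relevant set."""
    top_k = retrieved[:k]
    return sum(1 for doc in top_k if doc in relevant) / k
```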

RAG (Retrieval-Augmented Generation)

A technique where an LLM generates responses based on retrieved documents rather than only its training data. This grounds responses in specific, current information.

Recall@K

The percentage of all relevant documents that appear in the top K results. High recall means the system finds most of the relevant content.
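The mirror-image computation to Precision@K, dividing by the number of relevant documents rather than by K:

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant IDs that appear in the top-k retrieved."""
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(relevant)
```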

Reranking

A second-stage retrieval step where initial results are re-scored using a more sophisticated model. Improves result quality at the cost of latency.

RRF (Reciprocal Rank Fusion)

An algorithm for merging ranked lists from different retrieval methods. Documents appearing high in multiple lists get boosted in the final ranking.
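The standard RRF formula sums 1/(k + rank) over every list a document appears in; k=60 is the constant commonly used in the literature, not necessarily this system's setting:

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Merge ranked lists of doc IDs into one fused ranking."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A document ranked highly by both BM25 and semantic search accumulates score from both lists, pushing it above documents that only one method favors.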

TTFT (Time to First Token)

The time between submitting a query and receiving the first token of the response. A key UX metric representing perceived "thinking time."
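Measuring TTFT amounts to timing the gap before the first token arrives from a streaming response; the iterable stream interface below is an assumption for illustration:

```python
import time

def time_to_first_token(token_stream):
    """Return (seconds until first token, the first token) for an
    iterable that yields response tokens as they are generated."""
    start = time.monotonic()
    first = next(iter(token_stream))
    return time.monotonic() - start, first
```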