This glossary defines technical terms used throughout the Knowledge Assistant documentation. Terms are listed alphabetically.
BM25: A keyword-based ranking function used for text retrieval. Unlike semantic search, BM25 matches exact terms, which makes it well suited to finding documents containing specific words such as product names or tickers.
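For illustration, a minimal pure-Python sketch of the BM25 scoring formula (the `k1` and `b` values are common defaults, not necessarily what the Knowledge Assistant uses):

```python
import math

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized doc against the query with the classic BM25 formula."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N  # average document length
    scores = []
    for doc in docs:
        score = 0.0
        for term in query_terms:
            tf = doc.count(term)  # term frequency in this doc
            if tf == 0:
                continue
            df = sum(1 for d in docs if term in d)  # document frequency
            idf = math.log(1 + (N - df + 0.5) / (df + 0.5))
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avgdl))
        scores.append(score)
    return scores

docs = [
    "bitb is a spot bitcoin etf".split(),
    "the weather today is sunny".split(),
]
print(bm25_scores("bitb etf".split(), docs))  # first doc scores higher
```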
Chunk: A segment of a document that has been processed and indexed for search. Documents are split into chunks to enable precise retrieval and to fit within LLM context limits.
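A minimal sketch of fixed-size chunking with overlap (the size and overlap values here are illustrative; the system's actual chunking strategy may differ):

```python
def chunk_text(text, size=200, overlap=50):
    """Split text into overlapping fixed-size character chunks.
    Overlap keeps context that straddles a chunk boundary retrievable."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

chunks = chunk_text("a" * 300, size=200, overlap=50)
# Two chunks; the last 50 chars of chunk 0 repeat as the first 50 of chunk 1.
```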
Circuit Breaker: A fault-tolerance pattern that prevents cascading failures. When an external API fails repeatedly, the circuit "opens" to stop making requests, protecting system stability.
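A minimal sketch of the pattern (the failure threshold and reset timeout are illustrative, not the system's actual configuration):

```python
import time

class CircuitBreaker:
    """Open after `max_failures` consecutive errors; allow a trial call after `reset_after` seconds."""

    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: skipping call")
            self.opened_at = None  # half-open: let one trial call through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0  # success resets the failure count
        return result
```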
Cross-Encoder: A reranking model that scores a query-document pair together (rather than encoding each separately) for more accurate relevance scoring. Slower but more precise than bi-encoders.
Embedding: A numerical vector representation of text that captures its semantic meaning. Similar texts have similar embeddings, which is what enables semantic search.
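Similarity between embeddings is typically measured with cosine similarity; a minimal sketch:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction, 0.0 orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm
```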
Entity: A recognized named item in a query, such as a product ticker (BITB), a crypto asset (Bitcoin), or an index name (BITWISE10). Entity recognition enables targeted retrieval.
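A toy sketch of dictionary-based entity recognition (the entity list and type names are hypothetical examples drawn from this glossary, not the system's real inventory):

```python
# Hypothetical entity dictionary for illustration only.
KNOWN_ENTITIES = {
    "BITB": "product_ticker",
    "BITWISE10": "index",
    "bitcoin": "crypto_asset",
}

def recognize_entities(query):
    """Return (surface form, entity type) pairs found in the query."""
    found = []
    for token in query.replace("?", " ").split():
        # Tickers are matched case-sensitively; other names case-insensitively.
        key = token if token.isupper() else token.lower()
        if key in KNOWN_ENTITIES:
            found.append((token, KNOWN_ENTITIES[key]))
    return found
```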
Faithfulness: A quality metric measuring whether a generated response is supported by the retrieved context. High faithfulness means the answer is grounded in the source material rather than hallucinated.
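As a rough illustration only, a naive lexical-overlap proxy for faithfulness (real evaluation typically uses an LLM judge or entailment model, not word overlap):

```python
def faithfulness_proxy(answer, context,
                       stopwords=frozenset({"the", "a", "is", "of", "in", "and", "to"})):
    """Naive proxy: fraction of answer content words that also appear in the context."""
    answer_words = {w.lower().strip(".,") for w in answer.split()} - stopwords
    context_words = {w.lower().strip(".,") for w in context.split()}
    if not answer_words:
        return 1.0
    return len(answer_words & context_words) / len(answer_words)
```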
HyDE (Hypothetical Document Embeddings): A technique in which the LLM generates a hypothetical answer to the query, which is then embedded and used for retrieval. Helps find relevant documents for vague queries.
Hybrid Search: A retrieval approach that combines semantic vector search with keyword-based BM25 search, merging results using Reciprocal Rank Fusion (RRF) for better coverage.
Intent: The classified purpose of a query. Queries are classified as factual (seeking specific facts), comparison (comparing items), temporal (time-based), or exploratory (general learning).
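A toy rule-based sketch of intent classification (the keyword rules are hypothetical; the actual classifier is likely model-based):

```python
def classify_intent(query):
    """Map a query to one of the four intent labels using illustrative keyword rules."""
    q = query.lower()
    if any(w in q for w in ("vs", "versus", "compare", "difference between")):
        return "comparison"
    if any(w in q for w in ("when", "since", "history", "over time", "latest")):
        return "temporal"
    if any(w in q for w in ("what is", "how much", "who", "price")):
        return "factual"
    return "exploratory"  # fall back to general learning
```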
MCP (Model Context Protocol): A protocol for providing LLMs with external data and tools. The Knowledge Assistant uses MCP to fetch live market data from the Bitwise API.
MRR (Mean Reciprocal Rank): A retrieval quality metric measuring how high the first relevant result is ranked. An MRR of 1.0 means the first relevant result is always at the top; lower values mean relevant content is ranked further down.
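A minimal sketch of the computation, averaging 1/rank of the first relevant result over a set of queries:

```python
def mean_reciprocal_rank(results_per_query, relevant_per_query):
    """Average 1/rank of the first relevant result per query (contributes 0 if none found)."""
    total = 0.0
    for results, relevant in zip(results_per_query, relevant_per_query):
        for rank, doc_id in enumerate(results, start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break
    return total / len(results_per_query)
```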
Precision@K: The percentage of the top K retrieved results that are actually relevant. High precision means less noise in the results.
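A one-function sketch of the metric:

```python
def precision_at_k(results, relevant, k):
    """Fraction of the top-k results that are in the relevant set."""
    return sum(1 for doc in results[:k] if doc in relevant) / k
```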
RAG (Retrieval-Augmented Generation): A technique in which an LLM generates responses based on retrieved documents rather than only its training data. This grounds responses in specific, current information.
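A sketch of how retrieved chunks might be assembled into a grounded prompt before the generation step (the template is hypothetical, not the Assistant's actual prompt):

```python
def build_rag_prompt(query, retrieved_chunks):
    """Assemble a grounded prompt: numbered context chunks first, then the user question."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(retrieved_chunks))
    return (
        "Answer using only the context below. If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )
```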
Recall@K: The percentage of all relevant documents that appear in the top K results. High recall means the system finds most of the relevant content.
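The counterpart computation to precision, normalised by the total number of relevant documents:

```python
def recall_at_k(results, relevant, k):
    """Fraction of all relevant documents that appear in the top-k results."""
    return sum(1 for doc in results[:k] if doc in relevant) / len(relevant)
```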
Reranking: A second-stage retrieval step in which initial results are re-scored by a more sophisticated model. Improves result quality at the cost of added latency.
RRF (Reciprocal Rank Fusion): An algorithm for merging ranked lists produced by different retrieval methods. Documents appearing high in multiple lists get boosted in the final ranking.
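A minimal implementation (k=60 is the constant commonly used in the literature; the system's setting may differ):

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    """Merge ranked lists: each doc's score is the sum of 1/(k + rank) over the lists it appears in."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Because scores depend only on rank positions, RRF needs no tuning to reconcile the incomparable raw scores of BM25 and vector search.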
Semantic Search: A search approach that finds content based on meaning rather than exact keywords. Uses embeddings to match concepts even when different words are used.
Time to First Token (TTFT): The time between submitting a query and receiving the first token of the response. A key UX metric representing perceived "thinking time."
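A sketch of measuring TTFT against a streaming response (the stream here is simulated with a generator standing in for an LLM):

```python
import time

def time_to_first_token(token_stream):
    """Consume a token stream; return (seconds until first token, list of tokens)."""
    start = time.monotonic()
    ttft = None
    tokens = []
    for token in token_stream:
        if ttft is None:
            ttft = time.monotonic() - start  # first token has arrived
        tokens.append(token)
    return ttft, tokens

def fake_stream():
    # Stand-in for an LLM streaming response.
    time.sleep(0.05)  # simulated "thinking time" before the first token
    yield "Hello"
    yield " world"
```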
Vector Search: A search technique that finds items by comparing their embedding vectors. Documents whose vectors are close to the query vector are considered similar.
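A brute-force sketch of the lookup; production systems typically use an approximate nearest-neighbour index instead of scanning every vector:

```python
import math

def top_k_similar(query_vec, doc_vecs, k=3):
    """Rank documents by cosine similarity to the query; return (index, score) pairs."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

    scored = [(i, cos(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda p: p[1], reverse=True)[:k]
```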