Retrieval Settings Guide

Overview

The retrieval settings control how the Knowledge Assistant filters and ranks documents before generating a response. These settings allow you to fine-tune the balance between precision (returning only highly relevant results) and recall (returning more results that might be useful). All settings can be configured in the Admin Console.

Content Length Filter

This filter removes chunks that are too short to contain meaningful information. Very short chunks often contain headers, footers, or fragmented text that adds noise without providing useful context.

Enable Content Length Filter

When enabled, chunks shorter than the minimum length are removed from search results before being passed to the language model.

Default: Enabled

Minimum Chunk Length

The minimum number of characters a chunk must have to be included in results. Lower values include more content but may introduce noise. Higher values ensure only substantial content is used but may filter out valid short answers.

Default: 100 characters | Range: 0-500 characters
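
As a rough illustration, the sketch below applies the minimum-length filter to a list of retrieved chunks. The Chunk shape, field names, and sample text are assumptions for illustration, not the product's actual data model.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    score: float  # relevance score as a fraction, e.g. 0.38 = 38%

def filter_short_chunks(chunks, min_length=100, enabled=True):
    """Drop chunks whose text is shorter than min_length characters."""
    if not enabled:
        return chunks
    return [c for c in chunks if len(c.text) >= min_length]

# A header-only fragment is dropped; the substantive chunk survives.
chunks = [
    Chunk(text="Fees and Expenses", score=0.42),
    Chunk(text="The fund charges an annual expense ratio that accrues daily and is "
               "deducted from fund assets rather than billed directly to shareholders.",
          score=0.38),
]
print(filter_short_chunks(chunks, min_length=100))
```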

When to Adjust

  • Increase if you're seeing fragmented or unhelpful content in responses
  • Decrease if important short facts (like definitions or key metrics) are being filtered out
  • Disable if your knowledge base contains many short-form documents like FAQs

Complexity-Aware Retrieval

This feature adjusts how many candidate documents are retrieved based on the complexity of the user's query. Simple, straightforward questions need fewer candidates, while complex questions benefit from a wider search.

Enable Complexity-Aware Candidate K

When enabled, the system analyzes query complexity and adjusts the number of candidate documents retrieved. Simple queries (e.g., "What is BITB's expense ratio?") retrieve fewer candidates, reducing noise and improving response speed.

Default: Enabled

Simple Query Multiplier

Controls how much to reduce the candidate pool for simple queries. A multiplier of 2x means simple queries retrieve half as many candidates as complex queries. Higher values make the reduction more aggressive; a value of 1x disables it.

Default: 2x | Range: 1x-5x
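
To make the arithmetic concrete, here is a minimal sketch of how the candidate pool might be shrunk for simple queries. The base_k value and the function name are illustrative assumptions, not the shipped implementation.

```python
def candidate_k(base_k: int, complexity: str, simple_multiplier: float = 2.0) -> int:
    """Number of candidate documents to retrieve for a query.

    Simple queries retrieve base_k / simple_multiplier candidates;
    medium and complex queries keep the full candidate pool.
    """
    if complexity == "simple":
        return max(1, round(base_k / simple_multiplier))
    return base_k

print(candidate_k(base_k=40, complexity="simple"))   # 20 with the 2x default
print(candidate_k(base_k=40, complexity="complex"))  # 40
```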

How Query Complexity is Determined

The system classifies queries into complexity levels based on the following criteria (a rough heuristic is sketched after the list):

  • Simple: Direct factual questions with clear answers (e.g., "What is X?")
  • Medium: Questions requiring synthesis of multiple facts
  • Complex: Analytical questions, comparisons, or multi-part queries
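
The exact classifier is not documented here, but a lightweight keyword heuristic along the following lines illustrates the idea; the patterns and function name are assumptions, not the shipped logic.

```python
import re

def classify_complexity(query: str) -> str:
    """Very rough heuristic: comparisons, analysis, and multi-part questions
    are complex; synthesis-style questions are medium; direct lookups are simple."""
    q = query.lower()
    if re.search(r"\b(compare|versus|vs|difference|analyze|analyse|why)\b", q) or q.count("?") > 1:
        return "complex"
    if re.search(r"\b(how|trend|impact|over time)\b", q):
        return "medium"
    return "simple"

print(classify_complexity("What is BITB's expense ratio?"))     # simple
print(classify_complexity("How has BITB performed this year?"))  # medium
print(classify_complexity("How does BITB compare to GBTC?"))     # complex
```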

Intent-Aware Relevance Thresholds

Different types of questions have different relevance requirements. A factual question about a specific product needs highly relevant results, while an exploratory question about a broad topic can benefit from a wider range of moderately relevant content.

Enable Intent-Aware Thresholds

When enabled, the minimum relevance score required for a chunk to be included varies based on the detected intent of the query. This allows the system to be more strict for factual queries and more permissive for exploratory ones.

Default: Enabled

Query Intent Types

Factual Queries

Questions seeking specific facts, numbers, or definitions. Examples: "What is BITB's expense ratio?" or "When was ETHW launched?"

Behavior: Higher threshold ensures only highly relevant chunks are used, reducing the risk of incorrect information.

Default threshold: 20%

Comparison Queries

Questions comparing multiple items or asking about differences. Examples: "How does BITB compare to GBTC?" or "What's the difference between spot and futures ETFs?"

Behavior: Moderate threshold allows retrieving information about multiple products to enable meaningful comparisons.

Default threshold: 15%

Temporal Queries

Questions about time-based information, trends, or historical data. Examples: "What happened to Bitcoin in 2024?" or "How has BITB performed this year?"

Behavior: Moderate threshold with preference for recent documents.

Default threshold: 15%

Exploratory Queries

Broad questions seeking general information or learning about a topic. Examples: "Tell me about crypto ETFs" or "What products does Bitwise offer?"

Behavior: Lower threshold includes a wider range of content to provide comprehensive overviews.

Default threshold: 10%
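
Taken together, these defaults amount to an intent-to-threshold lookup applied while filtering chunks. The sketch below assumes relevance scores are expressed as fractions of 1 and that the intent label has already been detected; the names are illustrative.

```python
# Default minimum relevance thresholds per detected intent (as fractions).
INTENT_THRESHOLDS = {
    "factual": 0.20,
    "comparison": 0.15,
    "temporal": 0.15,
    "exploratory": 0.10,
}

def apply_intent_threshold(scored_chunks, intent, enabled=True, default_threshold=0.15):
    """scored_chunks: list of (score, text) pairs with scores in [0, 1]."""
    if not enabled:
        return scored_chunks
    threshold = INTENT_THRESHOLDS.get(intent, default_threshold)
    return [(score, text) for score, text in scored_chunks if score >= threshold]

chunks = [(0.22, "BITB's prospectus states the expense ratio."),
          (0.12, "General crypto market overview.")]
print(apply_intent_threshold(chunks, "factual"))      # keeps only the 0.22 chunk
print(apply_intent_threshold(chunks, "exploratory"))  # keeps both
```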

Understanding Relevance Scores

Relevance scores range from 0% to 100%, representing how closely a chunk matches the query:

  • 0-10%: Low relevance, likely unrelated content
  • 10-20%: Marginal relevance, may contain tangentially related information
  • 20-40%: Moderate relevance, contains related information
  • 40%+: High relevance, directly addresses the query
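
If it helps to translate raw scores into the bands above, a trivial helper (purely illustrative) could look like this:

```python
def relevance_band(score_pct: float) -> str:
    """Map a relevance score (percentage) onto the bands described above."""
    if score_pct >= 40:
        return "high"
    if score_pct >= 20:
        return "moderate"
    if score_pct >= 10:
        return "marginal"
    return "low"

print(relevance_band(35))  # moderate
print(relevance_band(8))   # low
```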

Entity Overlap Filter

This filter ensures that when users ask about specific products or assets, the retrieved chunks actually mention those entities. This prevents the model from conflating information about different products.

Enable Entity Overlap Filter

When enabled and a user asks about a specific product (e.g., "BITB"), chunks that don't mention that product are filtered out. This is only applied to factual queries where precision is critical.

Default: Enabled

Fallback Chunk Count

If the entity filter is too aggressive and removes all chunks, the system falls back to keeping the top N chunks by relevance score. This ensures there's always some context for the model to work with.

Default: 3 chunks | Range: 1-10 chunks
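
A minimal sketch of the entity filter and its fallback, assuming chunks are (score, text) pairs and entities are matched by case-insensitive substring search (the production matcher may be more sophisticated):

```python
def entity_overlap_filter(scored_chunks, entities, enabled=True, fallback_count=3):
    """Keep chunks mentioning at least one queried entity; if nothing
    survives, fall back to the top fallback_count chunks by score."""
    if not enabled or not entities:
        return scored_chunks
    wanted = [e.lower() for e in entities]
    kept = [(s, t) for s, t in scored_chunks if any(e in t.lower() for e in wanted)]
    if kept:
        return kept
    # Fallback: never leave the model with zero context.
    return sorted(scored_chunks, key=lambda pair: pair[0], reverse=True)[:fallback_count]

chunks = [(0.3, "BITB is a spot Bitcoin ETF."), (0.5, "ETHW tracks Ether.")]
print(entity_overlap_filter(chunks, ["BITB"]))  # only the BITB chunk
print(entity_overlap_filter(chunks, ["GBTC"]))  # nothing mentions GBTC, so the top chunks by score are kept
```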

When Entity Filtering Applies

  • Applied: Factual queries mentioning specific products or assets
  • Not applied: Exploratory queries, comparisons, or general questions

Recommended Configurations

  • High precision (factual queries): all filters enabled, higher thresholds (25%+ for factual)
  • Broad coverage (exploratory queries): lower thresholds (5-10%), entity filter disabled
  • FAQ-style knowledge base: lower minimum chunk length (50), all other settings at defaults
  • Document-heavy knowledge base: higher minimum chunk length (150), complexity-aware retrieval enabled
  • Balanced (default): all defaults; a good balance of precision and recall
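
As an illustration of how such presets might be expressed, here is a hypothetical configuration sketch; the key names are assumptions and will not match the Admin Console's actual field identifiers.

```python
# Hypothetical preset definitions; key names are illustrative only.
PRESETS = {
    "high_precision": {
        "content_length_filter": True,
        "min_chunk_length": 100,
        "complexity_aware_k": True,
        "intent_thresholds": {"factual": 0.25, "comparison": 0.20,
                              "temporal": 0.20, "exploratory": 0.15},
        "entity_overlap_filter": True,
    },
    "broad_coverage": {
        "content_length_filter": True,
        "min_chunk_length": 100,
        "complexity_aware_k": True,
        "intent_thresholds": {"factual": 0.10, "comparison": 0.08,
                              "temporal": 0.08, "exploratory": 0.05},
        "entity_overlap_filter": False,
    },
    "faq_knowledge_base": {
        "content_length_filter": True,
        "min_chunk_length": 50,
        "complexity_aware_k": True,
        "intent_thresholds": {"factual": 0.20, "comparison": 0.15,
                              "temporal": 0.15, "exploratory": 0.10},
        "entity_overlap_filter": True,
    },
}
```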

Monitoring Impact

After adjusting settings, monitor the impact using the Analytics Dashboard:

  • User feedback: Track positive/negative feedback rates
  • Source clicks: Higher click rates suggest more relevant sources
  • Response times: More aggressive filtering generally improves speed, since fewer chunks are retrieved and passed to the model
  • Follow-up questions: Fewer follow-ups may indicate better initial answers
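
As a rough example of the kind of before-and-after comparison worth running, the snippet below computes feedback and click rates from hypothetical analytics records; the record shape is an assumption, not the Analytics Dashboard's export format.

```python
# Hypothetical per-query analytics records.
records = [
    {"feedback": "positive", "source_clicked": True,  "latency_ms": 820},
    {"feedback": "negative", "source_clicked": False, "latency_ms": 640},
    {"feedback": "positive", "source_clicked": True,  "latency_ms": 710},
    {"feedback": None,       "source_clicked": True,  "latency_ms": 590},
]

rated = [r for r in records if r["feedback"] in ("positive", "negative")]
positive_rate = sum(r["feedback"] == "positive" for r in rated) / len(rated)
click_rate = sum(r["source_clicked"] for r in records) / len(records)
avg_latency = sum(r["latency_ms"] for r in records) / len(records)

print(f"positive feedback: {positive_rate:.0%} | source clicks: {click_rate:.0%} | avg latency: {avg_latency:.0f} ms")
```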