Understanding Chunks & Retrieval

What Are Chunks?

A “chunk” is a piece of a document that has been processed and indexed for search. When you ask a question, the system searches these chunks to find relevant information, then passes the most relevant chunks to the language model to generate a response.

Why Is Content Split?

Language models have context limits—they can only process a certain amount of text at once. Additionally, searching through entire documents would be slow and imprecise. Splitting content into chunks provides several benefits:

  • Precision: Smaller chunks mean more targeted results
  • Context limits: Fits within LLM token limits
  • Speed: Faster vector search on smaller units
  • Relevance: Each chunk can be scored independently

Chunk Size Trade-offs

Smaller Chunks

✓ More precise retrieval
✓ Better for specific facts
✗ May lose context
✗ More chunks to search

Larger Chunks

✓ More context preserved
✓ Better for complex topics
✗ May include irrelevant content
✗ Uses more context window

The system uses adaptive chunking based on document type, typically targeting 500-1500 characters per chunk with semantic boundaries.
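The idea can be illustrated with a minimal sketch. The helper below splits on paragraph boundaries (one simple kind of semantic boundary) while keeping chunks under a size cap; the actual adaptive, document-type-aware logic is more involved, and `chunk_text` and its parameters are illustrative names, not the system's real API.

```python
def chunk_text(text, max_size=1500):
    """Split text into chunks of at most max_size characters,
    preferring paragraph boundaries so chunks stay semantically coherent.
    (The system targets roughly 500-1500 characters per chunk.)"""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        candidate = (current + "\n\n" + para) if current else para
        if len(candidate) <= max_size:
            # Paragraph still fits: grow the current chunk.
            current = candidate
        else:
            # Close the current chunk and start a new one at this boundary.
            if current:
                chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Because splits only happen between paragraphs, no sentence is ever cut in half, which helps each chunk remain independently understandable at retrieval time.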

Semantic vs. Keyword Matching

Semantic Search (Vector)

Converts your query and chunks into mathematical vectors (embeddings) and finds chunks whose meaning is similar, even if they use different words.

Example: “Bitcoin ETF cost” matches “BITB expense ratio is 0.20%”
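Under the hood, "similar meaning" is measured geometrically, typically with cosine similarity between embedding vectors. The sketch below uses tiny hand-made 3-dimensional vectors purely to show the mechanism; real embeddings come from a learned model (2048 dimensions in this system).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: closer to 1.0 = more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Toy "embeddings" (hypothetical values, hand-made for illustration).
query_vec = [0.9, 0.1, 0.0]  # "Bitcoin ETF cost"
chunk_a   = [0.8, 0.2, 0.1]  # "BITB expense ratio is 0.20%"
chunk_b   = [0.0, 0.1, 0.9]  # an unrelated chunk

# The matching chunk scores higher even though it shares no words with the query.
assert cosine_similarity(query_vec, chunk_a) > cosine_similarity(query_vec, chunk_b)
```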

Keyword Search (BM25)

Traditional text matching that finds chunks containing the exact words in your query. Excellent for specific terms, names, and tickers.

Example: “BITB” finds chunks containing exactly “BITB”

The system can use both approaches together (hybrid search), merging the two result lists so that semantic recall and exact-term precision complement each other. See Experiment Variants to learn about enabling hybrid search.
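One common way to merge the two ranked lists is reciprocal rank fusion (RRF): a chunk scores well if it ranks highly in either list, and best if it ranks highly in both. This is a sketch of the general technique; the merge strategy this system actually uses is not specified here.

```python
def reciprocal_rank_fusion(result_lists, k=60):
    """Merge several ranked lists of chunk IDs into one.
    Each appearance at rank r contributes 1 / (k + r + 1) to the chunk's score."""
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["c1", "c3", "c2"]  # hypothetical vector-search ranking
keyword  = ["c3", "c4", "c1"]  # hypothetical BM25 ranking
merged = reciprocal_rank_fusion([semantic, keyword])
# "c3" wins: it ranks near the top of both lists.
```

The constant `k` damps the influence of top ranks so that one list cannot completely dominate the fused ordering.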

Why Some Content Isn't Found

If the system doesn't find content you expect, there are several possible reasons:

Common Retrieval Issues

  • Relevance threshold: Content exists but scored below the threshold
  • Semantic gap: Query wording too different from document language
  • Not ingested: Document hasn't been added to the knowledge base
  • Chunking split: Information split across multiple chunks
  • Entity mismatch: Query mentions a product not in the chunk
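The first issue, the relevance threshold, is worth seeing concretely: a chunk can exist in the index, match your query reasonably well, and still be dropped because its score falls just under the cutoff. A minimal sketch (function name and scores are illustrative):

```python
def filter_by_relevance(scored_chunks, threshold=0.7):
    """Keep only (chunk, score) pairs at or above the threshold.
    Content scoring just below it is silently excluded from results."""
    return [(chunk, score) for chunk, score in scored_chunks if score >= threshold]

results = [("chunk A", 0.82), ("chunk B", 0.68), ("chunk C", 0.91)]

filter_by_relevance(results)                 # drops chunk B (0.68 < 0.7)
filter_by_relevance(results, threshold=0.6)  # keeps all three
```

This is why lowering the threshold (step 3 of the debugging checklist below) can surface content that "wasn't found" before.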

Debugging Retrieval Issues

To troubleshoot why content isn't being retrieved:

  1. Check the sources panel: After each response, review the source documents shown. Are the expected sources appearing at all?
  2. Try different wording: Rephrase your query using terms that appear in the document. Specific product names and tickers often help.
  3. Lower the relevance threshold: In retrieval settings, try reducing the minimum relevance score to see if content appears.
  4. Enable hybrid search: Use the “Hybrid” experiment variant to add keyword matching alongside semantic search.
  5. Check entity filters: If asking about a specific product, ensure the entity overlap filter isn't being too aggressive.

The Retrieval Pipeline

Query → Embed → Vector Search → Filter → Rerank → Generate
  1. Query: User's question is analyzed for complexity and intent
  2. Embed: Query converted to 2048-dimensional vector
  3. Vector Search: Find similar chunks in Vertex AI Vector Search
  4. Filter: Apply length, relevance, and entity filters
  5. Rerank: Optionally reorder results for better relevance
  6. Generate: Pass top chunks to LLM for response generation
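The six stages above can be sketched as one function that chains them together. Everything here is hypothetical scaffolding — `index`, `embed`, `rerank`, and `generate` stand in for the real components (e.g. the embedding model and Vertex AI Vector Search), whose actual interfaces are not shown in this article.

```python
def retrieve_and_generate(query, index, embed, rerank, generate,
                          top_k=20, threshold=0.7, final_k=5):
    """End-to-end sketch of the retrieval pipeline stages.
    index/embed/rerank/generate are hypothetical callables."""
    query_vec = embed(query)                              # 2. Embed the query
    hits = index.search(query_vec, top_k=top_k)           # 3. Vector search
    hits = [(c, s) for c, s in hits if s >= threshold]    # 4. Filter by relevance
    hits = rerank(query, hits)[:final_k]                  # 5. Rerank, keep top chunks
    context = "\n\n".join(chunk for chunk, _ in hits)
    return generate(query, context)                       # 6. Generate the response
```

Each stage narrows the candidate set: the vector search casts a wide net (`top_k`), the filter and reranker prune it, and only the final handful of chunks is spent against the LLM's context window.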
