Search & Chat
Hybrid Search
Search combines two retrieval strategies fused with Reciprocal Rank Fusion (RRF):
Dense (HNSW) — Semantic similarity via embeddings (understands meaning)
Sparse (BM25) — Keyword matching for exact terms (handles acronyms, IDs)
Query Pipeline

Optional Features
Feature | Description | When to use |
|---|---|---|
Query condensation | Rewrites multi-turn conversations into standalone queries | Multi-turn chat |
HyDE | Generates hypothetical answer, embeds that instead | Ambiguous queries |
Reranking | Cross-encoder rescoring of top candidates | Higher precision needed |
Parent resolution | Returns parent chunks for broader context | Need surrounding context |
Metadata Filtering
Every search is automatically scoped by:
tenant_id— strict isolation between tenantsparse_generation— excludes stale chunks from re-parsed documentsavailable_int— respects soft-deleted documents
Streaming Chat
The RAG chat provides conversational access to knowledge bases with citation support.
OpenAI-Compatible Endpoint
Request
Field | Type | Required | Description |
|---|---|---|---|
| array | Yes | Conversation history (last must be |
| boolean | No | Stream response via SSE (default: true) |
| string | No | Override the dialog's configured model |
| float | No | Override temperature (0.0–2.0) |
| integer | No | Override max output tokens |
Response (SSE)
Citations
Search results include positional metadata for citation overlays:
Field | Description |
|---|---|
| Source document UUID |
| Page number in original document |
| Block type: |
| Bounding box coordinates |
The frontend uses these to highlight the exact source location in a PDF viewer.
Dialogs
A dialog is a pre-configured chat profile that binds together:
An LLM model
A system prompt
One or more datasets to search
Search parameters (top_k, similarity threshold, reranking)
Dialogs can be shared externally via dialog API tokens for embedded chat widgets.
LLM Integration
All LLM calls go through LiteLLM proxy — no direct vendor SDK imports:
Model identifiers: openai/gpt-4o, anthropic/claude-3-5-sonnet, ollama/llama3
Configure providers in Settings → Providers with API key and base URL.