Free tool · No sign-up required
HyDE Visualizer —
see why hypothetical embeddings retrieve better.
Paste a query and corpus chunks, generate a hypothetical answer via Claude Haiku 4.5 or GPT-4o mini, and watch all-MiniLM-L6-v2 cosine similarity scores shift as the hypothetical answer moves the query embedding out of question space and toward document space.
Embedding models, LLMs, and frameworks supported
What it shows you
In-browser semantic embeddings
Embedding similarity is computed locally via Transformers.js and WebAssembly — your data never leaves the browser. all-MiniLM-L6-v2 (22 MB) is free. BGE-Small-en-v1.5 (29 MB, stronger on technical text) is a Pro feature.
Real LLM generation
Enter your Anthropic or OpenAI API key to generate a real hypothetical answer document for any query using Claude Haiku 4.5 or GPT-4o mini. Seven pre-built examples run without any API key.
Per-chunk rank delta
Every corpus chunk shows its cosine similarity score under direct query embedding and under HyDE embedding, side by side. The rank delta badge (↑ +2, ↓ -1) shows exactly which chunks moved up or down in the retrieval window.
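One way to compute a rank delta like the one described above is sketched here. The function names, the "=" badge for unchanged chunks, and the scores are illustrative assumptions, not the tool's actual code or output.

```python
def rank_of(scores):
    """Map chunk index -> 1-based rank, highest score first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return {idx: pos + 1 for pos, idx in enumerate(order)}


def rank_deltas(direct_scores, hyde_scores):
    """Positive delta means the chunk moved up under HyDE."""
    direct = rank_of(direct_scores)
    hyde = rank_of(hyde_scores)
    return [direct[i] - hyde[i] for i in range(len(direct_scores))]


def badge(delta):
    """Format a delta in the style of the badges (the "=" case is an assumption)."""
    if delta > 0:
        return f"↑ +{delta}"
    if delta < 0:
        return f"↓ {delta}"
    return "="


# Illustrative scores only, not measured output from any embedding model.
direct_scores = [0.41, 0.38, 0.35, 0.22]
hyde_scores = [0.44, 0.36, 0.61, 0.20]

deltas = rank_deltas(direct_scores, hyde_scores)
badges = [badge(d) for d in deltas]
print(badges)
```

In this made-up example the third chunk climbs from rank 3 to rank 1, which pushes the first two chunks down one place each.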
Bridge term analysis
Shows the shared vocabulary between the hypothetical answer and each corpus chunk — the bridging terms that explain why a chunk scored higher under HyDE. When no shared terms exist, it labels the match as a pure semantic match.
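A minimal sketch of how bridge terms could be found, assuming simple lowercase tokenization and a tiny hypothetical stopword list. The tool's real tokenizer and stopword handling are not documented here, so treat every name below as illustrative.

```python
import re

# Hypothetical, deliberately small stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "and", "are", "by", "in", "is", "of", "the", "to", "with"}


def terms(text):
    """Lowercase alphanumeric tokens with stopwords removed."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def bridge_terms(hypothetical_answer, chunk):
    """Vocabulary shared by the hypothetical answer and a corpus chunk."""
    return sorted(terms(hypothetical_answer) & terms(chunk))


def describe_match(hypothetical_answer, chunk):
    """Return the shared terms, or the fallback label when there are none."""
    shared = bridge_terms(hypothetical_answer, chunk)
    return ", ".join(shared) if shared else "pure semantic match"


hyp = "Connection pooling reuses open database connections to reduce latency."
chunk_a = "The pool keeps database connections open and hands them to requests."
chunk_b = "Billing runs nightly."

print(describe_match(hyp, chunk_a))
print(describe_match(hyp, chunk_b))
```

Exact-token overlap misses near-matches such as "pool" and "pooling"; stemming would catch those, at the cost of a dependency.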
Generation cost breakdown
After generating with a live API key, see the token count, per-query cost, and monthly cost projections at 10K and 100K requests/day for both Claude Haiku 4.5 and GPT-4o mini. HyDE adds one LLM call per query — this shows exactly what that costs at scale.
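The arithmetic behind a projection like this can be sketched as follows. The prices are placeholders, not real Anthropic or OpenAI rates, and the token counts and the 30-day month are also assumptions; substitute current provider pricing before relying on any number.

```python
def per_query_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost of one generation call; prices are in USD per million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def monthly_cost(cost_per_query, requests_per_day, days_per_month=30):
    """Project monthly spend, assuming one HyDE generation per request."""
    return cost_per_query * requests_per_day * days_per_month


# Placeholder prices (USD per million tokens). NOT real provider pricing.
PLACEHOLDER_INPUT_PRICE = 0.50
PLACEHOLDER_OUTPUT_PRICE = 2.00

# Hypothetical token counts for a short query and a paragraph-length answer.
cost = per_query_cost(
    input_tokens=60,
    output_tokens=150,
    input_price=PLACEHOLDER_INPUT_PRICE,
    output_price=PLACEHOLDER_OUTPUT_PRICE,
)

projections = {rpd: monthly_cost(cost, rpd) for rpd in (10_000, 100_000)}
for rpd, usd in projections.items():
    print(f"{rpd:>7,} requests/day: ${usd:,.2f} per month")
```

Output tokens usually dominate, because the hypothetical answer is much longer than the query that prompts it.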
Export pipeline code
Download ready-to-run Python code implementing the full HyDE retrieval pipeline, pre-filled with your query, corpus chunks, and system prompt. Three frameworks are available: raw Python with sentence-transformers, LangChain, and LlamaIndex's native HyDEQueryTransform. Pro feature.
HyDE pipeline
1. Query
Short user question
2. Generate hyp. doc
LLM answers hypothetically
3. Embed hyp. doc
all-MiniLM-L6-v2 / BGE-Small-en-v1.5
4. Vector search
cosine sim over corpus
5. Retrieved chunks
top-k results ranked against an answer-shaped embedding
Standard RAG skips steps 2–3 and embeds the query directly — putting the embedding in question space rather than answer space.
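The five steps can be sketched in a few lines of dependency-free Python. This is a toy under stated assumptions: `toy_embed` is a bag-of-words stand-in for all-MiniLM-L6-v2, and `canned_generate` is a hypothetical stub where a real pipeline would call Claude Haiku 4.5 or GPT-4o mini. It is not the code the tool exports.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Stand-in for a sentence-embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(text_to_embed, corpus, k=2):
    """Steps 3-5: embed the text, score every chunk, return the best k."""
    q = toy_embed(text_to_embed)
    scored = [(cosine(q, toy_embed(chunk)), chunk) for chunk in corpus]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


def standard_rag(query, corpus, k=2):
    """Embeds the query directly, skipping steps 2-3."""
    return top_k(query, corpus, k)


def hyde(query, corpus, generate, k=2):
    """Step 2 generates a hypothetical document; that text is embedded instead."""
    hypothetical_doc = generate(query)
    return top_k(hypothetical_doc, corpus, k)


def canned_generate(query):
    """Hypothetical stub for the LLM call; returns a fixed plausible answer."""
    return ("Rotate the API key from the dashboard settings page, "
            "then redeploy services that use the old key.")


corpus = [
    "To rotate a key, open dashboard settings and create a new API key, "
    "then redeploy dependent services.",
    "How do I export invoices? See the billing page.",
    "Rate limits reset every minute.",
]
query = "How do I change my credentials?"

print("direct:", standard_rag(query, corpus, k=1)[0][1])
print("HyDE:  ", hyde(query, corpus, canned_generate, k=1)[0][1])
```

Because the toy embedding is purely lexical, it exaggerates the effect: the direct query latches onto the other question-shaped chunk ("How do I..."), while the hypothetical answer shares vocabulary with the chunk that actually answers it. A real embedding model narrows that gap, which is why comparing scores on your own corpus matters.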
Who uses it
RAG engineers
Your retrieval pipeline is returning irrelevant chunks, but the answer IS in the corpus. HyDE Visualizer shows whether switching from direct query embedding to hypothetical document embedding would surface the right passages — before you rewrite your pipeline.
ML researchers
Run the same query and corpus through both embedding strategies and compare per-chunk scores. The bridge term analysis shows exactly which vocabulary in the hypothetical answer aligns with your document corpus.
Backend developers
Evaluating whether the latency and cost of one extra LLM call per query are worth it? The cost breakdown shows the per-query price and monthly projections at 10K and 100K requests/day for Claude Haiku 4.5 and GPT-4o mini.