Free tool · No sign-up required

Debug your RAG pipeline
chunk by chunk.

Paste any document and instantly see how tiktoken, sentence boundaries, and paragraph splits divide your text into chunks for Pinecone, Weaviate, and LlamaIndex pipelines.

Try RAG Chunk Inspector free →

Free to use

All processing in your browser — no data stored or transmitted

Designed for these vector stores, frameworks, tokenizers, and models

Pinecone · Weaviate · Qdrant · pgvector · LangChain · LlamaIndex · tiktoken · text-embedding-ada-002 · text-embedding-3-small · GPT-4o · Claude 3.5 Sonnet · Gemini 1.5 Pro

What you can do

Every tool you need to tune
chunk quality before indexing

4 chunking strategies

Switch between token-based (tiktoken / GPT-4o tokenizer), character-based, sentence-boundary, and paragraph-boundary strategies with a single click. Each updates the visualization instantly.

Live chunk boundary visualization

Every chunk is colour-coded directly in the source document. Overlap regions appear in amber. Hover a chunk card to locate it in the document, or click to open a full LLM context preview.

Token distribution histogram

See the spread of token counts across all chunks at a glance. An expandable guide explains what tight spikes, bell curves, and skewed distributions mean for retrieval quality in systems like Pinecone or Weaviate.
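As a rough illustration of what the histogram summarises, the sketch below buckets per-chunk token counts into fixed-width bins. The function names, the bin width of 50, and the sample counts are all assumptions made for this example, not the tool's actual implementation.

```python
from collections import Counter

def token_histogram(token_counts, bin_width=50):
    """Bucket per-chunk token counts into fixed-width bins keyed by bin start."""
    bins = Counter((n // bin_width) * bin_width for n in token_counts)
    return dict(sorted(bins.items()))

def render(hist, bin_width=50):
    """Return a plain-text bar chart, one row per non-empty bin."""
    return "\n".join(
        f"{start:>5}-{start + bin_width - 1:<5} {'#' * count}"
        for start, count in hist.items()
    )

# Hypothetical per-chunk token counts, for illustration only.
counts = [180, 195, 200, 210, 205, 60, 198, 420]
hist = token_histogram(counts)
print(render(hist))
```

In this made-up sample most chunks cluster around 150 to 249 tokens, with one very small chunk and one very large one. Outliers like these are the ones worth inspecting before indexing.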

LLM context preview

Click any chunk to see a realistic LLM prompt: system prompt + retrieved chunk + user message, with per-section token counts. Edit the system prompt and question to see how total token usage changes before you build.
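The per-section accounting can be approximated in a few lines. This is a minimal sketch under stated assumptions: `count_tokens` is a whitespace stand-in rather than a real tokenizer, the section names are hypothetical, and any per-message overhead a chat API adds is ignored.

```python
def count_tokens(text):
    # Stand-in counter: whitespace-separated words, NOT real tokenizer output.
    # With tiktoken installed, a closer count would be
    # len(tiktoken.get_encoding("o200k_base").encode(text)).
    return len(text.split())

def context_budget(system_prompt, chunk, question, count=count_tokens):
    """Return token counts per prompt section plus their total."""
    sections = {
        "system": system_prompt,
        "retrieved_chunk": chunk,
        "user": question,
    }
    usage = {name: count(body) for name, body in sections.items()}
    usage["total"] = sum(usage.values())
    return usage

usage = context_budget(
    "Answer using only the provided context.",
    "Chunk overlap repeats the tail of one chunk at the head of the next.",
    "What does overlap do?",
)
print(usage)
```

Passing a different `count` callable swaps in a real tokenizer without changing the rest of the sketch.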

Side-by-side strategy comparison

Switch to Compare mode to see all four chunking strategies side by side: chunk count, average token size, size range, and a mini histogram for each. Pick the right strategy before you write indexing code. Pro feature.

Export chunks as JSON

Download all chunks as a structured JSON array with id, text, token_count, char_count, start_char, end_char, and has_overlap, ready to pipe directly into your Pinecone, Weaviate, or Qdrant ingestion script. Pro feature.
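To make the export shape concrete, here is a sketch that builds records with the seven listed fields from character spans. The field names come from the description above. Everything else is an assumption for illustration: integer ids, an exclusive `end_char`, a whitespace word count standing in for `token_count`, and `has_overlap` meaning "starts before the previous chunk ended".

```python
import json

def count_tokens(text):
    # Stand-in counter (whitespace words), not a real tokenizer.
    return len(text.split())

def to_records(document, spans):
    """Build export-style records from (start_char, end_char) spans in document order."""
    records = []
    prev_end = 0
    for i, (start, end) in enumerate(spans):
        text = document[start:end]
        records.append({
            "id": i,
            "text": text,
            "token_count": count_tokens(text),
            "char_count": len(text),
            "start_char": start,
            "end_char": end,  # assumed exclusive, like a Python slice
            "has_overlap": i > 0 and start < prev_end,
        })
        prev_end = end
    return records

document = "alpha beta gamma delta epsilon"
records = to_records(document, [(0, 16), (11, 30)])
print(json.dumps(records, indent=2))
```

An ingestion script would typically loop over such records, embed each `text`, and store the remaining fields as metadata next to the vector.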

Chunking strategies explained

Four strategies. One document. Wildly different results.

TOKEN

›Uses GPT-4o tokenizer (tiktoken / o200k_base)
›Precise boundary at token N — never mid-token
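›Closest match to how models count input (OpenAI embedding models such as text-embedding-ada-002 tokenize with cl100k_base, so their counts can differ slightly)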
›Uniform chunk sizes → predictable retrieval
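A minimal sketch of token-window chunking with overlap, assuming the text has already been encoded into a list of token ids. The function name and parameters are hypothetical; the example uses dummy ids so it runs without any tokenizer installed.

```python
def chunk_tokens(tokens, size, overlap=0):
    """Slide a window of `size` tokens, repeating `overlap` tokens between chunks."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reached the end of the input
    return chunks

# Dummy token ids for illustration. With tiktoken installed, real ids
# would come from tiktoken.get_encoding("o200k_base").encode(text),
# and each chunk could be turned back into text with .decode(chunk).
tokens = list(range(10))
chunks = chunk_tokens(tokens, size=4, overlap=1)
print(chunks)
```

Because the window moves in whole tokens, every chunk except possibly the last has exactly `size` tokens, which is where the uniform size distribution comes from.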

SENTENCE

›Respects .?! punctuation boundaries
›Never cuts mid-sentence
›Ideal for prose, articles, documentation
›Bell-curve token distribution typical
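One way to approximate sentence-boundary chunking is to split after terminal punctuation and then pack whole sentences into a size budget. This sketch is an assumption about the general technique, not the tool's code: the regex is naive (abbreviations such as "e.g." would be split), and the budget is measured in whitespace words rather than tokens.

```python
import re

def split_sentences(text):
    """Split after '.', '?' or '!' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.?!])\s+", text.strip()) if s]

def pack_sentences(sentences, max_words):
    """Greedily group whole sentences so each chunk stays within max_words."""
    chunks, current, used = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        if current and used + n > max_words:
            chunks.append(" ".join(current))
            current, used = [], 0
        current.append(sentence)  # a single over-long sentence is kept whole
        used += n
    if current:
        chunks.append(" ".join(current))
    return chunks

text = "Chunks feed retrieval. Are they the right size? Check before indexing! Then embed."
chunks = pack_sentences(split_sentences(text), max_words=8)
print(chunks)
```

Since sentences are never cut, chunk sizes vary around the budget instead of hitting it exactly.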

PARAGRAPH

›Splits on double newlines (\n\n)
›Preserves thematic document units
›Great for structured docs (manuals, reports)
›Wide size range — monitor for oversized chunks
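The paragraph strategy and its oversized-chunk check can be sketched as below. The helper names and the `max_tokens` threshold are hypothetical, and the default counter is a whitespace stand-in rather than a real tokenizer.

```python
def split_paragraphs(text):
    """Split on double newlines and drop empty pieces."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]

def oversized(paragraphs, max_tokens, count=lambda t: len(t.split())):
    """Return (index, size) for every paragraph above max_tokens."""
    sizes = [(i, count(p)) for i, p in enumerate(paragraphs)]
    return [(i, n) for i, n in sizes if n > max_tokens]

text = "Intro line.\n\nA much longer paragraph with many more words in it.\n\nEnd."
paragraphs = split_paragraphs(text)
flagged = oversized(paragraphs, max_tokens=5)
print(flagged)
```

Flagged paragraphs are candidates for a secondary split, for example by sentence, before they are embedded.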

CHARACTER

›Splits on raw character count
›Simple — fast to understand and implement
›Variable token counts across chunks
›Baseline for comparing other strategies
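For comparison, a character-count splitter is only a few lines. This sketch (hypothetical names, whitespace words as a rough proxy for tokens) also shows why token counts vary: equal-length character windows can hold very different numbers of words and may cut a word in half.

```python
def chunk_chars(text, size, overlap=0):
    """Cut text into windows of `size` characters with optional overlap."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

text = "I am a cat. Supercalifragilistic"
chunks = chunk_chars(text, size=11)
word_counts = [len(c.split()) for c in chunks]
print(chunks, word_counts)
```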

Who uses it

Backend engineers building RAG APIs

Test your chunking configuration against real documents before writing an indexing pipeline. Catch edge cases — paragraphs that exceed ada-002's 8,191-token limit, or chunks so small they lose context — before they reach production.

AI engineers tuning LangChain and LlamaIndex pipelines

Understand exactly how LangChain's RecursiveCharacterTextSplitter or LlamaIndex's SentenceSplitter will behave on your specific documents. Then reproduce the winning configuration in code.

ML engineers evaluating embedding costs

The stats panel shows total tokens across all chunks and estimates text-embedding-ada-002 costs. Compare token overhead across strategies to find the most cost-efficient split before indexing a large corpus.
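The cost comparison reduces to simple arithmetic: total tokens embedded times the price per token. In the sketch below the price is a placeholder parameter (check your provider's current pricing), and the per-strategy chunk counts are invented to show how overlap inflates the total.

```python
def embedding_cost(token_counts, usd_per_million_tokens):
    """Return (total_tokens, estimated_usd) for a list of per-chunk token counts."""
    total = sum(token_counts)
    return total, total / 1_000_000 * usd_per_million_tokens

# Placeholder price, an assumption for illustration only.
PRICE = 0.10

# Hypothetical corpora: the same documents chunked without and with overlap.
strategies = {
    "token, no overlap": [500] * 2000,
    "token, 50-token overlap": [500] * 2222,
}
report = {name: embedding_cost(counts, PRICE) for name, counts in strategies.items()}
for name, (total, usd) in report.items():
    print(f"{name}: {total} tokens, about ${usd:.4f}")
```

Overlap means some tokens are embedded twice, so the second strategy pays for roughly 11% more tokens on the same source text.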

Try RAG Chunk Inspector free →

Paste a document. Tune chunk size and overlap. See exactly how tiktoken and your chosen strategy will split your text before you write a line of indexing code.

Open RAG Chunk Inspector · See all ContextIQ tools
