llmconsole

RAG Chunk Visualizer

Paste a document, see how it gets chunked under fixed, recursive, or markdown-aware strategies side-by-side.
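As a point of reference for the simplest of these strategies, here is a minimal sketch of fixed-size chunking with overlap. The function name `fixed_size_chunks` and its `size` and `overlap` parameters are illustrative assumptions, not this tool's actual implementation, and sizes are counted in characters rather than tokens.

```python
def fixed_size_chunks(text: str, size: int = 300, overlap: int = 30) -> list[str]:
    """Cut text into windows of `size` characters; each window repeats the
    last `overlap` characters of the previous one. Structure is ignored."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in fixed_size_chunks("abcdefghij", size=4, overlap=1):
        print(repr(chunk))
```

The overlap means a sentence cut at one window edge still appears whole, or nearly whole, in the neighbouring chunk, at the cost of some duplicated text.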

1,361 chars · 199 words
Strategy

Try paragraphs → sentences → chars. Generic, robust.
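The paragraphs, then sentences, then characters fallback can be sketched roughly as follows. This is an assumption about how such a splitter might work, not the tool's own code: `recursive_chunks`, `_split`, and the `max_chars` limit are hypothetical names, and the sentence detection is a simple regular expression.

```python
import re

# Separators from coarsest to finest: blank lines (paragraphs), then sentence ends.
PATTERNS = [r"\n\s*\n", r"(?<=[.!?])\s+"]


def recursive_chunks(text: str, max_chars: int = 350) -> list[str]:
    """Split on the coarsest boundary that keeps chunks within max_chars."""
    return _split(text.strip(), max_chars, PATTERNS)


def _split(text: str, max_chars: int, patterns: list[str]) -> list[str]:
    if len(text) <= max_chars:
        return [text] if text else []
    if not patterns:
        # Last resort: hard cuts every max_chars characters.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    pieces = [p.strip() for p in re.split(patterns[0], text) if p.strip()]
    chunks, current = [], ""
    for piece in pieces:
        if len(piece) > max_chars:
            # Too big for this level: emit what we have, retry with a finer separator.
            if current:
                chunks.append(current)
                current = ""
            chunks.extend(_split(piece, max_chars, patterns[1:]))
        elif current and len(current) + 1 + len(piece) > max_chars:
            chunks.append(current)
            current = piece
        else:
            # Merge small neighbours (joined with a single space) to fill the chunk.
            current = f"{current} {piece}" if current else piece
    if current:
        chunks.append(current)
    return chunks
```

Small pieces are merged greedily so that short paragraphs do not each become their own chunk; only a piece that exceeds the limit is pushed down to the next, finer separator.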

5 chunks · avg 0 tokens · total 0 tokens
# Why retrieval-augmented generation matters

Large language models are powerful but bounded. They can't know what happened after their training cut-off, they can't reference your private documents, and even when they can, they hallucinate confidently.

## The core idea

Retrieval-augmented generation (RAG) sidesteps these limits by retrieving relevant documents at query time and stuffing them into the model's context window. Instead of asking the model what it knows, you ask it to answer a question *given* the documents you've supplied.

## Chunking is the hidden lever

The retrieval quality of a RAG system is dominated by how you split your documents into chunks. Chunks that are too small lose context. Chunks that are too large dilute relevance and burn tokens. Boundaries that cross logical sections — mid-sentence, mid-paragraph, mid-section — produce noisy retrieval.

There are several common strategies:

1. **Fixed-size** with a small overlap. Simple, predictable, ignores structure.
2. **Recursive** which tries paragraph boundaries first, then sentences, then characters as a last resort.
3. **Markdown-aware** for documentation, which respects heading hierarchies.
4. **Semantic** which uses embeddings to group similar adjacent sentences.

Each has tradeoffs. The point of this tool is to make them visible side-by-side on a real document.
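To make the markdown-aware option in the sample document concrete, the sketch below splits a document at ATX headings (`#` through `######`) and records the heading path of each section. The names `markdown_sections` and `HEADING` are hypothetical, and the sketch deliberately ignores edge cases such as `#` lines inside fenced code blocks.

```python
import re

# ATX heading: one to six '#' characters, whitespace, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def markdown_sections(text: str) -> list[dict]:
    """Return one section per heading, tagged with its chain of parent headings."""
    sections = []
    path = []  # stack of (level, title) for the headings currently in scope
    body = []  # lines collected since the last heading

    def flush():
        content = "\n".join(body).strip()
        if content:
            sections.append({"path": [title for _, title in path], "text": content})
        body.clear()

    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            flush()  # close the previous section under its own heading path
            level = len(match.group(1))
            # Drop headings at the same or a deeper level before pushing this one.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, match.group(2).strip()))
        else:
            body.append(line)
    flush()
    return sections


if __name__ == "__main__":
    doc = "# A\nintro\n## B\nbody b\n## C\nbody c\n"
    for section in markdown_sections(doc):
        print(section["path"], "->", section["text"])
```

Keeping the heading path alongside each chunk lets a retriever show or embed the section context ("Why retrieval-augmented generation matters > The core idea") instead of an orphaned paragraph.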

Chunks (5)

| # | Chars | Tokens | Preview |
|---|-------|--------|---------|
| 1 | 268 | | # Why retrieval-augmented generation matters Large language models are powerful |
| 2 | 303 | | Retrieval-augmented generation (RAG) sidesteps these limits by retrieving releva |
| 3 | 341 | | The retrieval quality of a RAG system is dominated by how you split your documen |
| 4 | 338 | | 1. **Fixed-size** with a small overlap. Simple, predictable, ignores structure. |
| 5 | 99 | | Each has tradeoffs. The point of this tool is to make them visible side-by-side |