Splitter - markdown
Overview
The Markdown splitter is an implementation of the Document Transformer interface that splits documents by Markdown heading level. It follows the Eino: Document Transformer Guide.
How It Works
- Detect Markdown headings (#, ##, ###, etc.)
- Build a document structure tree by levels
- Split the document into fragments by headings
Usage
Initialization
Initialize via NewHeaderSplitter with configuration:
splitter, err := markdown.NewHeaderSplitter(ctx, &markdown.HeaderConfig{
	Headers: map[string]string{
		"#":   "h1", // H1
		"##":  "h2", // H2
		"###": "h3", // H3
	},
	TrimHeaders: false, // keep heading lines in output
})
Parameters:
- Headers: required, map heading tokens to metadata keys
- TrimHeaders: whether to remove heading lines in output
Complete Example
package main

import (
	"context"
	"fmt"

	"github.com/cloudwego/eino-ext/components/document/transformer/splitter/markdown"
	"github.com/cloudwego/eino/schema"
)

func main() {
	ctx := context.Background()

	// Initialize splitter
	splitter, err := markdown.NewHeaderSplitter(ctx, &markdown.HeaderConfig{
		Headers:     map[string]string{"#": "h1", "##": "h2", "###": "h3"},
		TrimHeaders: false,
	})
	if err != nil {
		panic(err)
	}

	// A Go raw string literal cannot contain backticks, so the code fence
	// is spliced in from a regular string.
	fence := "```"

	// Prepare documents to split
	docs := []*schema.Document{{
		ID: "doc1",
		Content: `# Document Title
Intro section.
## Chapter One
Chapter one content.
### Section 1.1
Section 1.1 content.
## Chapter Two
Chapter two content.
` + fence + `
# This is a comment in a code block and will not be detected as a heading
` + fence + `
`,
	}}

	// Execute splitting
	results, err := splitter.Transform(ctx, docs)
	if err != nil {
		panic(err)
	}

	// Process split results. fmt.Println is used instead of the builtin
	// println, which prints interface values such as MetaData entries as
	// raw addresses.
	for i, doc := range results {
		fmt.Println("fragment", i+1, ":", doc.Content)
		fmt.Println("heading levels:")
		for _, k := range []string{"h1", "h2", "h3"} {
			if v, ok := doc.MetaData[k]; ok {
				fmt.Println(" ", k, ":", v)
			}
		}
	}
}
Features
- Recognizes fenced code blocks (``` and ~~~), so lines inside them are not detected as headings
- Automatically maintains heading hierarchy
- A new heading at the same level resets the lower-level headings
- Heading level info is passed via metadata
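The hierarchy behavior described in the list above can be sketched with a plain map. This is a minimal illustration under assumptions, not the library's implementation: `applyHeading` is a hypothetical helper, and the `keys` map (level to metadata key) stands in for the Headers configuration.

```go
package main

import "fmt"

// applyHeading records a heading at the given level in meta and clears every
// deeper level, so stale sub-headings do not leak into later fragments.
// keys maps a heading level to its metadata key (hypothetical for this sketch).
func applyHeading(meta map[string]string, keys map[int]string, level int, title string) {
	for l, key := range keys {
		if l > level {
			delete(meta, key)
		}
	}
	if key, ok := keys[level]; ok {
		meta[key] = title
	}
}

func main() {
	keys := map[int]string{1: "h1", 2: "h2", 3: "h3"}
	meta := map[string]string{}

	applyHeading(meta, keys, 1, "Document Title")
	applyHeading(meta, keys, 2, "Chapter One")
	applyHeading(meta, keys, 3, "Section 1.1")
	fmt.Println(meta) // map[h1:Document Title h2:Chapter One h3:Section 1.1]

	// A new H2 replaces the previous H2 and drops the H3 beneath it.
	applyHeading(meta, keys, 2, "Chapter Two")
	fmt.Println(meta) // map[h1:Document Title h2:Chapter Two]
}
```

Each fragment would then carry a copy of `meta` as it stood when the fragment was emitted, which is how the heading path can travel with the content.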
Last modified December 16, 2025: fix: improve readability of websocket and swagger docs (#1480) (f63ff55)