utils
Utility commands for working with artifacts.
utils artifact-viewer
Generates a self-contained HTML file for exploring artifact JSON in a browser.
struktur utils artifact-viewer --input artifacts.json --output viewer.html
struktur parse --input doc.pdf --images | struktur utils artifact-viewer --stdin > viewer.html
Options
Prop
Type
What the viewer shows
Default View: Artifact-by-artifact view with cards showing type, page count, and image count
Batching Mode: Chunking visualization with configurable parameters and live token counts
Default View features:
- Each artifact as a card with header showing type, page count, and image count
- Text content with per-slice expand/collapse (truncated at 500 characters; click to show the full text)
- Image thumbnails with click-to-enlarge modal
- Screenshot images marked with an orange "screenshot" badge
- Image dimensions overlaid on each thumbnail
- Metadata section (collapsible)
Batching Mode features:
- Sidebar listing batches and chunks with token and image counts
- Main area shows each chunk with a dashed amber border at chunk boundaries
- Configurable chunking parameters: Max Tokens, Max Images, Text Ratio, Image Tokens
- Image type filter: show/hide embedded images and screenshots independently
- Token and image counts update live as parameters change
The viewer embeds a JavaScript implementation of Struktur's chunking algorithm so batching mode accurately reflects what parallel, sequential, and other chunked strategies will do with your documents.
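To give a feel for how the four batching parameters might interact, here is a minimal sketch of a greedy chunker. It is not Struktur's actual algorithm: the function names (`estimateTokens`, `chunkSlices`), the slice shape, and the meaning given to each parameter (Text Ratio as characters per token, Image Tokens as a flat cost per image) are all assumptions made for illustration.

```javascript
// Hypothetical sketch, NOT Struktur's real chunking code.
// Parameter names mirror the viewer's controls; their semantics are assumed.
const params = {
  maxTokens: 8000,   // assumed: token budget per chunk (illustrative value)
  maxImages: 10,     // assumed: maximum images per chunk
  textRatio: 4,      // assumed: characters per token
  imageTokens: 1000, // assumed: flat token cost per image
};

// Estimate the token cost of one slice: text tokens plus a fixed cost per image.
function estimateTokens(slice, p) {
  const textTokens = Math.ceil(slice.text.length / p.textRatio);
  return textTokens + slice.images * p.imageTokens;
}

// Greedily pack slices into chunks. A new chunk starts whenever adding the
// next slice would exceed the token budget or the image cap. A single slice
// that is too large on its own still gets a chunk to itself.
function chunkSlices(slices, p) {
  const chunks = [];
  let current = { slices: [], tokens: 0, images: 0 };
  for (const slice of slices) {
    const tokens = estimateTokens(slice, p);
    const overflow =
      current.tokens + tokens > p.maxTokens ||
      current.images + slice.images > p.maxImages;
    if (overflow && current.slices.length > 0) {
      chunks.push(current);
      current = { slices: [], tokens: 0, images: 0 };
    }
    current.slices.push(slice);
    current.tokens += tokens;
    current.images += slice.images;
  }
  if (current.slices.length > 0) chunks.push(current);
  return chunks;
}

// Made-up input: three slices with different text lengths and image counts.
const slices = [
  { text: "a".repeat(12000), images: 2 }, // 3000 text + 2000 image tokens
  { text: "b".repeat(8000), images: 1 },  // 2000 text + 1000 image tokens
  { text: "c".repeat(4000), images: 0 },  // 1000 text tokens
];
const chunks = chunkSlices(slices, params);
console.log(chunks.map((c) => ({ tokens: c.tokens, images: c.images })));
```

Under these assumptions, lowering Max Tokens or Max Images moves the chunk boundaries earlier, which is the kind of change batching mode lets you watch live.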
Workflow example
# Parse a PDF, inspect it in the browser before extracting
struktur parse --input contract.pdf --images --screenshots --output contract-artifacts.json
struktur utils artifact-viewer --input contract-artifacts.json --output viewer.html
open viewer.html
# Decide on chunking parameters, then extract
struktur extract --input contract.pdf --images --schema schema.json \
  --strategy parallelAutoMerge --chunk-size 8000 --model openai/gpt-4o
See also
- parse — Generate artifact JSON from files
- Artifact Format — Understanding artifacts
- Chunking & Token Budgets — How chunking works