Struktur

What is Struktur?

Struktur is an all-in-one tool for structured data extraction using an autonomous agent. It turns documents into validated, schema-typed JSON by having an LLM agent explore the content, decide what to read, and build the output incrementally.

Why Struktur?

Large document batches arrive with data locked in semi-structured text. Invoices need to flow into spreadsheets. Product datasheets need to become database rows. The tooling exists, but the orchestration overhead is disproportionate to the extraction task itself.

Managed APIs charge per page, impose schema constraints, and require document uploads to external infrastructure. LLM SDKs provide raw model access but leave you to write chunking, validation, retries, and merging every time.

Struktur fills the gap: a focused extraction engine with an autonomous agent that handles the orchestration so you can focus on the output.

Why an Agent?

Traditional extraction strategies (simple, parallel, sequential) require you to choose the right approach upfront. The agent decides:

  • When to read — entire document or specific sections
  • How to search — grep for patterns, list directories, execute bash commands
  • What to extract — build output incrementally as it explores
  • How to validate — check against schema and retry automatically

The agent adapts to your document. Small invoices get read in one shot. Large catalogs get navigated systematically. The result is better accuracy without configuration complexity.

Why not managed APIs?

| Limitation         | Impact                                  |
|--------------------|-----------------------------------------|
| Per-page pricing   | Does not scale for large batches        |
| Schema constraints | You work within their data model        |
| Document upload    | Non-starter for confidential workloads  |
| Black-box behavior | Debugging extraction failures is opaque |

Why not a plain LLM SDK call?

A single generateText() call gives you:

  • No chunking for large documents
  • No retries on schema validation failure
  • No merging of multi-chunk results
  • No typed output inferred from your schema

You write the same orchestration boilerplate every time. Struktur's agent packages that orchestration into a single, adaptive strategy.
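To make that boilerplate concrete, here is a minimal sketch of what a hand-rolled pipeline around a bare model call tends to look like: chunking, validation, a retry loop that feeds errors back, and merging. Every name in it (`chunkText`, `extractChunk`, `extractAll`, `ModelCall`, the `LineItem` shape) is hypothetical and chosen for illustration. `callModel` is a synchronous stand-in for a real, asynchronous LLM call, and none of this is Struktur's internal code.

```typescript
// Illustrative only: the orchestration a single SDK call leaves to you.
// `ModelCall` is a stand-in for a real (normally async) LLM request.
interface LineItem {
  sku: string;
  price: number;
}
type ModelCall = (prompt: string) => string;
type Validation = { ok: true; items: LineItem[] } | { ok: false; error: string };

// 1. Chunking: split the input into fixed-size pieces (by characters, for simplicity).
function chunkText(text: string, size: number): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += size) {
    chunks.push(text.slice(i, i + size));
  }
  return chunks;
}

// 2. Validation: the raw output must be a JSON array of { sku, price } objects.
function validate(raw: string): Validation {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { ok: false, error: "output is not valid JSON" };
  }
  if (!Array.isArray(parsed)) {
    return { ok: false, error: "expected a JSON array" };
  }
  const items: LineItem[] = [];
  for (const entry of parsed) {
    if (typeof entry?.sku !== "string" || typeof entry?.price !== "number") {
      return { ok: false, error: "each item needs sku:string and price:number" };
    }
    items.push({ sku: entry.sku, price: entry.price });
  }
  return { ok: true, items };
}

// 3. Retries: on validation failure, append the error to the prompt and try again.
function extractChunk(chunk: string, callModel: ModelCall, maxAttempts = 3): LineItem[] {
  let prompt = `Extract line items as a JSON array from:\n${chunk}`;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = validate(callModel(prompt));
    if (result.ok) return result.items;
    prompt += `\nPrevious output was rejected: ${result.error}. Return corrected JSON.`;
  }
  throw new Error(`Extraction failed after ${maxAttempts} attempts`);
}

// 4. Merging: combine per-chunk results, keeping one item per sku (last one wins).
function extractAll(text: string, callModel: ModelCall, chunkSize: number): LineItem[] {
  const merged = new Map<string, LineItem>();
  for (const chunk of chunkText(text, chunkSize)) {
    for (const item of extractChunk(chunk, callModel)) {
      merged.set(item.sku, item);
    }
  }
  return Array.from(merged.values());
}
```

Even this simplified version fixes the chunk size and the merge key in advance. Those are the decisions the agent strategy is meant to take over.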

Design philosophy

  • Agent-first, zero configuration. The agent strategy is the default. It explores documents autonomously, deciding when to read, search, or extract. No need to pick chunk sizes or parallelism upfront.
  • Autonomous exploration. The agent uses a virtual filesystem to read files, grep for patterns, find files, and execute commands. It builds output incrementally as it discovers data.
  • Shell-composable by default. Reads stdin, writes stdout, speaks JSON. Integrates with jq, find, curl, and any tool in your pipeline.
  • Validation in the loop. Errors go back to the model, not to you. The retry loop means most extractions converge within two attempts.
  • Schema-first. You define the shape; Struktur guarantees it.
  • Fields shorthand. Skip the JSON Schema boilerplate with --fields "title, price:number, status:enum{draft|live}".
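As a rough illustration of the fields shorthand, the sketch below expands a string such as `"title, price:number, status:enum{draft|live}"` into a JSON Schema object. It is not Struktur's parser. The function name `parseFieldsShorthand`, the set of accepted types (`string`, `number`, `boolean`, `enum{...}`), the default of `string` for bare names, and the choice to mark every field as required are all assumptions made for this example.

```typescript
// Hypothetical sketch of a fields-shorthand expander; Struktur's real
// parser may accept more types and produce a different schema.
type FieldSchema =
  | { type: "string" | "number" | "boolean" }
  | { type: "string"; enum: string[] };

interface ObjectSchema {
  type: "object";
  properties: Record<string, FieldSchema>;
  required: string[];
}

function parseFieldsShorthand(fields: string): ObjectSchema {
  const schema: ObjectSchema = { type: "object", properties: {}, required: [] };

  for (const raw of fields.split(",")) {
    const field = raw.trim();
    if (field === "") continue;

    // Split "name:type" on the first colon; a bare name is treated as a string.
    const colon = field.indexOf(":");
    const name = colon === -1 ? field : field.slice(0, colon).trim();
    const typeSpec = colon === -1 ? "string" : field.slice(colon + 1).trim();

    // "enum{a|b}" becomes a string restricted to the listed values.
    const enumMatch = /^enum\{(.+)\}$/.exec(typeSpec);
    const enumBody = enumMatch ? enumMatch[1] : undefined;

    if (enumBody !== undefined) {
      schema.properties[name] = {
        type: "string",
        enum: enumBody.split("|").map((value) => value.trim()),
      };
    } else if (typeSpec === "string" || typeSpec === "number" || typeSpec === "boolean") {
      schema.properties[name] = { type: typeSpec };
    } else {
      throw new Error(`Unsupported type "${typeSpec}" for field "${name}"`);
    }
    // Assumption: every listed field is required.
    schema.required.push(name);
  }
  return schema;
}

// The shorthand from the bullet above, expanded into a schema object.
const demoSchema = parseFieldsShorthand("title, price:number, status:enum{draft|live}");
```

The point of the shorthand is that the one-line string and the expanded schema describe the same shape, so short extractions do not need a separate schema file.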

Trade-offs

| Trade-off                           | Rationale                                                                          |
|-------------------------------------|------------------------------------------------------------------------------------|
| Requires tool-calling models        | The agent needs models that support function calling (Claude, GPT-4, etc.)         |
| Depends on Vercel AI SDK providers  | OpenAI, Anthropic, Google supported; self-hosted models need OpenAI-compatible API |
| Token costs vary by document        | The agent makes multiple tool calls; large documents cost more than small ones     |

A 10-second demo

struktur extract --input invoice.pdf \
  --fields "number, vendor, total:number"

Expected output:

{
  "number": "1042",
  "vendor": "Acme Corp",
  "total": 2400
}

The agent reads the PDF, decides how to extract the fields, and returns validated JSON.

What Struktur is NOT

  • It is not a general document conversion tool. It parses files for extraction purposes, not for format conversion. It does not produce formatted output from documents.
  • It is not a managed API. It runs locally and calls your provider directly.
  • It does not stream. Input in, JSON out.
  • It is not a general LLM orchestration framework.

For the full mental model, see The Extraction Pipeline.

Who is it for?

CLI Users

Data engineers, analysts, shell pipeline builders — use Struktur for one-off extractions, batch processing, and CI/CD automation without writing code.

SDK Users

TypeScript developers embedding extraction in applications — use Struktur for typed results, custom strategies, and fine-grained control over the extraction pipeline.

What is the Agent Strategy?

The agent strategy is the default and recommended way to use Struktur. It implements:

  • Virtual filesystem tools — read, grep, find, ls, bash
  • Output management — set_output_data, update_output_data, finish, fail
  • Autonomous exploration — the agent decides what to do based on your schema
  • Incremental extraction — builds output as it discovers data

How it works

  1. The agent receives your schema and access to a virtual filesystem containing the document
  2. It can read files, search for patterns, list directories, and execute commands
  3. As it finds data, it calls set_output_data or update_output_data to build the result
  4. When complete, it calls finish to return validated JSON
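The four steps above can be sketched as a small loop. The tool names `read`, `grep`, `set_output_data`, `update_output_data`, and `finish` come from this page. Everything else is an assumption for illustration: the argument shapes, the replace-versus-merge semantics of the two output tools, the `runAgent` function, and the in-memory `VirtualFiles` map. A scripted function stands in for the LLM's decisions, so this shows the control flow only, not Struktur's implementation.

```typescript
// Hypothetical sketch of the agent loop; tool argument shapes are assumed.
type ToolCall =
  | { tool: "read"; path: string }
  | { tool: "grep"; pattern: string; path: string }
  | { tool: "set_output_data"; data: Record<string, unknown> }
  | { tool: "update_output_data"; patch: Record<string, unknown> }
  | { tool: "finish" };

type VirtualFiles = Record<string, string>;
// Returns a list of validation error messages; empty means the output is valid.
type Validator = (data: Record<string, unknown>) => string[];
// Stand-in for the model: given the previous tool result, pick the next call.
type Agent = (lastResult: string) => ToolCall;

function runAgent(
  agent: Agent,
  files: VirtualFiles,
  validate: Validator,
  maxSteps = 20,
): Record<string, unknown> {
  let output: Record<string, unknown> = {};
  // Step 1: the agent starts with a listing of the virtual filesystem.
  let lastResult = `files: ${Object.keys(files).join(", ")}`;

  for (let step = 0; step < maxSteps; step++) {
    const call = agent(lastResult);
    switch (call.tool) {
      // Step 2: exploration tools return text for the agent to inspect.
      case "read":
        lastResult = files[call.path] ?? `error: no such file ${call.path}`;
        break;
      case "grep": {
        const re = new RegExp(call.pattern);
        const content = files[call.path] ?? "";
        lastResult = content.split("\n").filter((line) => re.test(line)).join("\n");
        break;
      }
      // Step 3: output tools build the result incrementally.
      case "set_output_data":
        output = { ...call.data }; // assumed: replaces the whole output
        lastResult = "ok";
        break;
      case "update_output_data":
        output = { ...output, ...call.patch }; // assumed: shallow merge
        lastResult = "ok";
        break;
      // Step 4: finish only succeeds once the output validates.
      case "finish": {
        const errors = validate(output);
        if (errors.length === 0) return output;
        // Validation errors go back to the agent, not to the caller.
        lastResult = `validation failed: ${errors.join("; ")}`;
        break;
      }
    }
  }
  throw new Error(`agent did not finish within ${maxSteps} steps`);
}
```

In this sketch a failed `finish` does not abort the run. The error text becomes the next observation, which is one way to realize the validation-in-the-loop behavior described earlier.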

When to use other strategies

The agent is the default and works best for most documents. However, other strategies are available for specific cases:

| Strategy            | When to use                                            |
|---------------------|--------------------------------------------------------|
| agent (default)     | Autonomous exploration; best for most documents        |
| simple              | Small input that fits in one context window            |
| parallel            | Large input where speed matters more than accuracy     |
| sequential          | Large input where order matters                        |
| parallelAutoMerge   | Large arrays with parallel processing + deduplication  |
| sequentialAutoMerge | Large arrays with sequential processing + deduplication |
| doublePass          | Maximum quality with two-pass refinement               |
| doublePassAutoMerge | Maximum quality with arrays + deduplication            |

See Extraction Strategies for details on all strategies.

Quick navigation

| Goal                               | Section          |
|------------------------------------|------------------|
| New here?                          | Quickstart       |
| Need to accomplish something?      | Examples         |
| Looking up a flag or type?         | CLI Reference    |
| Quick schema without writing JSON? | Fields Shorthand |
| Want to understand how it works?   | Concepts         |
| Parse files into artifacts?        | Document Parsing |
