Watch a Folder for New Files

Process files as they arrive in a folder.

Linux: inotifywait

For PDFs and other supported formats, use --input directly — no pre-processing required:

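mkdir -p processed

# close_write fires once a writer has closed the file; create fires as soon as
# the file exists, possibly before its contents have been written
inotifywait -m ./incoming -e close_write -e moved_to |
  while read -r path action file; do
    echo "New file: $file"
    struktur --input "$path/$file" \
      --schema schema.json \
      --model openai/gpt-4o-mini \
      --output "processed/$file.json"
    mv "$path/$file" ./processed/
  done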

For formats without a built-in parser, pipe through a conversion tool first:

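mkdir -p processed

# close_write fires once a writer has closed the file; create fires as soon as
# the file exists, possibly before its contents have been written
inotifywait -m ./incoming -e close_write -e moved_to |
  while read -r path action file; do
    echo "New file: $file"
    markitdown "$path/$file" | struktur --stdin \
      --schema schema.json \
      --model openai/gpt-4o-mini \
      --output "processed/$file.json"
    mv "$path/$file" ./processed/
  done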

macOS: fswatch

mkdir -p processed

# fswatch -o prints one line per batch of events; rescan the folder on each batch
fswatch -o ./incoming | while read -r _; do
  for file in ./incoming/*; do
    [ -f "$file" ] || continue
    echo "Processing: $file"
    struktur --input "$file" \
      --schema schema.json \
      --model openai/gpt-4o-mini \
      --output "processed/$(basename "$file").json"
    mv "$file" ./processed/
  done
done

Output to JSONL

For streaming ingestion, append to a JSONL file (one JSON object per line):

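mkdir -p processed

# close_write fires once a writer has closed the file; create fires as soon as
# the file exists, possibly before its contents have been written
inotifywait -m ./incoming -e close_write -e moved_to |
  while read -r path action file; do
    struktur --input "$path/$file" \
      --schema schema.json \
      --model openai/gpt-4o-mini \
      >> processed.jsonl
    mv "$path/$file" ./processed/
  done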
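
When the records are written from the SDK rather than from the shell, the same one-object-per-line layout can be sketched as below. `toJsonlLine` is a hypothetical helper, not part of `@struktur/sdk`; it relies only on the fact that `JSON.stringify` without an indent argument emits no raw newlines.

```typescript
// Hypothetical helper (not part of @struktur/sdk): serialize one record
// as a single JSONL line, terminated by a newline.
function toJsonlLine(record: unknown): string {
  // Without an indent argument, JSON.stringify produces compact output;
  // newlines inside string values are escaped as \n, so the result is one line.
  const json = JSON.stringify(record);
  // JSON.stringify returns undefined for values such as `undefined` or functions.
  if (json === undefined) {
    throw new TypeError("record is not JSON-serializable");
  }
  return json + "\n";
}
```

A line produced this way could then be appended with `fs.appendFile("processed.jsonl", toJsonlLine(result.data))` from `node:fs/promises`, assuming `result.data` is the extracted object as in the SDK examples on this page.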

SDK: fs.watch

import { watch } from "node:fs";
import { extract, simple, parse } from "@struktur/sdk";
import { openai } from "@ai-sdk/openai";
import fs from "node:fs/promises";
import path from "node:path";

const schema = /* your schema */;
const incomingDir = "./incoming";
const processedDir = "./processed";

await fs.mkdir(processedDir, { recursive: true });

const watcher = watch(incomingDir, async (event, filename) => {
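  // "rename" is emitted when a name appears in or disappears from the directory;
  // filename can be missing on some platforms
  if (!filename || event !== "rename") return;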

  const filePath = path.join(incomingDir, filename);
  try {
    await fs.access(filePath);
  } catch {
    return;  // File was deleted, not created
  }

  console.log(`Processing: ${filename}`);

  try {
    // parse handles MIME detection and parsing (PDF, text, images, etc.)
    const artifacts = await parse({ kind: "file", path: filePath });

    const result = await extract({
      artifacts,
      schema,
      strategy: simple({ model: openai("gpt-4o-mini") }),
    });

    const outputPath = path.join(processedDir, `${filename}.json`);
    await fs.writeFile(outputPath, JSON.stringify(result.data, null, 2));

    await fs.unlink(filePath);
    console.log(`  ✓ Processed: ${filename}`);
  } catch (error) {
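    // error is `unknown` under strict TypeScript, so narrow before reading .message
    const message = error instanceof Error ? error.message : String(error);
    console.error(`  ✗ Failed: ${message}`);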
  }
});

console.log(`Watching ${incomingDir}...`);
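
`fs.watch` can report the same file more than once, in which case the callback above would start a second extraction while the first is still running. One way to guard against that is a small in-flight set. This is a sketch only: `createInFlightGuard`, `tryAcquire`, and `release` are hypothetical names, not part of Node.js or `@struktur/sdk`.

```typescript
// Hypothetical helper: tracks which filenames are currently being processed.
function createInFlightGuard() {
  const inFlight = new Set<string>();
  return {
    // Returns true if the caller may process `name`,
    // false if it is already being processed.
    tryAcquire(name: string): boolean {
      if (inFlight.has(name)) return false;
      inFlight.add(name);
      return true;
    },
    // Marks `name` as finished so a later event for it is accepted again.
    release(name: string): void {
      inFlight.delete(name);
    },
  };
}

const guard = createInFlightGuard();

// Inside the watch callback (sketch):
//   if (!guard.tryAcquire(filename)) return;
//   try { /* parse, extract, write, unlink */ } finally { guard.release(filename); }
```

Releasing in a `finally` block keeps a failed extraction from blocking the same filename if it is dropped into the folder again.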

SDK: chokidar

For more robust file watching, use chokidar. It can skip dotfiles and wait until a file has finished writing before reporting it:

import chokidar from "chokidar";
import { extract, simple, parse } from "@struktur/sdk";
import { openai } from "@ai-sdk/openai";
import fs from "node:fs/promises";
import path from "node:path";

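const schema = /* your schema */;

// Create the output directory up front; fs.writeFile does not create missing directories
await fs.mkdir("./processed", { recursive: true });

const watcher = chokidar.watch("./incoming", {
  ignored: /(^|[\/\\])\../, // skip dotfiles
  persistent: true,
  // Emit "add" only after the file size has been stable for 2 s (checked every 100 ms)
  awaitWriteFinish: {
    stabilityThreshold: 2000,
    pollInterval: 100
  },
});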

watcher.on("add", async (filePath) => {
  console.log(`Processing: ${path.basename(filePath)}`);

  try {
    const artifacts = await parse({ kind: "file", path: filePath });

    const result = await extract({
      artifacts,
      schema,
      strategy: simple({ model: openai("gpt-4o-mini") }),
    });

    const outputPath = `./processed/${path.basename(filePath)}.json`;
    await fs.writeFile(outputPath, JSON.stringify(result.data, null, 2));
    await fs.unlink(filePath);

    console.log(`  ✓ Processed`);
  } catch (error) {
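    // error is `unknown` under strict TypeScript, so narrow before reading .message
    const message = error instanceof Error ? error.message : String(error);
    console.error(`  ✗ Failed: ${message}`);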
  }
});

console.log("Watching ./incoming...");
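
If many files land at once, the `add` handler above starts one extraction per file in parallel, which may run into provider rate limits. A minimal sketch of processing them one at a time is shown below. `createSerialQueue` and `enqueue` are hypothetical names, not part of chokidar or `@struktur/sdk`.

```typescript
// Hypothetical helper: runs async tasks strictly one after another.
type Task = () => Promise<void>;

function createSerialQueue() {
  // Tail of the chain; each new task starts only after the previous one settles.
  let tail: Promise<void> = Promise.resolve();

  return {
    enqueue(task: Task): Promise<void> {
      const run = tail.then(task);
      // Swallow the rejection on the chain itself so one failed task
      // does not block the tasks queued behind it.
      tail = run.catch(() => {});
      // The caller still sees the original outcome, including a rejection.
      return run;
    },
  };
}

// Inside the chokidar handler (sketch, with processFile standing in for the
// parse/extract/write steps):
//   watcher.on("add", (filePath) => {
//     queue.enqueue(() => processFile(filePath)).catch(console.error);
//   });
```

A queue of length one is the simplest choice; allowing a small fixed number of concurrent tasks would trade some safety against rate limits for throughput.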
