OST Front-matter Reference Guide

This reference guide covers the complete YAML front-matter structure for OST files, which are self-executing ostruct prompts.

What are OST Files?

OST files are self-executing ostruct prompts that package everything needed for AI-powered data processing into a single, executable file.

Regular ostruct workflow:

  • Execute with: ostruct run template.j2 schema.json --var name=value --file data.csv

  • Requires separate template file, schema file, and command-line configuration

  • User must know the specific variables, file types, and options needed

OST file workflow:

  • Execute with: ./my-tool.ost "input text" --format json or ostruct runx my-tool.ost --help

  • Everything bundled: template + schema + CLI configuration in one file

  • Self-documenting with built-in help (--help flag automatically generated)

  • Command-line arguments are predefined and validated

Key Differences:

| Aspect          | Regular ostruct templates           | OST files (Self-executing prompts)                  |
|-----------------|-------------------------------------|-----------------------------------------------------|
| Execution       | ostruct run template.j2 schema.json | ./tool.ost or ostruct runx tool.ost                 |
| Schema          | Separate .json file required        | Embedded in YAML front-matter                       |
| Configuration   | Command-line flags and variables    | Pre-configured CLI with custom arguments            |
| Help            | Generic ostruct run --help          | Custom ./tool.ost --help with tool-specific options |
| Portability     | Requires multiple files + knowledge | Single file, self-contained                         |
| User Experience | Power users, developers             | End users, simplified workflows                     |

Note

This guide is designed for non-Python users who need to understand the OST format. No programming knowledge is required.

OST File Structure

OST files have three main sections:

  1. Shebang Line - Makes the file executable on Unix systems

  2. YAML Front-matter - Contains CLI metadata and schema (between --- markers)

  3. Template Content - Jinja2 template that generates the AI prompt

Basic Structure

#!/usr/bin/env -S ostruct runx
---
cli:
  name: my-tool
  description: Description of what this tool does
  # ... CLI configuration

schema: |
  {
    "type": "object",
    "properties": {
      "result": {"type": "string"}
    }
  }

defaults:
  # ... default values

global_args:
  # ... global argument policies
---
# Your Jinja2 template content goes here
Process this input: {{ input_text }}

Required Sections

cli

The cli section defines the command-line interface for your template.

Required Fields:

cli:
  name: my-tool-name
  description: Brief description of what the tool does

Optional Fields:

cli:
  name: my-tool-name
  description: Brief description of what the tool does
  positional:
    - name: input_text
      help: The text to process
      default: "Hello World"
  options:
    format:
      names: ["--format", "-f"]
      help: Output format
      default: "json"
      choices: ["json", "yaml", "text"]

schema

The schema section contains the JSON schema that defines the structure of the output.

schema: |
  {
    "type": "object",
    "properties": {
      "result": {
        "type": "string",
        "description": "The processed result"
      },
      "format": {
        "type": "string",
        "description": "The output format used"
      }
    },
    "required": ["result", "format"]
  }

Tip

Use the Schema Generator tool to create schemas automatically:

tools/schema-generator/run.sh -o my_schema.json my_template.j2

CLI Configuration

Positional Arguments

Define required or optional positional arguments:

cli:
  positional:
    - name: input_text
      help: The text to analyze
      # Optional: default value
      default: "Sample text"
    - name: output_file
      help: Where to save results
      # No default = required argument

Options (Flags)

Define command-line options with various behaviors:

Basic String Option

cli:
  options:
    format:
      names: ["--format", "-f"]
      help: Output format
      default: "json"
      choices: ["json", "yaml", "text"]

Boolean Flag

Method 1: Using action (recommended)

cli:
  options:
    verbose:
      names: ["--verbose", "-v"]
      help: Enable verbose output
      action: "store_true"  # Creates a boolean flag

Method 2: Using type

cli:
  options:
    debug:
      names: ["--debug"]
      help: Enable debug mode
      type: "bool"
      default: false

Repeatable Option

cli:
  options:
    tags:
      names: ["--tag", "-t"]
      help: Add a tag (can be used multiple times)
      action: "append"  # Allows multiple values

File Input Option

cli:
  options:
    config_file:
      names: ["--config"]
      help: Configuration file
      type: "file"
      target: "prompt"  # Template access only

    data_file:
      names: ["--data"]
      help: Data file for analysis
      type: "file"
      target: "ci"  # Code Interpreter

    docs_file:
      names: ["--docs"]
      help: Documentation file
      type: "file"
      target: "fs"  # File Search

Directory Input Option

cli:
  options:
    source_dir:
      names: ["--source"]
      help: Source directory
      type: "directory"
      target: "prompt"

Collection Input Option

cli:
  options:
    source_files:
      names: ["--files"]
      help: Collection of files matching a pattern
      type: "collection"
      target: "prompt"

    test_files:
      names: ["--tests"]
      help: Test files for analysis
      type: "collection"
      target: "ci"  # Code Interpreter

What it does: Collection type accepts glob patterns and collects multiple files that match the pattern. Unlike directory type which includes all files in a directory, collection type gives you precise control over which files are included.

Usage Examples:

# Collect all Python files
my-tool --files "**/*.py"

# Collect specific test files
my-tool --tests "test_*.py"

# Multiple patterns (if action: append is used)
my-tool --files "*.py" --files "*.js"
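
The last example above passes --files more than once, which depends on the option being declared as repeatable. A minimal sketch of such a declaration, assuming type: "collection" and action: "append" can be combined as that example implies (this guide does not show the combination explicitly):

```yaml
cli:
  options:
    source_files:
      names: ["--files"]
      help: Files matching a glob pattern (repeatable)
      type: "collection"
      action: "append"   # Assumption: lets --files be given several times
      target: "prompt"
```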

Action Parameters

The action parameter controls how command-line arguments are processed:

store (default)

Stores a single value:

format:
  names: ["--format"]
  action: "store"  # Default - can be omitted
  help: Output format

store_true

Creates a boolean flag that defaults to False:

verbose:
  names: ["--verbose", "-v"]
  action: "store_true"
  help: Enable verbose output

Usage: ./my_tool.ost --verbose sets verbose = True

store_false

Creates a boolean flag that defaults to True:

no_color:
  names: ["--no-color"]
  action: "store_false"
  help: Disable colored output

Usage: ./my_tool.ost --no-color sets no_color = False

append

Allows multiple values for the same option:

tags:
  names: ["--tag", "-t"]
  action: "append"
  help: Add a tag (repeatable)

Usage: ./my_tool.ost --tag work --tag urgent creates tags = ["work", "urgent"]

count

Counts how many times an option is used:

verbosity:
  names: ["--verbose", "-v"]
  action: "count"
  help: Increase verbosity level

Usage: ./my_tool.ost -vvv sets verbosity = 3

CLI-Level Global Arguments

The global_args section can be placed either at the top level or within the cli section. When placed within the CLI section, it provides tool-specific global argument policies:

cli:
  name: my-tool
  description: My custom tool
  global_args:
    --model:
      mode: "fixed"
      value: "gpt-4o"
    --temperature:
      mode: "allowed"
      allowed: [0.1, 0.5, 1.0]
    pass_through_global: false

CLI-Level vs Top-Level global_args:

  • CLI-level: Tool-specific policies that override any top-level settings

  • Top-level: Default policies for the entire OST file

  • Precedence: CLI-level settings take priority over top-level settings
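
A sketch of how the two levels might look side by side in one file. The flag values are only illustrative; the point is that, under the precedence rule stated above, the CLI-level entry for --model would be the one that takes effect:

```yaml
# Top-level policy: the default for the whole OST file
global_args:
  --model:
    mode: "allowed"
    allowed: ["gpt-4o", "gpt-4o-mini"]

cli:
  name: my-tool
  description: My custom tool
  # CLI-level policy: takes priority over the top-level entry above
  global_args:
    --model:
      mode: "fixed"
      value: "gpt-4o"
```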

When to Use CLI-Level:

  • When you need different policies per tool in complex OST files

  • When the tool requires specific model restrictions

  • When you want to prevent certain global flags from being used

Example: Model Restriction

cli:
  name: secure-analyzer
  description: Security analysis tool
  global_args:
    --model:
      mode: "fixed"
      value: "gpt-4o"  # Force secure model
    pass_through_global: false  # Block unknown flags
  options:
    target:
      names: ["--target"]
      help: File to analyze

This ensures the security tool always uses a specific model and prevents users from changing critical settings.

File Routing Targets

The target parameter controls where files are sent:

prompt (default)

Files are available in the template but not uploaded to external services:

config_file:
  names: ["--config"]
  type: "file"
  target: "prompt"  # Template access only

Template usage: {{ config_file.content }}

ci (Code Interpreter)

Files are uploaded to OpenAI’s Code Interpreter for analysis:

data_file:
  names: ["--data"]
  type: "file"
  target: "ci"  # Code Interpreter analysis

The AI can execute Python code to analyze the file.

ud (User Data)

Files are sent to vision models for analysis:

pdf_file:
  names: ["--pdf"]
  type: "file"
  target: "ud"  # User-data for vision models

Currently supports PDF files for vision analysis.

auto

Automatically routes files based on type detection:

auto_file:
  names: ["--auto"]
  type: "file"
  target: "auto"  # Auto-route by file type

Text files go to prompt, binary files to ud.

Validation and Choices

Restrict Input Values

Use choices to limit allowed values:

format:
  names: ["--format", "-f"]
  choices: ["json", "yaml", "text"]
  default: "json"
  help: Output format

Type Validation

Specify expected data types:

count:
  names: ["--count", "-c"]
  type: "int"
  default: 10
  help: Number of items to process

threshold:
  names: ["--threshold"]
  type: "float"
  default: 0.5
  help: Threshold value (0.0-1.0)

Required Arguments

Use required to make arguments mandatory:

input_file:
  names: ["--input", "-i"]
  help: Input file to process
  type: "file"
  required: true  # User must provide this argument
  target: "prompt"

api_key:
  names: ["--api-key"]
  help: API key for authentication
  required: true  # No default value, must be provided

Note: Positional arguments with no default value are automatically required. Options are optional unless you mark them with required: true.
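
A short sketch of how these rules can combine in one cli section. The argument names (source_text, report, limit) are hypothetical and chosen only for illustration:

```yaml
cli:
  positional:
    - name: source_text
      help: Text to process
      # No default, so the user must supply it

  options:
    report:
      names: ["--report"]
      help: Report file to read
      type: "file"
      target: "prompt"
      required: true   # Must be provided on every run

    limit:
      names: ["--limit"]
      help: Maximum number of items
      type: "int"
      default: 10      # Has a default, so it can be omitted
```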

Default Values

The defaults section provides default values for template variables:

defaults:
  format: "json"
  verbose: false
  max_items: 100
  tags: []  # Empty list for append actions

These defaults are used when users don’t provide values.

Global Arguments Policy

The global_args section controls how users can interact with ostruct’s global flags.

Flag Naming Convention

Important: Global argument keys must use the exact flag format with dashes:

global_args:
  --model:        # Correct: with dashes
    mode: "allowed"
    allowed: ["gpt-4o", "gpt-4o-mini"]

  model:          # INCORRECT: will cause validation errors
    mode: "allowed"

Rule: Use the complete flag name including dashes (e.g., --model, --temperature, --enable-tool) exactly as you would type it on the command line.

Policy Configuration

global_args:
  pass_through_global: true  # Allow unknown flags

  --model:
    mode: "allowed"
    allowed: ["gpt-4o", "gpt-4.1", "o1"]
    default: "gpt-4.1"

  --temperature:
    mode: "fixed"
    value: "0.7"

  --enable-tool:
    mode: "blocked"

  --verbose:
    mode: "pass-through"

Policy Modes

allowed

Restricts users to specific values:

--model:
  mode: "allowed"
  allowed: ["gpt-4o", "gpt-4.1"]
  default: "gpt-4.1"

fixed

Locks a flag to a specific value:

--temperature:
  mode: "fixed"
  value: "0.7"

Users cannot override this value.

blocked

Completely prevents users from using a flag:

--enable-tool:
  mode: "blocked"

Any attempt to use this flag will result in an error.

pass-through

Allows any value (default behavior):

--verbose:
  mode: "pass-through"

Global Flags

The global_flags section provides a list of default global flags that are always passed to ostruct, unless overridden by user input:

global_flags:
  - "--model"
  - "gpt-4o-mini"
  - "--temperature"
  - "0.7"
  - "--progress"
  - "none"

Format: A list of strings alternating between flags and their values.

Usage Notes:

  • Flags are always passed to the underlying ostruct run command

  • User-provided flags with allowed or pass-through policies will override these defaults

  • Flags with fixed policies ignore both user input and global_flags defaults

  • Use this for setting consistent tool defaults across template invocations

Example with Policy Interaction:

global_flags:
  - "--model"
  - "gpt-4o-mini"
  - "--temperature"
  - "0.5"

global_args:
  --model:
    mode: "allowed"
    allowed: ["gpt-4o", "gpt-4o-mini"]
    # User can override the global_flags default
  --temperature:
    mode: "fixed"
    value: "0.7"
    # Fixed value ignores global_flags default

Complete Example

Here’s a complete OST template that demonstrates most of the features described above:

#!/usr/bin/env -S ostruct runx
---
cli:
  name: text-analyzer
  description: Analyzes text content and extracts insights

  positional:
    - name: input_text
      help: Text to analyze
      default: "Sample text for analysis"

  options:
    format:
      names: ["--format", "-f"]
      help: Output format
      choices: ["json", "yaml", "text"]
      default: "json"

    verbose:
      names: ["--verbose", "-v"]
      help: Enable verbose output
      action: "store_true"

    max_length:
      names: ["--max-length"]
      help: Maximum text length to process
      type: "int"
      default: 1000

    tags:
      names: ["--tag", "-t"]
      help: Add analysis tags (repeatable)
      action: "append"

    config_file:
      names: ["--config"]
      help: Configuration file
      type: "file"
      target: "prompt"

    data_file:
      names: ["--data"]
      help: Data file for Code Interpreter analysis
      type: "file"
      target: "ci"

schema: |
  {
    "type": "object",
    "properties": {
      "analysis": {
        "type": "object",
        "properties": {
          "sentiment": {"type": "string"},
          "key_themes": {
            "type": "array",
            "items": {"type": "string"}
          },
          "word_count": {"type": "integer"},
          "tags": {
            "type": "array",
            "items": {"type": "string"}
          }
        },
        "required": ["sentiment", "key_themes", "word_count"]
      },
      "format": {"type": "string"},
      "verbose": {"type": "boolean"}
    },
    "required": ["analysis", "format", "verbose"]
  }

defaults:
  format: "json"
  verbose: false
  max_length: 1000
  tags: []

global_args:
  pass_through_global: true

  --model:
    mode: "allowed"
    allowed: ["gpt-4o", "gpt-4.1", "o1"]
    default: "gpt-4.1"

  --temperature:
    mode: "fixed"
    value: "0.7"

  --enable-tool:
    mode: "blocked"
---
# Text Analysis Template

Analyze the following text and provide insights:

**Input Text:** {{ input_text }}
**Format:** {{ format }}
**Verbose Mode:** {{ verbose }}
**Max Length:** {{ max_length }}

{% if tags %}
**Analysis Tags:** {{ tags | join(", ") }}
{% endif %}

{% if config_file is defined %}
**Configuration:**
{{ config_file.content }}
{% endif %}

{% if data_file is defined %}
**Data File Available:** {{ data_file.name }}
{% endif %}

{% if verbose %}
Please provide detailed analysis including:
- Sentiment analysis with confidence scores
- Key themes with supporting evidence
- Word count and readability metrics
- Detailed explanations for each finding
{% else %}
Please provide concise analysis including:
- Overall sentiment
- Main themes
- Word count
{% endif %}

Return the analysis in the specified format ({{ format }}).

Usage Examples

Once you’ve created an OST template, you can use it like a native CLI tool:

Basic Usage

# Simple execution
./text-analyzer.ost "This is amazing news!"

# With options
./text-analyzer.ost "Analyze this text" --format yaml --verbose

# With tags
./text-analyzer.ost "Sample text" --tag urgent --tag review

# With files
./text-analyzer.ost "Process this" --config settings.yaml --data report.csv

Help and Debugging

# Get help (automatically generated)
./text-analyzer.ost --help

# Dry run to test without API calls
ostruct runx text-analyzer.ost "test input" --dry-run

# Debug template rendering
ostruct runx text-analyzer.ost "test input" --template-debug vars

Cross-Platform Usage

# Unix/Linux/macOS: Direct execution
./text-analyzer.ost "input text"

# Windows: Via ostruct command
ostruct runx text-analyzer.ost "input text"

# All platforms: Via ostruct command
ostruct runx text-analyzer.ost "input text"

Best Practices

  1. Use Descriptive Names

    # Good
    input_file:
      names: ["--input-file"]
      help: Input file to process
    
    # Avoid
    file:
      names: ["--file"]
      help: File
    
  2. Provide Clear Help Text

    format:
      names: ["--format", "-f"]
      help: Output format (json, yaml, or text)
      choices: ["json", "yaml", "text"]
    
  3. Set Sensible Defaults

    defaults:
      format: "json"
      verbose: false
      max_items: 100
    
  4. Use Appropriate File Targets

    # Configuration files → prompt
    config:
      target: "prompt"
    
    # Data for analysis → ci
    dataset:
      target: "ci"
    
    # Documents for search → fs
    documentation:
      target: "fs"
    
  5. Test with Dry Run

    Always test your templates before live execution:

    ostruct runx my-tool.ost "test input" --dry-run
    
  6. Handle Optional Variables

    {% if config_file is defined %}
    Configuration: {{ config_file.content }}
    {% endif %}
    

Common Patterns

Configuration File Pattern

cli:
  options:
    config:
      names: ["--config", "-c"]
      help: Configuration file
      type: "file"
      target: "prompt"
      default: "config.yaml"

Template usage:

{% if config is defined %}
Configuration settings:
{{ config.content }}
{% endif %}

Data Analysis Pattern

cli:
  options:
    data:
      names: ["--data", "-d"]
      help: Data file for analysis
      type: "file"
      target: "ci"

    output_dir:
      names: ["--output-dir", "-o"]
      help: Output directory for results
      default: "./results"

Multi-Tool Pattern

cli:
  options:
    analysis_data:
      names: ["--data"]
      type: "file"
      target: "ci"  # Code Interpreter

    documentation:
      names: ["--docs"]
      type: "file"
      target: "fs"  # File Search

    config:
      names: ["--config"]
      type: "file"
      target: "prompt"  # Template only

Troubleshooting

Common Issues

Template variables not found:

# Wrong
{{ my_file }}

# Correct
{{ my_file.content }}

Boolean flags not working:

# Wrong
verbose:
  names: ["--verbose"]
  type: "boolean"

# Correct
verbose:
  names: ["--verbose"]
  action: "store_true"

File not accessible:

Check the target specification:

# For template access
config:
  target: "prompt"

# For Code Interpreter
data:
  target: "ci"

Schema validation errors:

Use the Schema Generator tool:

tools/schema-generator/run.sh -o schema.json template.ost

Debug Commands

# Show available variables
ostruct runx my-tool.ost --template-debug vars

# Show template expansion
ostruct runx my-tool.ost --template-debug post-expand

# Dry run with debug
ostruct runx my-tool.ost "test" --dry-run --verbose

Validation Rules Reference

OST front-matter is validated according to these rules:

Top-Level Fields

Allowed fields: cli, schema, defaults, global_args, global_flags

Any other top-level fields will result in a validation error.

Required Sections

cli section:
  • Must be a YAML object

  • Must contain name (non-empty string)

  • Must contain description (non-empty string)

schema section:
  • Must be a non-empty string containing valid JSON schema

CLI Section Validation

positional (optional):
  • Must be a list of objects

  • Each positional argument must have a name field (non-empty string)

options (optional):
  • Must be a YAML object or list

  • No specific field validation (handled by Click at runtime)

Global Arguments Validation

global_args location: Can be at top-level OR inside CLI section

pass_through_global field:
  • Must be a boolean value when present

  • Controls whether unknown global flags are allowed

Policy objects: All other fields in global_args must be objects with:
  • Required mode field with value: "fixed", "pass-through", "allowed", or "blocked"

  • Additional fields depend on mode (e.g., value for fixed, allowed list for allowed mode)

global_flags (optional):
  • Must be a list of strings

  • Strings cannot be empty (will cause validation error)

  • Format: alternating flag names and values (["--flag", "value", "--other-flag", "other-value"])
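
Because every entry must be a string, numeric values are safest written in quotes: YAML reads an unquoted 0.7 as a number rather than a string. A small sketch of the difference, assuming the validator applies the "list of strings" rule literally as stated above:

```yaml
# Every entry is a quoted, non-empty string
global_flags:
  - "--temperature"
  - "0.7"

# Risky: YAML parses the second entry as the number 0.7, not a string
# global_flags:
#   - "--temperature"
#   - 0.7
```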

Error Messages

Common validation errors:

# Unknown top-level field
Unknown top-level field 'version'. Allowed fields are: cli, defaults, global_args, global_flags, schema

# Missing required CLI fields
'cli' section is missing required field 'name'
'cli' section is missing required field 'description'

# Invalid schema
'schema' must be a non-empty string

# Invalid global args
Global arg 'model' config must be an object
Global arg 'model' must have 'mode' field
Global arg 'model' mode must be one of: fixed, pass-through, allowed, blocked

Argument Parsing Rules

  • Order matters: Place flags/options before positional arguments to avoid ambiguity.

  • Example: my_tool.ost --format json input.txt (good); my_tool.ost input.txt --format json (may fail if --format is unknown to the template).

  • Separator: Use -- to explicitly end flag parsing if needed (e.g., my_tool.ost --format json -- input.txt).

This matches behavior in many CLI tools for predictable parsing.

Argument Parsing Tips

  • Recommended: Use --flag=value format for flags with values to avoid order issues (e.g., --progress=basic input.txt).

  • Order: Prefer flags before positionals for best compatibility.

  • Separator: Use -- to end flag parsing if needed, but prefer = format for reliability.