Quick Start Guide
Welcome to ostruct! This guide will get you up and running in minutes with a hands-on tutorial featuring Juno the beagle.
Prerequisites
Python 3.10 or higher
OpenAI API key (get one from the OpenAI Platform)
pip or poetry for installation
Installation
We provide multiple installation methods. For most users, pipx is recommended as it avoids conflicts with other Python packages.
Install pipx:
python3 -m pip install --user pipx
python3 -m pipx ensurepath
(Restart your terminal after running ensurepath to update your PATH.)
Install ostruct-cli:
pipx install ostruct-cli
If you're on macOS and use Homebrew, you can install ostruct with a single command:
brew install yaniv-golan/ostruct/ostruct-cli
We provide pre-compiled .zip archives for macOS, Windows, and Linux that do not require Python to be installed.
Go to the Latest Release page.
Download the .zip file for your operating system (e.g., ostruct-macos-latest.zip, ostruct-windows-latest.zip, ostruct-ubuntu-latest.zip).
Extract the .zip file. This will create a folder (e.g., ostruct-macos-amd64).
On macOS/Linux, make the executable inside the extracted folder runnable:
chmod +x /path/to/ostruct-macos-amd64/ostruct
Run the executable from within the extracted folder, as it depends on bundled libraries in the same directory.
Run ostruct from our official container image on the GitHub Container Registry.
docker run -it --rm \
-v "$(pwd)":/app \
-w /app \
ghcr.io/yaniv-golan/ostruct:latest \
run template.j2 schema.json --file input input.txt
Set up your OpenAI API key:
# Environment variable
export OPENAI_API_KEY="your-api-key-here"
# Or create a .env file
echo 'OPENAI_API_KEY=your-api-key-here' > .env
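Before the first run, it can help to confirm that the key is actually visible. The following is a minimal pre-flight sketch, not part of ostruct: the helper names parse_dotenv and find_api_key are hypothetical, and the parser only handles plain KEY=value lines like the one written above.

```python
import os
from pathlib import Path


def parse_dotenv(text):
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        # Strip optional surrounding quotes from the value.
        values[name.strip()] = value.strip().strip("'\"")
    return values


def find_api_key(environ, env_file=".env"):
    """Return the key from the environment mapping, else from env_file, else None."""
    key = environ.get("OPENAI_API_KEY")
    if key:
        return key
    path = Path(env_file)
    if path.is_file():
        return parse_dotenv(path.read_text(encoding="utf-8")).get("OPENAI_API_KEY")
    return None


if __name__ == "__main__":
    found = find_api_key(os.environ)
    print("API key found" if found else "OPENAI_API_KEY is not set")
```

The check never prints the key itself, so it is safe to run in shared terminals or CI logs.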
Optional Dependencies
Enhanced File Type Detection
For improved file type detection when using the auto routing target, install the enhanced-detection package:
pip install "ostruct-cli[enhanced-detection]"
What it does: When you use --file auto:alias file.txt, ostruct needs to determine whether files should be routed to the template (for text files) or treated as binary data. Enhanced detection uses machine learning (Magika) for more accurate file type identification.
Benefits:
More accurate routing: Better detection of file types beyond simple extensions
Handles edge cases: Correctly identifies files without extensions or with misleading extensions
Automatic fallback: Falls back to extension-based detection if unavailable
Without enhanced-detection: ostruct uses extension-based detection for common file types (.txt, .md, .json, .py, .csv, .html, .css, .sql, .sh, .log, .env, and 20+ others).
Note
Alpine Linux: Enhanced detection may not install on Alpine Linux due to compilation requirements. ostruct automatically falls back to extension-based detection with a helpful warning message.
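To make the fallback concrete, here is a rough sketch of what extension-based routing could look like. It is an illustration only, not ostruct's actual implementation: the function name route_by_extension and the "template"/"binary" labels are hypothetical, and the extension set contains only the types named above.

```python
from pathlib import Path

# Only the extensions named in this guide; the real list has 20+ more.
TEXT_EXTENSIONS = {
    ".txt", ".md", ".json", ".py", ".csv", ".html",
    ".css", ".sql", ".sh", ".log", ".env",
}


def route_by_extension(filename):
    """Return 'template' for a known text extension, otherwise 'binary'."""
    suffix = Path(filename).suffix.lower()
    return "template" if suffix in TEXT_EXTENSIONS else "binary"


if __name__ == "__main__":
    for name in ("juno_profile.txt", "notes.MD", "photo.png", "README"):
        print(name, "->", route_by_extension(name))
```

Note that a file with no extension, such as README, is classified as binary by this sketch even if it contains plain text. That is the kind of edge case content-based detection is meant to handle.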
Tutorial: Meet Juno the Beagle
Let's start with a real-world example that showcases ostruct's power. We'll analyze a pet adoption profile and extract structured data.
Step 1: Create the Pet Profile
First, create a file called juno_profile.txt:
JUNO - BEAGLE MIX LOOKING FOR HOME
Meet Juno! This adorable 3-year-old beagle mix is the perfect companion for
an active family. Juno loves long walks, playing fetch, and snuggling on
the couch after a day of adventures.
Personality: Friendly, energetic, loyal, great with kids
Medical: Fully vaccinated, spayed, microchipped
Training: House-trained, knows basic commands (sit, stay, come)
Ideal Home: Active family with a yard, no cats (she gets too excited!)
Contact the Sunny Valley Animal Shelter to meet Juno today!
Phone: (555) 123-PETS
Email: adopt@sunnyvalley.org
Step 2: Define Your Data Structure
Create pet_schema.json to specify exactly what information you want to extract:
{
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "Pet's name"
},
"breed": {
"type": "string",
"description": "Primary breed"
},
"age": {
"type": "integer",
"description": "Age in years"
},
"personality_traits": {
"type": "array",
"items": {"type": "string"},
"description": "Key personality characteristics"
},
"medical_status": {
"type": "object",
"properties": {
"vaccinated": {"type": "boolean"},
"spayed_neutered": {"type": "boolean"},
"microchipped": {"type": "boolean"}
},
"required": ["vaccinated", "spayed_neutered", "microchipped"]
},
"training_level": {
"type": "array",
"items": {"type": "string"},
"description": "Training achievements"
},
"ideal_home": {
"type": "string",
"description": "Description of ideal living situation"
},
"contact_info": {
"type": "object",
"properties": {
"organization": {"type": "string"},
"phone": {"type": "string"},
"email": {"type": "string"}
},
"required": ["organization"]
}
},
"required": ["name", "breed", "age", "personality_traits", "medical_status"]
}
Tip
Schema Creation Tool: Instead of writing schemas manually, use the Schema Generator meta-tool to automatically create schemas from your templates:
tools/schema-generator/run.sh -o pet_schema.json analyze_pet.j2
This tool analyzes your template and generates OpenAI-compliant schemas automatically.
Step 3: Create the Analysis Template
Create analyze_pet.j2 to tell the AI how to process the profile:
---
system_prompt: You are an expert pet adoption coordinator who excels at extracting structured information from adoption profiles.
---
Please analyze this pet adoption profile and extract the key information:
{{ profile.content }}
Extract the information according to the provided schema, ensuring all medical status fields are boolean values and contact information is properly structured.
Tip
Pro Tip: Share system prompts across templates using include_system:
---
include_system: shared/pet_expert.txt
system_prompt: Focus on adoption readiness assessment.
---
See Template Guide for advanced shared prompt techniques.
Step 4: Run the Analysis
Now use ostruct to extract structured data from Juno's profile:
ostruct run analyze_pet.j2 pet_schema.json \
--file profile juno_profile.txt \
-m gpt-4o
Result: You'll get structured JSON output like this (exact wording of string values may vary between runs):
{
"name": "Juno",
"breed": "Beagle Mix",
"age": 3,
"personality_traits": ["Friendly", "Energetic", "Loyal", "Great with kids"],
"medical_status": {
"vaccinated": true,
"spayed_neutered": true,
"microchipped": true
},
"training_level": ["House-trained", "Basic commands (sit, stay, come)"],
"ideal_home": "Active family with a yard, no cats",
"contact_info": {
"organization": "Sunny Valley Animal Shelter",
"phone": "(555) 123-PETS",
"email": "adopt@sunnyvalley.org"
}
}
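If you save the result to a file, you can spot-check it in your own code. The sketch below is a hand-rolled checker covering only the keywords used in pet_schema.json (type, properties, required, items); it is not a full JSON Schema validator, and the check function and the trimmed schema are illustrative assumptions.

```python
TYPE_MAP = {"object": dict, "array": list, "string": str, "integer": int, "boolean": bool}


def check(value, schema, path="$"):
    """Return a list of human-readable problems; an empty list means the value passed."""
    expected = schema.get("type")
    py_type = TYPE_MAP.get(expected)
    if py_type is not None:
        # bool is a subclass of int in Python, so reject it for "integer".
        is_bool_as_int = expected == "integer" and isinstance(value, bool)
        if not isinstance(value, py_type) or is_bool_as_int:
            return [f"{path}: expected {expected}"]
    errors = []
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: missing required field")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(check(value[key], sub_schema, f"{path}.{key}"))
    elif isinstance(value, list) and "items" in schema:
        for index, item in enumerate(value):
            errors.extend(check(item, schema["items"], f"{path}[{index}]"))
    return errors


# Trimmed-down version of pet_schema.json, kept short for the example.
SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "personality_traits": {"type": "array", "items": {"type": "string"}},
        "medical_status": {
            "type": "object",
            "properties": {"vaccinated": {"type": "boolean"}},
            "required": ["vaccinated"],
        },
    },
    "required": ["name", "age", "personality_traits", "medical_status"],
}

if __name__ == "__main__":
    result = {
        "name": "Juno",
        "age": 3,
        "personality_traits": ["Friendly", "Energetic"],
        "medical_status": {"vaccinated": True},
    }
    print(check(result, SCHEMA) or "result matches the schema")
```

For production use, a dedicated validation library is a better choice than this sketch, since it covers the rest of the JSON Schema vocabulary.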
Understanding What Happened
Let's break down the magic:
File Attachment: --file profile juno_profile.txt attached the text file to the template under the custom alias profile
Template Processing: The .j2 template combined the profile content with instructions
Schema Validation: The JSON schema ensured the output matched your exact requirements
AI Intelligence: GPT-4o understood the context and extracted the right information
Development Best Practices
Always validate with --dry-run first:
Before running any ostruct command for real, validate your template and files:
# 1. Validate everything first - catches errors early
ostruct run analyze_pet.j2 pet_schema.json \
--file profile juno_profile.txt \
--dry-run
# 2. If validation passes, run for real
ostruct run analyze_pet.j2 pet_schema.json \
--file profile juno_profile.txt \
-m gpt-4o
The --dry-run flag performs comprehensive validation including:
Template syntax checking
File access validation (catches binary file issues)
Schema structure validation
Security constraint verification
This saves time and API costs by catching errors before making OpenAI API calls.
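The same validate-then-run habit can be scripted. This is a minimal sketch that uses only the flags shown in this guide; build_command and validate_then_run are hypothetical helper names, and it assumes that a failed dry-run exits with a non-zero status.

```python
import shlex
import subprocess


def build_command(template, schema, files, model=None, dry_run=False):
    """Assemble an `ostruct run` argument list from (alias, path) pairs."""
    cmd = ["ostruct", "run", template, schema]
    for alias, path in files:
        cmd += ["--file", alias, path]
    if dry_run:
        cmd.append("--dry-run")
    elif model:
        cmd += ["-m", model]
    return cmd


def validate_then_run(template, schema, files, model="gpt-4o"):
    """Run the dry-run first and make the real API call only if it succeeds.

    Assumption: ostruct signals a failed validation with a non-zero exit
    code, which check=True turns into subprocess.CalledProcessError.
    """
    subprocess.run(build_command(template, schema, files, dry_run=True), check=True)
    return subprocess.run(build_command(template, schema, files, model=model), check=True)


if __name__ == "__main__":
    # Print the two commands instead of executing them.
    files = [("profile", "juno_profile.txt")]
    for dry_run in (True, False):
        cmd = build_command("analyze_pet.j2", "pet_schema.json", files, "gpt-4o", dry_run)
        print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.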
Level Up: Multi-Tool Processing
Ready for more power? Let's process multiple data sources with different tools.
Advanced Example: Pet Medical Records
Create medical_data.csv:
Date,Procedure,Veterinarian,Notes
2024-01-15,Annual Exam,Dr. Sarah Chen,Healthy weight maintained
2024-01-15,Vaccination Update,Dr. Sarah Chen,DHPP and Rabies boosters
2024-02-20,Spay Surgery,Dr. Michael Torres,Procedure successful
2024-03-10,Microchip Implant,Dr. Sarah Chen,Chip ID: 982000123456789
Create comprehensive_analysis.j2:
---
system_prompt: You are a veterinary data analyst specializing in pet health summaries.
---
Analyze this pet's profile and medical history:
PROFILE:
{{ profile.content }}
MEDICAL RECORDS:
Please analyze the uploaded CSV medical data to extract medical history patterns.
Provide a comprehensive health and adoption readiness assessment.
Create comprehensive_schema.json:
{
"type": "object",
"properties": {
"pet_summary": {
"$ref": "#/$defs/pet_info"
},
"medical_summary": {
"type": "object",
"properties": {
"last_exam_date": {"type": "string", "format": "date"},
"vaccination_status": {"type": "string"},
"procedures_completed": {
"type": "array",
"items": {"type": "string"}
},
"health_status": {"type": "string"},
"microchip_id": {"type": "string"}
}
},
"adoption_readiness": {
"type": "object",
"properties": {
"ready_for_adoption": {"type": "boolean"},
"recommended_followup": {
"type": "array",
"items": {"type": "string"}
}
}
}
},
"$defs": {
"pet_info": {
"type": "object",
"properties": {
"name": {"type": "string"},
"breed": {"type": "string"},
"age": {"type": "integer"}
}
}
}
}
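The schema above reuses a definition through "$ref": "#/$defs/pet_info". As a rough illustration of how such a local reference resolves, the sketch below inlines it; resolve_refs is a hypothetical helper, and it handles neither circular references nor escaped characters in the pointer.

```python
def resolve_refs(node, root):
    """Return a copy of node with local '#/...' references replaced by their targets."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith("#/"):
            target = root
            # Walk the pointer one segment at a time, e.g. "$defs" then "pet_info".
            for part in ref[2:].split("/"):
                target = target[part]
            return resolve_refs(target, root)
        # Drop "$defs" from the copy because every reference is inlined.
        return {key: resolve_refs(value, root) for key, value in node.items() if key != "$defs"}
    if isinstance(node, list):
        return [resolve_refs(item, root) for item in node]
    return node


if __name__ == "__main__":
    schema = {
        "type": "object",
        "properties": {"pet_summary": {"$ref": "#/$defs/pet_info"}},
        "$defs": {
            "pet_info": {
                "type": "object",
                "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
            }
        },
    }
    print(resolve_refs(schema, schema)["properties"]["pet_summary"])
```

You do not need to inline references yourself to use the schema; the sketch only shows what pet_summary expands to.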
Run the advanced analysis:
ostruct run comprehensive_analysis.j2 comprehensive_schema.json \
--file profile juno_profile.txt \
--file ci:medical medical_data.csv \
-m gpt-4o
What's different?
--file profile juno_profile.txt: Profile text for template access with custom alias
--file ci:medical medical_data.csv: Medical data uploaded to Code Interpreter for analysis
The AI can now correlate text descriptions with structured data
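To sanity-check the model's medical_summary, you can derive a few of the same facts locally from medical_data.csv. This is a minimal sketch with the CSV inlined so it runs on its own; the summarize function and its matching rules (a procedure containing "exam", a note starting with "Chip ID:") are assumptions chosen for this data set.

```python
import csv
import io

MEDICAL_CSV = """Date,Procedure,Veterinarian,Notes
2024-01-15,Annual Exam,Dr. Sarah Chen,Healthy weight maintained
2024-01-15,Vaccination Update,Dr. Sarah Chen,DHPP and Rabies boosters
2024-02-20,Spay Surgery,Dr. Michael Torres,Procedure successful
2024-03-10,Microchip Implant,Dr. Sarah Chen,Chip ID: 982000123456789
"""


def summarize(csv_text):
    """Extract a few fields of the medical_summary object from the raw CSV."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    exam_dates = [row["Date"] for row in rows if "exam" in row["Procedure"].lower()]
    chip_notes = [row["Notes"] for row in rows if row["Notes"].startswith("Chip ID:")]
    return {
        # ISO 8601 dates sort correctly as plain strings.
        "last_exam_date": max(exam_dates) if exam_dates else None,
        "procedures_completed": [row["Procedure"] for row in rows],
        "microchip_id": chip_notes[0].split(":", 1)[1].strip() if chip_notes else None,
    }


if __name__ == "__main__":
    print(summarize(MEDICAL_CSV))
```

Comparing these values with the model output is a cheap way to catch a misread date or a truncated chip ID.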
Three Learning Paths
Choose your adventure based on your needs:
Quick Integration (5 minutes)
Perfect for developers who need immediate results:
# Basic document analysis
ostruct run template.j2 schema.json --file document document.txt
# With custom variables
ostruct run template.j2 schema.json --file doc doc.txt -V env=prod
# Direct output to file
ostruct run template.j2 schema.json --file data data.txt --output-file result.json
Data Processing (15 minutes)
For analysts working with datasets:
# Analyze CSV with code execution
ostruct run analysis.j2 schema.json --file ci:dataset dataset.csv
# Multi-file processing
ostruct run process.j2 schema.json --file ci:data1 data1.csv --file ci:data2 data2.csv
# Directory processing
ostruct run batch.j2 schema.json --dir ci:data ./data_directory
Knowledge Extraction (30 minutes)
For researchers processing documents:
# Semantic search through documents
ostruct run research.j2 schema.json --file fs:docs documentation.pdf
# Multi-document research
ostruct run synthesis.j2 schema.json --dir fs:papers ./research_papers
# Combined analysis
ostruct run complete.j2 schema.json \
--file config config.yaml \
--file ci:script analysis.py \
--file fs:knowledge knowledge_base.pdf
Key CLI Patterns to Remember
- Attachment Syntax
--file alias file.txt (template access with custom alias)
--file ci:alias file.txt (Code Interpreter with custom alias)
--file fs:alias file.txt (File Search with custom alias)
--dir alias ./directory (directory attachment)
--collect alias @file-list.txt (file collection from list)
- Tool Targeting
prompt (default): Template access only (configuration, small files)
code-interpreter or ci: Code Interpreter (data analysis, computation)
file-search or fs: File Search (document retrieval, knowledge bases)
--enable-tool web-search: Web Search (current events, real-time data)
- Model Options
-m gpt-4o (default, best for most tasks)
-m o1 (complex reasoning, slower)
-m o3-mini (fast and cost-effective)
- Variables
-V name=value (simple strings)
-J config='{"env":"prod"}' (JSON objects)
- Security
--path-security strict (enable strict path validation)
--allow /safe/path (allow specific directory)
--allow-file /specific/file.txt (allow specific file)
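When ostruct is called from a script, the -V and -J values can be built programmatically instead of hand-quoting JSON in the shell. This is a small sketch under that assumption; variable_args is a hypothetical helper, and the resulting list is meant to be appended to an argument list passed to subprocess.

```python
import json


def variable_args(strings=None, json_values=None):
    """Build -V name=value and -J name=<json> arguments as a flat list."""
    args = []
    for name, value in (strings or {}).items():
        args += ["-V", f"{name}={value}"]
    for name, value in (json_values or {}).items():
        # Compact separators keep the JSON free of extra spaces.
        encoded = json.dumps(value, separators=(",", ":"))
        args += ["-J", f"{name}={encoded}"]
    return args


if __name__ == "__main__":
    print(variable_args({"env": "prod"}, {"config": {"env": "prod"}}))
```

Because the values are passed as separate list items rather than through a shell, the single quotes shown in the -J example above are not needed.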
Next Steps
- Learn More
CLI Reference - Complete CLI documentation
Template Guide - Comprehensive template techniques
Security Overview - Security best practices
- Integrate
CI/CD and Container Deployment - CI/CD integration