ostruct-cli
ostruct-cli is a command-line tool for generating structured output from OpenAI models. It combines the power of OpenAI's language models with the reliability of JSON Schema validation to ensure consistent, well-structured responses.
Key Features
Schema-First Approach: Define your output structure using JSON Schema (validation is always performed automatically)
Template-Based Input: Use Jinja2 templates with support for YAML frontmatter, system prompts, and shared system prompt includes
Multi-Tool Integration: Native support for Code Interpreter, File Search, Web Search, and MCP servers
Development Tools: Built-in meta-tools for schema generation and template analysis
File Processing: Handle single files, multiple files, or entire directories with thread-safe operations
Cross-Platform: Robust support for Windows, macOS, and Linux with consistent path handling
Security-Focused: Safe file access with explicit directory permissions and enhanced error handling
Structured Output: Guaranteed valid JSON output matching your schema
Token Management: Automatic token limit validation and handling
Model Support: Optimized handling for both streaming and non-streaming models
Quick Start
Install the package:
pip install ostruct-cli
For enhanced file type detection (optional; the quotes keep shells such as zsh from interpreting the brackets):
pip install "ostruct-cli[enhanced-detection]"
Define your schema (schema.json):
{
  "type": "object",
  "properties": {
    "summary": {
      "type": "string",
      "description": "Brief summary of the content"
    },
    "topics": {
      "type": "array",
      "items": {"type": "string"},
      "description": "Main topics covered"
    }
  },
  "required": ["summary", "topics"]
}
Create a task template (task.j2):
---
system_prompt: You are an expert content analyzer.
---
Analyze this content and extract key information:

{{ content.content }}
Run the analysis:
ostruct run task.j2 schema.json \
  --file content input.txt \
  -m gpt-4o
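The Quick Start command can also be driven from a script. The sketch below is one possible wrapper, not part of ostruct itself: it assembles the same arguments shown above and parses the result as JSON. It assumes that the ostruct executable is on PATH and that the JSON result is written to standard output; the helper names build_quickstart_command and run_quickstart are hypothetical.

```python
import json
import subprocess
from typing import Callable, Sequence


def build_quickstart_command(template: str, schema: str, input_file: str,
                             model: str = "gpt-4o") -> list[str]:
    """Return the argument list for the Quick Start invocation."""
    return [
        "ostruct", "run", template, schema,
        "--file", "content", input_file,
        "-m", model,
    ]


def _run_subprocess(argv: Sequence[str]) -> str:
    # Assumption: the JSON result is printed to stdout.
    completed = subprocess.run(argv, capture_output=True, text=True, check=True)
    return completed.stdout


def run_quickstart(template: str, schema: str, input_file: str,
                   runner: Callable[[Sequence[str]], str] = _run_subprocess) -> dict:
    """Run the command through `runner` and parse its output as JSON."""
    argv = build_quickstart_command(template, schema, input_file)
    return json.loads(runner(argv))
```

Passing the runner as a parameter keeps the wrapper testable without an API key: a test can supply a function that returns canned JSON instead of calling the real CLI.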
Documentation
Getting Started
- Introduction to ostruct
- Quick Start Guide
- Data Science Integration Guide
- Overview
- Jupyter/Colab Integration
- Interactive Jupyter Notebook Example
- Multi-Tool Data Science Workflows
- Comprehensive Multi-Tool Workflow Patterns
- Data Science Schema Templates
- Practical Examples and Use Cases
- Token Management for Large Datasets
- Error Handling and Troubleshooting
- Performance Optimization
- Practical Examples and Use Cases
- Integration with Data Science Tools
- Next Steps
- See Also
- Examples and Use Cases
- Development Tools
- Available Examples
- Getting Started with Examples
- Contributing Examples
- Next Steps
Templates
- Template Guide
- Understanding Templates
- Variables and Data Access
- File Variables
- File Attachment System
- Directory Variables
- Literal and JSON Variables
- Template Filters
- Template Functions
- Control Structures
- YAML Frontmatter
- Template Debugging
- Best Practices
- Real-World Examples
- See Also
- Template Quick Reference
- Template Structure
- Essential Syntax
- File Variables
- Tool Variables
- Standard Input
- CLI Variables
- Essential Filters
- Common Patterns
- Global Functions
- File Operations
- Common Issues
- CLI Examples
- Advanced Template Patterns
- Template Organization
- Multi-Tool Integration Patterns
- Remote PDF Analysis with Vision Models
- Dynamic Content Generation
- Performance Optimization
- Error Handling and Robustness
- Template Composition Patterns
- Best Practices Summary
- See Also
Reference
- CLI Reference
- OST (Self-Executing Templates) Commands
- Template Scaffolding Commands
- Environment Setup Commands
- Attachment System
- Usage Examples
- File Attachment Helpers
- File Type Limitations
- Other Options
- Troubleshooting
- Files Management Commands
- See Also
- OST Front-matter Reference Guide
- What are OST Files?
- OST File Structure
- Required Sections
- CLI Configuration
- Action Parameters
- CLI-Level Global Arguments
- File Routing Targets
- Validation and Choices
- Default Values
- Global Arguments Policy
- Global Flags
- Complete Example
- Usage Examples
- Best Practices
- Common Patterns
- Troubleshooting
- See Also
- Validation Rules Reference
- Argument Parsing Rules
- Argument Parsing Tips
- Upload Cache Guide
- Error Handling
- Gitignore Support Guide
- Multi-Tool Integration
- Code Interpreter
- File Search
- Web Search
- MCP Servers
- Multi-Tool Workflows
- Configuration Management
- Troubleshooting
- See Also
Automation
- CI/CD and Container Deployment
- CI/CD Integration
- GitHub Actions
- GitLab CI
- Container Deployment
- Docker Fundamentals
- Creating Custom Docker Images
- Docker Compose Deployments
- Kubernetes Deployment
- Security Best Practices
- Best Practices Summary
- Scripting and Cost Control
- Batch Processing Patterns
- Cost Control and Optimization
- Budget Controls
- Best Practices Summary
Contributing
Security
- Security Overview
- Security Architecture
- API Key Management
- File Access Control
- URL Validation & Remote Attachments
- User-Data (Vision Model) Uploads
- Data Upload and Tool Security
- MCP Server Security
- Threat Model and Risk Assessment
- Production Security Checklist
- Security Configuration Examples
- Security Resources
- Getting Security Help
Why Structured Output?
Structured output offers several advantages:
Reliability: Schema validation ensures responses match your requirements
Consistency: Get the same structure every time, making responses easier to process
Integration: JSON output works seamlessly with other tools and systems
Validation: Catch and handle invalid responses before they reach your application
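To illustrate the last point, the sketch below checks a response against the shape of the Quick Start schema before passing it on. It is a hand-written check for that one schema (required properties and basic types only), not a general JSON Schema validator and not a description of how ostruct validates internally; the function name check_summary_response is hypothetical.

```python
import json


def check_summary_response(raw: str) -> dict:
    """Parse raw JSON and verify it has the shape of the Quick Start schema.

    Raises ValueError with a short message if the response does not match.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc

    if not isinstance(data, dict):
        raise ValueError("top-level value must be an object")
    for key in ("summary", "topics"):  # the schema's "required" list
        if key not in data:
            raise ValueError(f"missing required property: {key}")
    if not isinstance(data["summary"], str):
        raise ValueError("summary must be a string")
    topics = data["topics"]
    if not isinstance(topics, list) or not all(isinstance(t, str) for t in topics):
        raise ValueError("topics must be an array of strings")
    return data
```

An application would call this at its boundary and handle ValueError there, so malformed responses never reach downstream code.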
Handling Large Files
When working with large files, the CLI provides several features to help:
Token Validation: Automatically validates token usage against model limits
Prompt Structure: Recommends placing content at the end with clear delimiters
Dry Run: Preview token usage before making API calls (note: --debug-openai-stream won't show streaming data during dry runs)
Progress Reporting: Track processing status for large operations
See the CLI documentation for detailed guidance on handling large files.
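The prompt-structure recommendation can be pictured with a small sketch. The function below puts the instructions first and the large content last, wrapped in delimiters. The delimiter strings and the name build_prompt are illustrative assumptions, not something ostruct prescribes.

```python
def build_prompt(instructions: str, content: str,
                 start: str = "<<<CONTENT", end: str = "CONTENT>>>") -> str:
    """Place instructions first and the (possibly large) content at the end."""
    if start in content or end in content:
        # Delimiters only help if they cannot be confused with the content.
        raise ValueError("content contains a delimiter; choose different ones")
    return f"{instructions.strip()}\n\n{start}\n{content}\n{end}\n"
```

In an ostruct template the same layout would be written in Jinja2, with the file variable placed between the delimiters at the bottom of the template.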
Requirements
Python 3.10 or higher
OpenAI API key
Logging
ostruct writes logs to ~/.ostruct/logs/ for debugging and monitoring. Use --verbose for detailed logging or configure via environment variables. See the CLI Reference for complete logging configuration options.
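When debugging, it can help to jump straight to the most recent log. A minimal sketch, assuming only that log files live directly under ~/.ostruct/logs/ as stated above; it makes no assumption about file names or formats, and newest_log is a hypothetical helper.

```python
from pathlib import Path
from typing import Optional


def newest_log(log_dir: Optional[Path] = None) -> Optional[Path]:
    """Return the most recently modified file in log_dir, or None if there is none."""
    if log_dir is None:
        # Location taken from the documentation above.
        log_dir = Path.home() / ".ostruct" / "logs"
    if not log_dir.is_dir():
        return None
    files = [p for p in log_dir.iterdir() if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)
```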
Support
GitHub Issues: https://github.com/yaniv-golan/ostruct/issues
Documentation: https://ostruct.readthedocs.io/
CLI Interface
The CLI revolves around a single subcommand called run. Basic usage:
ostruct run <TASK_TEMPLATE> <SCHEMA_FILE> [OPTIONS]
Key Features:
File routing: --file ci:data file.csv (Code Interpreter), --file fs:docs manual.pdf (File Search)
Multi-tool integration: Web Search, Code Interpreter, File Search, MCP Servers
Template variables: -V name=value for simple variables, -J name='{"key":"value"}' for JSON
Model configuration: --model gpt-4o --temperature 0.7
Debugging: --dry-run, --template-debug vars, --verbose
For complete CLI documentation, see the CLI Reference.
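Quoting JSON variables by hand in a shell is easy to get wrong. The sketch below builds the argument list programmatically, using only options that appear in this section (-V, -J, --model, --dry-run). build_run_args is a hypothetical helper, not an ostruct API.

```python
import json
import shlex
from typing import Any, Mapping, Optional


def build_run_args(template: str, schema: str,
                   variables: Optional[Mapping[str, str]] = None,
                   json_variables: Optional[Mapping[str, Any]] = None,
                   model: Optional[str] = None,
                   dry_run: bool = False) -> list[str]:
    """Assemble an `ostruct run` argument list from Python values."""
    argv = ["ostruct", "run", template, schema]
    for name, value in (variables or {}).items():
        argv += ["-V", f"{name}={value}"]
    for name, value in (json_variables or {}).items():
        # Serialize with json.dumps so the value is always valid JSON.
        argv += ["-J", f"{name}={json.dumps(value, separators=(',', ':'))}"]
    if model:
        argv += ["--model", model]
    if dry_run:
        argv.append("--dry-run")
    return argv


if __name__ == "__main__":
    args = build_run_args("task.j2", "schema.json",
                          variables={"env": "prod"},
                          json_variables={"config": {"key": "value"}},
                          model="gpt-4o", dry_run=True)
    # shlex.join renders a shell-safe command line for logging or copy-paste.
    print(shlex.join(args))
```

Passing the list directly to subprocess.run avoids shell quoting altogether; shlex.join is only needed when the command has to be displayed.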