Scripting and Cost Control

Advanced automation patterns for ostruct including batch processing, error handling, monitoring, cost optimization, and integration patterns. Learn how to build robust, cost-effective automation workflows that scale with your needs.

Note

These patterns build on the foundation provided in CI/CD and Container Deployment. Adapt examples to your specific infrastructure and requirements.

Batch Processing Patterns

Sequential File Processing

Process multiple files in sequence with comprehensive error handling:

#!/bin/bash
# process_files.sh - Sequential file processor

set -euo pipefail  # Exit on error, undefined vars, pipe failures

# Configuration
TEMPLATE_DIR="./templates"
SCHEMA_DIR="./schemas"
INPUT_DIR="./input"
OUTPUT_DIR="./output"
TIMEOUT=300

# Logging setup
LOG_FILE="./logs/processing_$(date +%Y%m%d_%H%M%S).log"
mkdir -p "$(dirname "$LOG_FILE")"

log() {
    echo "[$(date +'%Y-%m-%d %H:%M:%S')] $*" | tee -a "$LOG_FILE"
}

# Process a single file
process_file() {
    local input_file="$1"
    local template="$2"
    local schema="$3"
    local output_file="$4"

    log "Processing: $input_file -> $output_file"

    if ostruct run "$template" "$schema" \
        --path-security strict --allow "$PWD" \
        --file config "$input_file" \
        --timeout "$TIMEOUT" \
        --output-file "$output_file"; then
        log "SUCCESS: $output_file"
        return 0
    else
        log "ERROR: Failed to process $input_file"
        return 1
    fi
}

Cost Control and Optimization

Understanding Costs

Choose the appropriate model for your use case:

gpt-4o-mini: - Use for data extraction, classification - Lower cost, good performance - Suitable for most structured data tasks

gpt-4o: - Use for large document processing and complex reasoning - Balance between capability and cost - Good for most production scenarios

gpt-4: - Use for complex reasoning, code analysis - Higher accuracy, higher cost - Best for critical business decisions

Budget Controls

Environment Variables

Set spending limits using environment variables:

# Daily budget limit (in USD)
export OPENAI_DAILY_BUDGET=100.00

# Monthly budget limit
export OPENAI_MONTHLY_BUDGET=2000.00

# Token limit per request
export OPENAI_MAX_TOKENS_PER_REQUEST=2000

Pre-execution Cost Estimation

Estimate costs before processing:

# Dry run to estimate costs
ostruct --dry-run \
  --file data "data.json" \
  --schema schemas/analysis.json \
  --model gpt-4

Configuration-based Limits

Use configuration files for budget controls:

# ostruct.yaml
cost_controls:
  daily_limit: 50.00
  request_limit: 1000
  token_limit: 100000
  alert_threshold: 0.8  # Alert at 80% of limit

models:
  default: gpt-4o-mini
  fallback: gpt-4o-mini  # Fallback if budget exceeded

Template Optimization

Design cost-effective templates:

{# Optimized template #}
Extract key data from:
{{ file_content | truncate(2000) }}

Return JSON with:
- summary (max 100 words)
- entities (list)
- sentiment (positive/negative/neutral)

Schema Design for Cost Control

Use precise schemas to reduce output tokens:

{
  "type": "object",
  "properties": {
    "summary": {
      "type": "string",
      "maxLength": 500
    },
    "categories": {
      "type": "array",
      "maxItems": 5,
      "items": {"type": "string"}
    }
  },
  "required": ["summary", "categories"],
  "additionalProperties": false
}

Best Practices Summary

  1. Cost Management - Use appropriate models for each task - Implement budget controls and monitoring - Optimize templates and schemas - Track usage patterns

  2. Efficient Processing - Batch similar tasks together - Use parallel processing where beneficial - Implement proper error handling - Monitor performance metrics

  3. Production Integration - Use configuration files for settings - Implement proper logging - Set up alerting for issues - Test with dry runs before production

This guide provides the foundation for building cost-effective, scalable ostruct automation workflows.