Style Guide

This document outlines coding standards, documentation guidelines, and best practices for contributing to ostruct. Following these guidelines ensures consistency and maintainability across the codebase.

Python Coding Standards

Code Formatting

We use Black for code formatting with these settings:

# pyproject.toml
[tool.black]
line-length = 79
target-version = ["py310"]
include = '\.pyi?$'
extend-exclude = '''
/(
  # Exclude generated files
  \.eggs
  | \.git
  | \.venv
  | _build
  | build
  | dist
)/
'''

Key formatting rules:

  • Maximum line length: 79 characters

  • Use double quotes for strings

  • Trailing commas in multi-line structures

  • Consistent indentation (4 spaces)

# Good
def process_files(
    template_path: str,
    schema_path: str,
    output_file: Optional[str] = None,
) -> ProcessingResult:
    """Process files with the given template and schema."""
    return ProcessingResult(
        success=True,
        message="Processing completed successfully",
    )

Import Organization

Use isort for import sorting with these guidelines:

# Standard library imports
import asyncio
import json
from pathlib import Path
from typing import Dict, List, Optional

# Third-party imports
import click
import jinja2
from pydantic import BaseModel

# Local imports
from ostruct.cli.errors import ValidationError
from ostruct.cli.security import SecurityManager

Import ordering:

  1. Standard library imports

  2. Third-party library imports

  3. Local application imports

Import style:

  • Use absolute imports when possible

  • Group imports logically

  • Avoid wildcard imports (from module import *)

  • Use explicit imports for clarity

Type Hints

Use comprehensive type hints for better code clarity:

from typing import Dict, List, Optional, Union, Any
from pathlib import Path

# Function signatures
def validate_config(
    config: Dict[str, Any],
    strict_mode: bool = False
) -> ValidationResult:
    """Validate configuration with optional strict mode."""

# Class definitions
class TemplateRenderer:
    """Renders Jinja2 templates with file content access."""

    def __init__(self, template_dir: Path) -> None:
        self.template_dir = template_dir
        self._cache: Dict[str, str] = {}

# Complex types
FileMapping = Dict[str, Union[str, Path]]
ConfigDict = Dict[str, Union[str, int, bool, List[str]]]

Variable Naming

Follow Python naming conventions:

# Constants (module level)
MAX_FILE_SIZE = 10 * 1024 * 1024  # 10MB
DEFAULT_MODEL = "gpt-4o"

# Functions and variables
def process_template_file(template_path: Path) -> str:
    file_content = template_path.read_text()
    processed_content = transform_content(file_content)
    return processed_content

# Classes
class SecurityManager:
    """Manages file access security and validation."""

# Private methods/variables
class TemplateProcessor:
    def __init__(self):
        self._template_cache = {}

    def _validate_template(self, content: str) -> bool:
        return bool(content.strip())

Naming guidelines:

  • Use descriptive names that explain purpose

  • Avoid abbreviations unless widely understood

  • Use snake_case for functions and variables

  • Use PascalCase for classes

  • Use SCREAMING_SNAKE_CASE for constants

  • Prefix private members with underscore

Docstrings

Use comprehensive docstrings following Google style:

def render_template(
    template_path: Path,
    context: Dict[str, Any],
    strict_mode: bool = False
) -> str:
    """Render a Jinja2 template with the provided context.

    This function loads a Jinja2 template from the specified path and
    renders it with the given context variables. It supports both
    regular and strict rendering modes.

    Args:
        template_path: Path to the Jinja2 template file.
        context: Dictionary containing template variables.
        strict_mode: If True, raises errors for undefined variables.
            Defaults to False.

    Returns:
        The rendered template as a string.

    Raises:
        TemplateNotFoundError: If the template file doesn't exist.
        TemplateRenderError: If template rendering fails.
        ValidationError: If template contains invalid syntax.

    Example:
        >>> context = {"name": "ostruct", "version": "1.0.0"}
        >>> result = render_template(Path("template.j2"), context)
        >>> print(result)
        Hello, ostruct v1.0.0!
    """

Docstring guidelines:

  • Start with a one-line summary

  • Include detailed description if needed

  • Document all parameters and return values

  • List all possible exceptions

  • Provide usage examples for complex functions

  • Use present tense (β€œReturns the result” not β€œWill return”)

Error Handling

Exception Hierarchy

Use the structured exception hierarchy from cli/errors.py:

from ostruct.cli.errors import (
    CLIError,              # Base exception
    VariableError,         # Variable-related errors
    TaskTemplateError,     # Template processing errors
    PathSecurityError,     # Security-related errors
    SchemaError,          # Schema validation errors
)
from ostruct.cli.base_errors import OstructFileNotFoundError

# Good - specific exception types
def validate_file_path(path: str) -> Path:
    """Validate and return a Path object."""
    if not path:
        raise VariableError("File path cannot be empty")

    try:
        file_path = Path(path).resolve()
    except (OSError, ValueError) as e:
        raise PathSecurityError(f"Invalid file path '{path}': {e}")

    if not file_path.exists():
        raise OstructFileNotFoundError(path)

    return file_path

Error Messages

Provide clear, actionable error messages:

# Good - specific and actionable
raise TaskTemplateError(
    f"Template file '{template_path}' is too large "
    f"({file_size} bytes). Maximum allowed size is {MAX_SIZE} bytes. "
    f"Consider splitting the template into smaller files."
)

# Bad - vague and unhelpful
raise TaskTemplateError("Template error")

Error message guidelines:

  • Include relevant context (file names, values)

  • Suggest solutions when possible

  • Use consistent terminology

  • Avoid technical jargon in user-facing messages

  • Include specific limits or constraints

Logging

Use structured logging with appropriate levels:

import logging

logger = logging.getLogger(__name__)

def process_files(files: List[Path]) -> None:
    """Process multiple files with proper logging."""
    logger.info(f"Starting to process {len(files)} files")

    for file_path in files:
        logger.debug(f"Processing file: {file_path}")

        try:
            result = process_single_file(file_path)
            logger.info(f"Successfully processed {file_path}")
        except Exception as e:
            logger.error(
                f"Failed to process {file_path}: {e}",
                exc_info=True
            )
            raise

Logging guidelines:

  • Use appropriate log levels (DEBUG, INFO, WARNING, ERROR, CRITICAL)

  • Include context in log messages

  • Never log sensitive information (API keys, file contents)

  • Use structured logging for machine-readable logs

  • Log performance metrics for optimization

Security Guidelines

Input Validation

Validate all inputs through the security layer:

from ostruct.cli.security import SecurityManager

def process_user_files(file_paths: List[str]) -> None:
    """Process user-provided files with security validation."""
    security_manager = SecurityManager()

    for file_path in file_paths:
        # Always validate through security manager
        validated_path = security_manager.validate_file_access(file_path)

        # Process the validated path
        with validated_path.open('r') as f:
            content = f.read()
            process_content(content)

Path Handling

Use secure path operations:

from ostruct.cli.security import safe_join
from ostruct.cli.security.symlink_resolver import _resolve_symlink

def build_safe_path(base_dir: Path, user_path: str) -> Path:
    """Build a safe path within the base directory."""
    # Use safe_join to prevent directory traversal
    joined_path = safe_join(str(base_dir), user_path)
    if joined_path is None:
        raise PathSecurityError(f"Unsafe path: {user_path}")

    # Resolve symlinks securely
    resolved_path = _resolve_symlink(
        Path(joined_path),
        max_depth=16,
        allowed_dirs=[base_dir]
    )

    return resolved_path

Security practices:

  • Never trust user input directly

  • Use path normalization and validation

  • Resolve symlinks securely

  • Implement proper access controls

  • Log security events for auditing

Testing Standards

Test Organization

Organize tests by functionality and scope:

import pytest
from unittest.mock import patch, MagicMock
from pathlib import Path

from ostruct.cli.template_utils import render_template
from ostruct.cli.errors import TaskTemplateError


class TestTemplateRendering:
    """Tests for template rendering functionality."""

    @pytest.fixture
    def sample_template(self):
        """Create a sample template string."""
        return "Hello, {{ name }}!"

    @pytest.fixture
    def template_env(self):
        """Create a Jinja2 environment."""
        import jinja2
        return jinja2.Environment()

    def test_render_simple_template(self, sample_template, template_env):
        """Test rendering a simple template."""
        context = {"name": "World"}
        result = render_template(sample_template, context, template_env)
        assert result == "Hello, World!"

    def test_render_missing_variable(self, template_env):
        """Test error handling for missing variables."""
        template = "Hello, {{ missing_var }}!"
        with pytest.raises(TaskTemplateError):
            render_template(template, {}, template_env)

Test Coverage

Aim for comprehensive test coverage:

# Test happy path
def test_successful_processing(self):
    """Test successful file processing."""

# Test error conditions
def test_invalid_input_raises_error(self):
    """Test that invalid input raises appropriate error."""

# Test edge cases
def test_empty_file_handling(self):
    """Test handling of empty files."""

# Test boundary conditions
def test_maximum_file_size_limit(self):
    """Test behavior at maximum file size limit."""

Coverage guidelines:

  • Test both success and failure paths

  • Include edge cases and boundary conditions

  • Mock external dependencies (API calls, file system)

  • Use parameterized tests for multiple scenarios

  • Maintain at least 90% test coverage

Mocking Best Practices

Use mocking effectively for external dependencies:

@patch('ostruct.cli.runner.AsyncOpenAI')
def test_responses_api_call_success(self, mock_openai):
    """Test successful Responses API call with mocked client."""
    # Setup mock
    mock_client = AsyncMock()
    mock_openai.return_value = mock_client

    # Create mock streaming response for Responses API
    async def mock_stream():
        yield Mock(response=Mock(body='{"result": "test response"}'))

    mock_client.responses.create.return_value = mock_stream()

    # Test implementation
    result = await stream_structured_output(
        client=mock_client,
        model="gpt-4o",
        system_prompt="System prompt",
        user_prompt="User prompt",
        output_schema=TestSchema
    )

    # Verify behavior
    mock_client.responses.create.assert_called_once()
    # Verify API parameters
    call_args = mock_client.responses.create.call_args
    assert call_args[1]["model"] == "gpt-4o"
    assert "input" in call_args[1]  # Responses API uses 'input' not 'messages'
    assert "text" in call_args[1]   # Responses API uses 'text' format

Documentation Standards

reStructuredText Guidelines

Use consistent reStructuredText formatting:

===================
Chapter Title
===================

Section Title
=============

Subsection Title
----------------

Subsubsection Title
^^^^^^^^^^^^^^^^^^^

**Key formatting rules:**

- Use underlines of the same length as the title
- Maintain consistent heading hierarchy
- Include table of contents for long documents
- Use proper cross-references

Code Examples

Include comprehensive code examples:

.. code-block:: bash

   # Example command with explanation
   ostruct run template.j2 schema.json \
     --file ci:data source_code/ \
     --file fs:docs documentation/ \
     --file config config.yaml

.. code-block:: python

   # Python code example
   from ostruct.cli import TemplateRenderer

   renderer = TemplateRenderer()
   result = renderer.render_template(template, context)

Documentation guidelines:

  • Include working examples that can be copied and pasted

  • Explain complex concepts with simple examples

  • Use consistent terminology throughout

  • Link to related sections and external resources

  • Keep examples up to date with current API

API Documentation

Document all public APIs comprehensively:

def render_template_with_context(
    template_content: str,
    context: Dict[str, Any],
    env: jinja2.Environment
) -> str:
    """Render a Jinja2 template with the provided context.

    This function provides a secure interface for rendering
    Jinja2 templates with file content access. It includes built-in
    security validation and optimization.

    Args:
        template_content: The template content as a string.
        context: Dictionary containing template variables.
        env: Jinja2 environment for rendering.

    Returns:
        The rendered template as a string.

    Raises:
        TaskTemplateError: If template rendering fails.

    Example:
        >>> import jinja2
        >>> env = jinja2.Environment()
        >>> context = {"name": "ostruct", "version": "1.0.0"}
        >>> result = render_template_with_context(
        ...     "Hello, {{ name }} v{{ version }}!", context, env
        ... )
        >>> print(result)
        Hello, ostruct v1.0.0!
    """

Performance Guidelines

Async/Await Usage

Use async/await for I/O operations:

import asyncio
from typing import List

async def process_files_async(file_paths: List[Path]) -> List[str]:
    """Process multiple files asynchronously."""
    tasks = [process_single_file(path) for path in file_paths]
    results = await asyncio.gather(*tasks, return_exceptions=True)

    # Handle results and exceptions
    processed_results = []
    for result in results:
        if isinstance(result, Exception):
            logger.error(f"Processing failed: {result}")
        else:
            processed_results.append(result)

    return processed_results

Caching Strategies

Implement appropriate caching for performance:

from ostruct.cli.cache_manager import FileCache

class TemplateManager:
    """Template management with file caching."""

    def __init__(self, max_cache_size: int = 50 * 1024 * 1024):
        self._file_cache = FileCache(max_cache_size)

    def get_template_content(self, template_path: str) -> str:
        """Get template content with caching."""
        from pathlib import Path

        path = Path(template_path)
        stat = path.stat()

        # Try to get from cache
        cached = self._file_cache.get(
            str(path.absolute()),
            stat.st_mtime_ns,
            stat.st_size
        )

        if cached:
            return cached.content

        # Read and cache
        content = path.read_text(encoding='utf-8')
        self._file_cache.put(
            str(path.absolute()),
            content,
            'utf-8',
            None,  # hash_value
            stat.st_mtime_ns,
            stat.st_size
        )

        return content

Performance practices:

  • Cache expensive operations (template compilation, file reads)

  • Use appropriate data structures for performance

  • Profile code to identify bottlenecks

  • Optimize token usage for API calls

  • Implement lazy loading where appropriate

Git Workflow

Commit Messages

Write clear, descriptive commit messages:

# Good commit message format
Add template caching to improve rendering performance

- Implement LRU cache for compiled templates
- Add cache size configuration option
- Include cache hit/miss metrics in logging
- Update documentation with caching behavior

Fixes #123

Commit message guidelines:

  • Use imperative mood (β€œAdd feature” not β€œAdded feature”)

  • Keep first line under 50 characters

  • Include detailed description if needed

  • Reference issue numbers

  • Explain the β€œwhy” not just the β€œwhat”

Branch Naming

Use descriptive branch names:

# Feature branches
feature/template-caching
feature/multi-tool-support

# Bug fixes
fix/security-validation-error
fix/template-rendering-crash

# Documentation
docs/api-reference-update
docs/contribution-guidelines

Pull Request Guidelines

Create comprehensive pull requests:

  1. Clear title and description

  2. Link to related issues

  3. Include testing notes

  4. Update documentation as needed

  5. Ensure CI passes

Review Checklist

Before submitting code, verify:

Code Quality

  • [ ] Code follows formatting standards (Black, isort)

  • [ ] No linting errors (Flake8)

  • [ ] Type hints are comprehensive (MyPy)

  • [ ] Docstrings are complete and accurate

  • [ ] Error handling is appropriate

Security

  • [ ] All inputs are validated through security layer

  • [ ] No sensitive information in logs

  • [ ] Path handling uses security utilities

  • [ ] Access controls are properly implemented

Testing

  • [ ] All tests pass

  • [ ] New functionality has comprehensive tests

  • [ ] Edge cases are covered

  • [ ] Mock dependencies appropriately

Documentation

  • [ ] Public APIs are documented

  • [ ] Examples are working and current

  • [ ] Documentation builds without warnings

  • [ ] Cross-references are valid

Getting Help

If you have questions about coding standards:

  1. Check existing code for examples

  2. Review this style guide for clarification

  3. Ask in GitHub discussions for guidance

  4. Submit a draft PR for early feedback

Remember: Consistency is more important than personal preference. When in doubt, follow the existing patterns in the codebase.