Intermediate Representation (IR) Module#

The Intermediate Representation (IR) module provides a unified, abstract representation of infrastructure-as-code configurations from various source tools (Chef, Puppet, Salt, Bash, PowerShell) that can be converted to target systems (Ansible, Terraform, CloudFormation).

This document describes the module's data structures, version management, and plugin architecture.

Overview#

The IR module consists of three key components:

  1. IR Schema (schema.py): Core data structures for representing infrastructure
  2. IR Versioning (versioning.py): Version management and schema evolution
  3. Plugin Architecture (plugin.py): Extensible framework for parsers and generators

IR Schema#

Core Concepts#

The IR schema defines a unified representation of infrastructure configurations using a directed acyclic graph (DAG) of nodes, where each node represents a configurable entity or operation.

Data Structures#

IRNodeType#

Enumeration of supported node types used to categorise different infrastructure constructs:

  • RECIPE: Configuration management recipe or routine
  • RESOURCE: Resource definition (package, service, file, etc.)
  • ATTRIBUTE: Configuration attribute or parameter
  • VARIABLE: Variable or constant definition
  • GUARD: Conditional predicate or guard clause
  • HANDLER: Event handler or notification target
  • ACTION: Discrete operation or state change
  • POLICY: Security or compliance policy
  • TEMPLATE: Template or configuration file
  • FILE: File resource
  • PACKAGE: Package resource
  • SERVICE: Service resource
  • USER: User account resource
  • GROUP: User group resource
  • CUSTOM: Custom or user-defined resource type

SourceType & TargetType#

Enumerations defining supported source and target tools:

SourceType:

  • CHEF: Chef configuration management
  • PUPPET: Puppet configuration management
  • SALT: SaltStack configuration management
  • BASH: Bash shell scripts
  • POWERSHELL: PowerShell scripts
  • ANSIBLE: Ansible playbooks (can be both source and target)

TargetType:

  • ANSIBLE: Ansible playbooks
  • TERRAFORM: Terraform infrastructure code
  • CLOUDFORMATION: AWS CloudFormation templates

IRAttribute#

Represents an attribute or parameter with type information:

class IRAttribute:
    name: str                                      # Attribute name
    value: str | int | bool | dict | list | None  # Attribute value
    type_hint: str = "any"                         # Type hint for validation
    required: bool = False                         # Whether attribute is required
    default_value: ... = None                      # Default value if not specified
    description: str = ""                          # Human-readable description

IRGuard#

Represents a conditional guard or predicate:

class IRGuard:
    condition: str                  # Condition expression
    type: str = "boolean"          # Condition type (boolean, shell, ruby, etc.)
    negated: bool = False          # Whether condition is negated
    metadata: IRMetadata           # Source tracking metadata

IRAction#

Represents a discrete operation or state change:

class IRAction:
    name: str                              # Action name
    type: str                              # Action type
    attributes: dict[str, IRAttribute]   # Action parameters
    guards: list[IRGuard]                # Conditional guards
    requires: list[str]                  # Node IDs this action depends on
    notifies: list[str]                  # Node IDs to notify after execution
    metadata: IRMetadata                 # Source tracking metadata

IRNode#

Core IR node representing a configurable entity:

class IRNode:
    node_id: str                          # Unique node identifier
    node_type: IRNodeType                # Type of node
    name: str                            # Human-readable name
    source_type: SourceType              # Original source tool
    actions: list[IRAction]              # Operations/actions
    attributes: dict[str, IRAttribute]   # Node parameters
    variables: dict[str, Any]            # Variables/values
    dependencies: list[str]              # Node IDs this depends on
    parent_id: str | None                # Optional parent node
    metadata: IRMetadata                 # Source tracking
    tags: dict[str, str]                 # Arbitrary tags/labels

IRGraph#

Directed acyclic graph of nodes representing complete infrastructure:

class IRGraph:
    graph_id: str                        # Graph identifier
    source_type: SourceType              # Original source tool
    target_type: TargetType              # Target conversion tool
    nodes: dict[str, IRNode]            # Nodes by ID
    metadata: dict[str, Any]            # Graph-level metadata
    created_at: str                      # Creation timestamp
    version: str                         # IR version string

Key Operations#

Dependency Validation#

Validate that all node dependencies reference existing nodes:

graph = IRGraph(...)
unresolved = graph.validate_dependencies()
# Returns: dict[str, list[str]] mapping node IDs to unresolved dependencies
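
The check can be pictured with a minimal sketch that reduces the graph to a plain mapping from node ID to dependency IDs. The function name `find_unresolved` is hypothetical and not part of the IR module; the real `validate_dependencies()` may differ in detail.

```python
def find_unresolved(dependencies_by_node: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each node ID to the dependencies that do not exist as nodes."""
    known = set(dependencies_by_node)
    unresolved: dict[str, list[str]] = {}
    for node_id, deps in dependencies_by_node.items():
        missing = [dep for dep in deps if dep not in known]
        if missing:
            # Only nodes with at least one missing dependency are reported
            unresolved[node_id] = missing
    return unresolved


deps = {
    "install-nginx": [],
    "start-nginx": ["install-nginx", "configure-nginx"],
}
print(find_unresolved(deps))  # {'start-nginx': ['configure-nginx']}
```

An empty result means every dependency resolves, which is the condition the example workflow asserts before sorting.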

Topological Sorting#

Return node IDs ordered so that every node appears after the nodes it depends on:

topological_order = graph.get_topological_order()
# Returns: list[str] of node IDs in execution order
# Raises: ValueError if circular dependencies detected
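
One common way to produce such an ordering is Kahn's algorithm. The sketch below is illustrative only: it works on a plain dependency mapping, assumes every dependency refers to an existing node, and the name `topological_order` is hypothetical. The source does not state which algorithm `get_topological_order()` uses.

```python
from collections import deque


def topological_order(dependencies_by_node: dict[str, list[str]]) -> list[str]:
    """Return node IDs so that each node follows all of its dependencies."""
    # Unmet dependencies per node (sets remove duplicate edges)
    pending = {node: set(deps) for node, deps in dependencies_by_node.items()}

    # Reverse edges: which nodes are waiting on a given node
    dependents: dict[str, list[str]] = {node: [] for node in pending}
    for node, deps in pending.items():
        for dep in deps:
            dependents[dep].append(node)

    ready = deque(node for node, deps in pending.items() if not deps)
    order: list[str] = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for dependent in dependents[node]:
            pending[dependent].discard(node)
            if not pending[dependent]:
                ready.append(dependent)

    # Nodes left unprocessed can only be part of (or behind) a cycle
    if len(order) != len(pending):
        raise ValueError("Circular dependency detected")
    return order


print(topological_order({"start-nginx": ["install-nginx"], "install-nginx": []}))
# ['install-nginx', 'start-nginx']
```

The cycle check falls out of the algorithm: a node in a cycle never reaches zero unmet dependencies, so it is never emitted.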

Serialisation#

Convert IR structures to JSON-compatible dictionaries:

node_data = node.to_dict()
action_data = action.to_dict()
graph_data = graph.to_dict()

# Save to file
import json
with open('graph.json', 'w') as f:
    json.dump(graph_data, f, indent=2)
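
To show what "JSON-compatible" involves, here is a minimal sketch using a stand-in dataclass. `DemoNode` and `DemoNodeType` are hypothetical and far smaller than the real `IRNode`; the point is only that enum members must be turned into plain values before `json.dump` accepts the result.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum


class DemoNodeType(Enum):
    PACKAGE = "package"
    SERVICE = "service"


@dataclass
class DemoNode:
    node_id: str
    node_type: DemoNodeType
    dependencies: list[str] = field(default_factory=list)

    def to_dict(self) -> dict:
        data = asdict(self)
        # Enum members are not JSON-serialisable, so store the plain value
        data["node_type"] = self.node_type.value
        return data


node = DemoNode("start-nginx", DemoNodeType.SERVICE, ["install-nginx"])
encoded = json.dumps(node.to_dict(), indent=2)
decoded = json.loads(encoded)
print(decoded["node_type"])  # service
```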

IR Versioning#

The versioning module manages IR schema evolution and compatibility checking.

IRVersion#

Represents a semantic version with comparison operators:

class IRVersion:
    major: int      # Major version (breaking changes)
    minor: int      # Minor version (backward compatible features)
    patch: int      # Patch version (bug fixes)

Version Compatibility:

  • Versions are compatible if their major versions match
  • Minor and patch versions can differ within the same major version
  • Example: 1.0.0 is compatible with 1.5.3, but not with 2.0.0

Version Operations:

# Parsing
v = IRVersion.parse("1.2.3")

# String representation
print(v)  # "1.2.3"

# Comparison
v1 < v2
v1 <= v2
v1 == v2
v1 != v2
v1 >= v2
v1 > v2

# Compatibility checking
v1.is_compatible_with(v2)  # True if major versions match
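
The behaviour listed above can be approximated in a few lines. This is a sketch under assumptions, not the module's implementation: `DemoVersion` is a hypothetical stand-in, and the error handling in `parse` is a guess at reasonable behaviour.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class DemoVersion:
    # order=True compares (major, minor, patch) in field order
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "DemoVersion":
        parts = text.strip().split(".")
        if len(parts) != 3:
            raise ValueError(f"Expected MAJOR.MINOR.PATCH, got {text!r}")
        major, minor, patch = (int(part) for part in parts)
        return cls(major, minor, patch)

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"

    def is_compatible_with(self, other: "DemoVersion") -> bool:
        # Compatibility depends on the major version only
        return self.major == other.major


v = DemoVersion.parse("1.2.3")
print(v)  # 1.2.3
print(DemoVersion(1, 0, 0).is_compatible_with(DemoVersion(1, 5, 3)))  # True
print(DemoVersion(1, 0, 0).is_compatible_with(DemoVersion(2, 0, 0)))  # False
```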

SchemaMigration#

Defines a transformation path between two IR schema versions:

class SchemaMigration:
    from_version: IRVersion                    # Starting version
    to_version: IRVersion                      # Target version
    transformation: Callable[[dict], dict]    # Transformation function
    description: str                           # Migration description

IRVersionManager#

Manages version compatibility and schema migrations:

manager = get_version_manager()

# Register migrations
manager.register_migration(SchemaMigration(...))

# Find migration path
migrations = manager.get_migrations_path(IRVersion(1,0,0), IRVersion(1,5,0))

# Migrate data
migrated = manager.migrate_data(data, from_version, to_version)

# Check compatibility
is_compatible = manager.is_version_compatible(version)

# Get version information
info = manager.get_version_info()
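
How a migration path might be resolved and applied is sketched below. Everything here is an assumption made for illustration: versions are plain tuples, migrations form a single linear chain, and `DemoMigration`, `find_path`, `migrate` and the two transformations are hypothetical names. The real `IRVersionManager` may resolve paths differently.

```python
from dataclasses import dataclass
from typing import Callable

VersionTuple = tuple[int, int, int]


@dataclass
class DemoMigration:
    from_version: VersionTuple
    to_version: VersionTuple
    transformation: Callable[[dict], dict]
    description: str = ""


def find_path(migrations: list[DemoMigration], start: VersionTuple,
              target: VersionTuple) -> list[DemoMigration]:
    """Chain migrations from start to target, one step at a time."""
    by_start = {m.from_version: m for m in migrations}
    path: list[DemoMigration] = []
    current = start
    while current != target:
        step = by_start.get(current)
        # Each step must move forward without overshooting the target
        if step is None or not (current < step.to_version <= target):
            raise ValueError(f"No migration path from {start} to {target}")
        path.append(step)
        current = step.to_version
    return path


def migrate(data: dict, path: list[DemoMigration]) -> dict:
    for step in path:
        data = step.transformation(dict(data))  # copy so input is not mutated
    return data


def rename_id(data: dict) -> dict:
    data["graph_id"] = data.pop("id")
    return data


migrations = [
    DemoMigration((1, 0, 0), (1, 1, 0), lambda d: {**d, "tags": {}}, "add tags"),
    DemoMigration((1, 1, 0), (1, 2, 0), rename_id, "rename id to graph_id"),
]
path = find_path(migrations, (1, 0, 0), (1, 2, 0))
print(migrate({"id": "g1"}, path))
```

Applying each step to a copy keeps the caller's original data intact, which makes a failed migration easier to retry.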

Plugin Architecture#

The plugin architecture provides extensibility for adding support for new source tools and target platforms.

SourceParser#

Abstract base class for source tool parsers:

class SourceParser(ABC):
    @property
    @abstractmethod
    def source_type(self) -> SourceType:
        """Return the source tool type."""

    @property
    @abstractmethod
    def supported_versions(self) -> list[str]:
        """Return supported versions of the source tool."""

    @abstractmethod
    def parse(self, source_path: str, **options) -> IRGraph:
        """Parse source configuration into IR."""

    @abstractmethod
    def validate(self, source_path: str) -> dict:
        """Validate source configuration without parsing."""

    def get_metadata(self) -> dict[str, str]:
        """Get parser metadata."""

TargetGenerator#

Abstract base class for target platform generators:

class TargetGenerator(ABC):
    @property
    @abstractmethod
    def target_type(self) -> TargetType:
        """Return the target system type."""

    @property
    @abstractmethod
    def supported_versions(self) -> list[str]:
        """Return supported versions of the target system."""

    @abstractmethod
    def generate(self, graph: IRGraph, output_path: str, **options) -> None:
        """Generate target configuration from IR."""

    @abstractmethod
    def validate_ir(self, graph: IRGraph) -> dict:
        """Validate IR for compatibility with this target."""

    def get_metadata(self) -> dict[str, str]:
        """Get generator metadata."""

PluginRegistry#

Central registry for managing plugins:

registry = get_plugin_registry()

# Register plugins
registry.register_parser(SourceType.CHEF, ChefParser)
registry.register_generator(TargetType.ANSIBLE, AnsibleGenerator)

# Retrieve plugins
parser = registry.get_parser(SourceType.CHEF)
generator = registry.get_generator(TargetType.ANSIBLE)

# List available plugins
parsers = registry.get_available_parsers()
generators = registry.get_available_generators()

# Get registry information
info = registry.get_registry_info()
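
Conceptually, a registry of this kind is a mapping from tool type to plugin class. The sketch below is a simplified stand-in covering parsers only. `DemoRegistry`, `DemoSourceType` and `DemoChefParser` are hypothetical, and it is an assumption that lookups return a fresh instance and raise `KeyError` for unregistered types; the source does not specify either behaviour.

```python
from enum import Enum


class DemoSourceType(Enum):
    CHEF = "chef"
    PUPPET = "puppet"


class DemoChefParser:
    """Placeholder parser class used only to populate the registry."""


class DemoRegistry:
    def __init__(self) -> None:
        self._parsers: dict[DemoSourceType, type] = {}

    def register_parser(self, source_type: DemoSourceType, parser_cls: type) -> None:
        # Classes are stored, not instances, so each lookup can build a new parser
        self._parsers[source_type] = parser_cls

    def get_parser(self, source_type: DemoSourceType):
        if source_type not in self._parsers:
            raise KeyError(f"No parser registered for {source_type.name}")
        return self._parsers[source_type]()

    def get_available_parsers(self) -> list[DemoSourceType]:
        return list(self._parsers)


registry = DemoRegistry()
registry.register_parser(DemoSourceType.CHEF, DemoChefParser)
parser = registry.get_parser(DemoSourceType.CHEF)
print(type(parser).__name__)  # DemoChefParser
```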

Implementing Custom Parsers#

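To add support for a new source tool, subclass SourceParser and register the class. The example assumes a CUSTOM_TOOL member has been added to SourceType; it is not among the values listed above: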

from souschef.ir import (
    IRGraph,
    SourceParser,
    SourceType,
    TargetType,
    get_plugin_registry,
)

class CustomToolParser(SourceParser):
    @property
    def source_type(self) -> SourceType:
        return SourceType.CUSTOM_TOOL

    @property
    def supported_versions(self) -> list[str]:
        return ["1.0", "2.0", "3.0"]

    def parse(self, source_path: str, **options) -> IRGraph:
        # Parse source configuration and build IR graph
        graph = IRGraph(
            graph_id="custom-tool-parse",
            source_type=self.source_type,
            target_type=TargetType.ANSIBLE,
            version="1.0.0",
        )

        # Add nodes to graph
        # ... parsing logic ...

        return graph

    def validate(self, source_path: str) -> dict:
        # Validate source configuration
        return {
            "valid": True,
            "errors": [],
            "warnings": [],
        }

# Register the parser
registry = get_plugin_registry()
registry.register_parser(SourceType.CUSTOM_TOOL, CustomToolParser)

Implementing Custom Generators#

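To add support for a new target platform, subclass TargetGenerator and register the class. The example assumes a CUSTOM_TARGET member has been added to TargetType; it is not among the values listed above: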

from souschef.ir import IRGraph, TargetGenerator, TargetType, get_plugin_registry

class CustomTargetGenerator(TargetGenerator):
    @property
    def target_type(self) -> TargetType:
        return TargetType.CUSTOM_TARGET

    @property
    def supported_versions(self) -> list[str]:
        return ["1.0", "2.0"]

    def generate(self, graph: IRGraph, output_path: str, **options) -> None:
        # Generate target configuration from IR graph
        # ... generation logic ...
        pass

    def validate_ir(self, graph: IRGraph) -> dict:
        # Validate IR for compatibility
        return {
            "compatible": True,
            "issues": [],
            "warnings": [],
        }

# Register the generator
registry = get_plugin_registry()
registry.register_generator(TargetType.CUSTOM_TARGET, CustomTargetGenerator)

Best Practices#

IR Graph Design#

  1. Use unique node IDs: Ensure all nodes have globally unique identifiers within a graph
  2. Keep dependencies acyclic: get_topological_order() raises ValueError when it detects a circular dependency
  3. Set metadata: Track source file and line number for better error reporting
  4. Use appropriate node types: Choose node types that accurately reflect the infrastructure construct

Dependency Management#

  1. Explicit dependencies: Always explicitly declare dependencies between nodes
  2. Avoid implicit ordering: Don't rely on iteration order; use topological sorting for correct order
  3. Test circular detection: Verify circular dependency detection works for your use cases

Version Compatibility#

  1. Semantic versioning: Follow semantic versioning conventions for IR schema versions
  2. Test migrations: Ensure schema migrations are tested thoroughly before deployment
  3. Document changes: Document all schema changes and migration paths

Example Workflow#

from souschef.ir import (
    IRGraph,
    IRNode,
    IRNodeType,
    IRAction,
    IRAttribute,
    SourceType,
    TargetType,
    get_plugin_registry,
)

# Create IR graph
graph = IRGraph(
    graph_id="example-001",
    source_type=SourceType.CHEF,
    target_type=TargetType.ANSIBLE,
    version="1.0.0",
)

# Add nodes
node1 = IRNode(
    node_id="install-nginx",
    node_type=IRNodeType.PACKAGE,
    name="Install nginx",
    source_type=SourceType.CHEF,
)
graph.add_node(node1)

node2 = IRNode(
    node_id="start-nginx",
    node_type=IRNodeType.SERVICE,
    name="Start nginx service",
    source_type=SourceType.CHEF,
)
node2.add_dependency("install-nginx")
graph.add_node(node2)

# Validate dependencies
unresolved = graph.validate_dependencies()
assert len(unresolved) == 0, "Unresolved dependencies found"

# Get topological order
order = graph.get_topological_order()
print(f"Execution order: {order}")
# Output: Execution order: ['install-nginx', 'start-nginx']

# Use plugin registry to convert
registry = get_plugin_registry()
parser = registry.get_parser(SourceType.CHEF)
generator = registry.get_generator(TargetType.ANSIBLE)

# Generate output
generator.generate(graph, output_path="./playbook.yml")

Testing#

The IR module is covered by three kinds of tests:

  • Unit tests: Test IR schema, versioning, and plugin architecture
  • Integration tests: Test real-world scenarios with complex dependencies
  • Property-based tests: Fuzz testing with Hypothesis to ensure robustness

Run tests:

# Unit and property-based tests
poetry run pytest tests/unit/test_ir_schema.py
poetry run pytest tests/unit/test_ir_property_based.py

# Integration tests
poetry run pytest tests/integration/test_ir_integration.py

# All tests with coverage
poetry run pytest tests/ --cov=souschef.ir