Detailed Migration Planning Guide#
This guide provides comprehensive questions and considerations for planning a complex Chef-to-Ansible migration. Use this alongside the GitHub issue templates for larger or more complex migration projects.
Table of Contents#
- Chef Environment Assessment
- Target Ansible Architecture
- Technical Migration Challenges
- Organisational & Change Management
- SousChef Integration Strategy
- Success Metrics
Chef Environment Assessment#
Current Deployment Model#
- Chef Server / Chef Zero / Chef Solo?
- Number of managed nodes
- Operating systems in scope
- Multi-datacenter or single environment?
- Geographic distribution of infrastructure
Cookbook Inventory#
Questions to answer:
- How many cookbooks total?
- How many are community cookbooks?
- How many are custom/organisational cookbooks?
-
What's the dependency graph complexity?
-
What types of cookbooks exist?
- Infrastructure cookbooks (networking, monitoring, logging)
- Application cookbooks (web apps, databases, caching)
- Wrapper/orchestration cookbooks
- Policy/enforcement cookbooks
-
Testing/compliance cookbooks (InSpec)
-
Cookbook complexity indicators:
- Average lines of code per recipe?
- Number of custom resources?
- Heavy use of Chef search?
- Data bag dependencies?
- Attribute precedence complexity?
Chef Patterns and Features Used#
Data Management: - Data bags (encrypted or plain?) - Attributes (node, default, override precedence?) - Environments - Roles (how are they structured?) - Berkshelf / Policyfiles for dependency management?
Resource Patterns: - Basic resources (service, package, file, template) - Custom resources / Light-weight Resource Providers (LWRPs) - Ruby blocks for custom logic - Notifications and subscriptions patterns - Guards and conditional execution - Run-time generated resources
Advanced Features: - Chef search (simple lookups vs. complex queries?) - Custom Ohai plugins - Chef handlers for post-run actions - Compliance as code (InSpec integration) - Chef provisioning / machine resources - Test Kitchen integration
Application Patterns: - Deployment resources (deploy, git checkouts) - Source control integrations - File rendering from templates - Application-specific resource types - Multi-tier application orchestration
Target Ansible Architecture#
Deployment Model Decision#
Choose your Ansible deployment model:
Option 1: Ansible Core - Best for: Simple, CLI-driven deployments - Tools: Ansible, ansible-playbook, ad-hoc commands - Complexity: Lower operational overhead - Learning curve: Moderate
Option 2: AWX (Open Source) - Best for: Team collaboration, UI-based management - Tools: AWX UI, REST API, RBAC - Complexity: Moderate operational overhead - Learning curve: Higher, but more discoverable
Option 3: Ansible Automation Platform (Red Hat) - Best for: Enterprise environments, support requirements - Tools: AAP UI, REST API, advanced features - Complexity: Significant operational overhead - Learning curve: Highest, but with support available
Inventory Strategy#
- Static inventory: YAML files, ini files, static configuration
- Dynamic inventory: Cloud providers, CMDB, IPAM
- Hybrid inventory: Static + dynamic sources
- Inventory organisation: Geographic, functional, by app tier?
- Variable scoping: How will you manage group_vars, host_vars?
Variable and Secret Management#
- Variable sources:
- group_vars / host_vars
- Playbook/role defaults
- External variable files
-
AWX/AAP credential stores
-
Secret management:
- Ansible Vault for simple encryption
- External vault systems (HashiCorp Vault)
- Cloud provider secret managers
-
AWX/AAP credential handling
-
Precedence strategy: How will you handle variable precedence conflicts?
Playbook Organisation#
Choose a structure:
Option 1: Playbook per cookbook
playbooks/
web-app.yml # Replaces web-app cookbook
database.yml # Replaces database cookbook
monitoring.yml # Replaces monitoring cookbook
Option 2: Role-based
Option 3: Collection-based (Professional)
collections/
company.infrastructure/
roles/
plugins/
modules/
company.applications/
roles/
playbooks/
Decision factors: - Team expertise level - Project complexity - Code reusability requirements - Maintenance burden
Technical Migration Challenges#
Custom Resources and Providers#
Assessment: - How many custom resources exist? - How complex is their logic? - Are they well-documented?
Migration paths: 1. Direct mapping - If Ansible module exists 2. Custom module - Develop Ansible equivalent 3. Shell/script tasks - Temporary solution 4. Ansible plugin - For complex logic
Example mapping exercise:
# Chef custom resource
custom_resource 'app_deploy' do
property :version, String, required: true
property :config_path, String
action :deploy
end
Possible Ansible equivalents:
# Option 1: Use existing Ansible module
- name: Deploy application
community.app_modules.deploy:
version: "{{ app_version }}"
config: "{{ app_config_path }}"
# Option 2: Create custom Ansible module
- name: Deploy application
my_company.infrastructure.app_deploy:
version: "{{ app_version }}"
config: "{{ app_config_path }}"
# Option 3: Use shell commands (temporary)
- name: Deploy application
shell: |
/opt/app/deploy.sh --version {{ app_version }}
Data Management Complexity#
Chef data bags → Ansible equivalents:
Ansible approaches:
# Option 1: Ansible Vault
# roles/app/defaults/encrypted.yml (encrypted with ansible-vault)
db_password: !vault-encrypted
# Option 2: External secret store
- name: Get secrets from vault
hashicorp.vault.vault_write:
path: secret/data/database
# Option 3: AWX credential store
- name: Use AAP credentials
vars:
db_password: "{{ vault_db_password }}"
Critical questions: - What secrets are currently encrypted? - Where will they stored post-migration? - How will rotation be handled? - What audit/compliance requirements exist?
Chef Search Replacement#
Chef search pattern:
# Find all database nodes
db_nodes = search(:node, 'role:database_master')
db_nodes.each do |node|
# Configure replication
end
Ansible equivalents:
# Option 1: Ansible inventory groups
- name: Configure replication on master
hosts: database_master
tasks:
- name: Setup replication
# ...
# Option 2: Dynamic inventory groups
- name: Get database masters from CMDB
api_query:
url: "{{ cmdb_api }}/nodes?role=database_master"
register: db_masters
# Option 3: Host facts based query
- hosts: all
gather_facts: yes
tasks:
- name: Configure replication on masters
hosts: "{{ groups['all'] | selectattr('ansible_facts.role', 'equalto', 'database_master') }}"
Testing and Validation Strategy#
Chef → Ansible testing migration:
| Chef | Ansible |
|---|---|
| Test Kitchen | Molecule |
| ChefSpec (unit) | Pytest, Testinfra |
| InSpec (compliance) | Ansible Lint, Custom playbooks |
| Foodcritic (style) | Ansible Lint, yamllint |
Migration path: 1. Extract current InSpec tests 2. Convert to Ansible-native testing 3. Integrate into CI/CD 4. Run alongside Chef during transition (parallel testing)
Organisational & Change Management#
Stakeholder Alignment#
Questions to address with stakeholders:
- Business motivation
- Why migrate from Chef to Ansible?
- What are the expected benefits?
-
What are the risks and constraints?
-
Timeline and resources
- How long can migration take?
- How many people are available?
-
What's the critical path?
-
Operational impact
- Can services be interrupted?
- What's the acceptable downtime?
- Who approves changes?
Team Training and Enablement#
Knowledge gap assessment: - How many people know Chef? - How many know Ansible? - What's the learning curve tolerance?
Training path: 1. Ansible basics - Architecture, modules, roles, playbooks 2. Ansible best practices - Code organisation, variable scoping 3. Your specific setup - Custom modules, integrations, workflows 4. Hands-on labs - Practice converting cookbooks 5. Parallel operations - Run Chef and Ansible together during transition
Rollback Strategy#
Plan for reversions: - How do you roll back if Ansible deployment fails? - Can you run Chef and Ansible in parallel? - What's the maximum rollback time? - How will you validate rollback success?
Common approaches: 1. Blue-green deployment - Run both systems, switch traffic 2. Canary deployment - Roll out to subset, monitor, expand 3. Parallel runs - Chef and Ansible together (eventually remove Chef) 4. Feature flags - Gradually enable Ansible-managed services
SousChef Integration Strategy#
Using SousChef for Analysis#
Discovery phase:
# Analyse cookbook structure
souschef-cli structure /path/to/cookbook
# Parse individual recipes
souschef-cli recipe recipes/default.rb
# Extract dependencies
souschef-cli dependencies cookbook/
Output usage: - Understand cookbook complexity - Identify custom resources - Find Chef search patterns - Extract attribute dependencies
Conversion Workflow#
1. Assessment - Use SousChef to analyse current cookbooks - Identify patterns that need custom handling - Plan resource mappings - Document exceptions
2. Initial conversion - Use SousChef tools to generate skeleton playbooks - Review generated code for accuracy - Identify gaps and custom requirements - Iteratively refine
3. Validation - Compare converted playbooks against original - Test in non-production environments - Validate idempotency - Performance testing
4. Refinement - Optimise generated code - Add error handling - Improve structure and readability - Documentation updates
Handling SousChef Limitations#
What SousChef handles well: - Basic Chef resources - Template conversion - Metadata extraction - Recipe structure analysis - Chef Server API integration for dynamic inventory - Local model support (Ollama, llama.cpp, vLLM, LM Studio)
What needs manual work: - Complex custom resource logic - Ad-hoc Ruby blocks - Complex attribute merging - Encrypted data bags - Chef Server authentication (requires environment configuration)
Strategy:
- Use SousChef for initial structure
- Configure environment variables for Chef Server access (CHEF_SERVER_URL, CHEF_ORG, CHEF_CLIENT_NAME, CHEF_CLIENT_KEY_PATH)
- Manually refine complex areas
- Document custom mappings
- Share learnings with team
Success Metrics#
Technical Success Criteria#
Functional equivalence: - [ ] All resources are converted or mapped - [ ] All services start and remain running - [ ] Configuration is applied correctly - [ ] Templating produces same output - [ ] Service health checks pass
Operational parity: - [ ] Performance is equivalent or better - [ ] Security posture maintained - [ ] Compliance requirements met - [ ] Backup/restore procedures work - [ ] Monitoring/alerting intact
Quality metrics: - [ ] Playbook code quality (ansible-lint) - [ ] YAML formatting standards - [ ] Variable naming consistency - [ ] Documentation completeness
Timeline Milestones#
Phase 1: Planning & Preparation (Weeks 1-2) - [ ] Assessment complete - [ ] Architecture designed - [ ] Team trained - [ ] Pilot cookbooks selected - [ ] Success criteria defined
Phase 2: Pilot Migration (Weeks 3-6) - [ ] Convert 2-3 simple cookbooks - [ ] Validate conversions - [ ] Refine processes - [ ] Document learnings - [ ] Get feedback
Phase 3: Scale (Weeks 7-12) - [ ] Convert remaining cookbooks - [ ] Integrate into CI/CD - [ ] Performance testing - [ ] Security validation - [ ] Preparation for rollout
Phase 4: Parallel Running (Weeks 13-16) - [ ] Deploy Ansible alongside Chef - [ ] Run both management systems - [ ] Monitor for discrepancies - [ ] Validate idempotency - [ ] Train operations team
Phase 5: Cutover & Cleanup (Weeks 17-20) - [ ] Switch to Ansible-only - [ ] Monitor for issues - [ ] Decommission Chef - [ ] Archive migration artifacts - [ ] Document final state
Risk Assessment#
High-risk areas for your migration: - Custom resources complexity? - Data bag encryption handling? - Critical business services dependency? - Team Ansible expertise level? - Rollback procedure complexity?
Mitigation strategies: - Extra testing for high-risk areas - Dedicated team for complex conversions - Extended parallel running period - Conservative rollout schedule - Strong monitoring and alerting
Need help? Create a migration support request at: https://github.com/kpeacocke/souschef/issues/new