The Hotstack SnapSet feature enables creating consistent snapshots of OpenStack instances (virtual machines) in a running Hotstack deployment. This feature is particularly useful for:
- Development and Testing: Create snapshots of fully deployed OpenShift clusters to quickly restore to a known good state
- CI/CD Pipelines: Reduce deployment time by starting from pre-configured snapshots instead of deploying from scratch
Contents:
- Overview
- What is a SnapSet?
- How SnapSet Works
- Creating a SnapSet
- SnapSet Process Details
- Using SnapSet Images
- SnapSet Data Structure
- Example: Complete SnapSet Workflow
- Limitations
- Related Documentation
A SnapSet is a collection of OpenStack images created from running instances at a specific point in time. Each SnapSet contains:
- Controller Node Image: Snapshot of the Hotstack controller instance
- OpenShift Node Images: Snapshots of OpenShift master/worker nodes
- Metadata: Information about each instance including role, MAC addresses, and unique identifiers
All images in a SnapSet are tagged with a unique identifier and metadata for easy identification and management.
The SnapSet creation process involves several steps:
- Cluster Preparation: OpenShift nodes are cordoned (marked unschedulable) to prevent new workloads
- Graceful Shutdown: All instances are gracefully shut down
- Image Creation: OpenStack images are created from each stopped instance
- Metadata Tagging: Each image is tagged with relevant metadata for identification
- Parallel Processing: Multiple images are created concurrently for efficiency
The process ensures data consistency by stopping all instances before creating snapshots.
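The parallel image creation step can be illustrated with a small sketch. It is not the Hotstack implementation: `create_image` is a hypothetical stand-in for whatever call snapshots one stopped instance, and the thread pool is only one plausible way to run the work concurrently.

```python
from concurrent.futures import ThreadPoolExecutor


def create_image(instance_name: str) -> str:
    """Hypothetical stand-in for the call that snapshots one stopped instance."""
    return f"image-of-{instance_name}"


def create_images_in_parallel(instance_names, max_workers=4):
    """Create one image per instance concurrently; map instance name to image."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with instance_names.
        images = list(pool.map(create_image, instance_names))
    return dict(zip(instance_names, images))


result = create_images_in_parallel(["controller", "master0"])
```

Because every instance is already stopped before this step, the order in which the images finish does not affect consistency.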
OpenShift 4 clusters have a critical requirement related to certificate rotation that affects when snapshots can be safely created. When a cluster is installed, a bootstrap certificate is created for kubelet client certificate requests. This bootstrap certificate expires after 24 hours and cannot be renewed.
If a cluster is shut down before the initial 24-hour certificate rotation completes and the 30-day client certificates are issued, the cluster becomes unusable when restarted because the expired bootstrap certificate cannot authenticate the kubelets.
This is why Hotstack waits 25 hours before creating snapshots: the delay ensures that the certificate rotation has completed, so the cluster can be safely shut down and restarted without manual certificate signing request (CSR) approval or other workarounds.
For more technical details, see the Red Hat blog post on enabling OpenShift 4 clusters to stop and resume.
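As a rough illustration of the timing rule, the sketch below computes how much of the 25-hour wait is left. The function names and the idea of tracking an install timestamp are assumptions made for this example. Elapsed time alone does not prove that rotation succeeded; it only mirrors the wait described above.

```python
from datetime import datetime, timedelta, timezone

BOOTSTRAP_WAIT = timedelta(hours=25)  # wait period described above


def remaining_wait(installed_at: datetime, now: datetime) -> timedelta:
    """Return how much of the 25-hour wait is still outstanding (never negative)."""
    elapsed = now - installed_at
    return max(BOOTSTRAP_WAIT - elapsed, timedelta(0))


def safe_to_snapshot(installed_at: datetime, now: datetime) -> bool:
    """True once the full wait period has elapsed since installation."""
    return remaining_wait(installed_at, now) == timedelta(0)


installed = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(remaining_wait(installed, installed + timedelta(hours=20)))
```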
The simplest way to create a SnapSet is using the provided playbook and scenario:
ansible-playbook -i inventory.yml create-snapset.yml \
-e @scenarios/snapshot-sno/bootstrap_vars.yml \
-e @~/bootstrap_vars_overrides.yml \
-e @~/cloud-secrets.yaml

This playbook performs the complete workflow:
- Infrastructure Setup (01-infra.yml)
- Controller Bootstrap (02-bootstrap_controller.yml)
- OpenShift Installation (03-install_ocp.yml), with snapshot preparation
- SnapSet Creation (04-create-snapset.yml)
You can also run playbooks individually for more control:
# Only create snapset from existing deployment
ansible-playbook -i inventory.yml 04-create-snapset.yml \
-e @scenarios/snapshot-sno/bootstrap_vars.yml \
-e @~/bootstrap_vars_overrides.yml \
-e @~/cloud-secrets.yaml

Key configuration variables for SnapSet creation:
# Enable snapshot preparation during OCP installation
hotstack_prepare_for_snapshot: true

When hotstack_prepare_for_snapshot is enabled:
- Bootstrap Certificate Wait: The system waits 25 hours to allow OpenShift's bootstrap certificate rotation to complete. This ensures that 30-day client certificates are properly issued, eliminating the need for manual certificate signing request (CSR) approval or daemonset workarounds when the cluster is later restored from snapshots.
- Cluster Stabilization: Waits until the cluster has remained stable for the specified period
- Node Cordoning: Marks all nodes as unschedulable
- Graceful Shutdown: Shuts down all OpenShift nodes
The hotstack_snapset Ansible module:
- Validates Input: Ensures all required instance data is provided
- Checks Instance States: Verifies all instances are in SHUTOFF state
- Creates Images: Parallel creation of OpenStack images from instances
- Tags Images: Adds metadata tags to each created image
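The first two steps could look roughly like the sketch below. It is not the module's code: the required fields are taken from the snapset data structure documented in this file, and the function names and the shape of the `states` mapping are assumptions for illustration.

```python
# Fields assumed to be required, based on the documented snapset data structure.
REQUIRED_FIELDS = ("uuid", "role", "mac_address")


def validate_instances(instances: dict) -> list:
    """Return a list of problems; an empty list means the input looks complete."""
    problems = []
    if not instances:
        problems.append("no instances provided")
    for name, data in instances.items():
        for field in REQUIRED_FIELDS:
            if not data.get(field):
                problems.append(f"{name}: missing {field}")
    return problems


def not_shut_off(states: dict) -> list:
    """Return the names of instances whose status is not SHUTOFF."""
    return sorted(name for name, status in states.items() if status != "SHUTOFF")


instances = {
    "controller": {"uuid": "u1", "role": "controller", "mac_address": "fa:16:9e:81:f6:05"},
    "master0": {"uuid": "u2", "role": "ocp_master"},  # mac_address left out on purpose
}
print(validate_instances(instances))
print(not_shut_off({"controller": "SHUTOFF", "master0": "ACTIVE"}))
```

Collecting all problems before failing, rather than stopping at the first one, gives the operator a complete picture in a single run.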
Created images follow the naming convention:
hotstack-{instance_name}-snapshot-{unique_id}
Each image is tagged with:
- hotstack: General Hotstack identifier
- hotstack-snapset: Hotstack SnapSet identifier
- name={name}: Instance name
- role={role}: Instance role (controller, ocp_master, etc.)
- snap_id={unique_id}: Unique snapshot set identifier
- mac_address={mac}: Original MAC address
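The naming convention and tag set can be expressed as a few helper functions. This is a minimal sketch with hypothetical function names, not code from the hotstack_snapset module. It only reproduces the formats listed above.

```python
def image_name(instance_name: str, unique_id: str) -> str:
    """Build an image name following hotstack-{instance_name}-snapshot-{unique_id}."""
    return f"hotstack-{instance_name}-snapshot-{unique_id}"


def image_tags(name: str, role: str, unique_id: str, mac: str) -> list:
    """Build the list of tags described above for one image."""
    return [
        "hotstack",
        "hotstack-snapset",
        f"name={name}",
        f"role={role}",
        f"snap_id={unique_id}",
        f"mac_address={mac}",
    ]


def parse_tags(tags: list) -> dict:
    """Extract key=value tags into a dict, ignoring plain marker tags."""
    parsed = {}
    for tag in tags:
        key, sep, value = tag.partition("=")
        if sep:  # only tags that contain "=" carry a value
            parsed[key] = value
    return parsed


tags = image_tags("master0", "ocp_master", "AbCdEf", "fa:16:9e:81:f6:10")
print(image_name("master0", "AbCdEf"))
print(parse_tags(tags))
```

Parsing the tags back into a dictionary is useful when restoring, since the role and MAC address of each image are needed to rebuild the stack parameters.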
List available snapset images:
openstack image list --tag hotstack-snapset

Find images from a specific snapset:
openstack image list --tag snap_id=AbCdEf

To restore an environment from a SnapSet:
- Update Bootstrap Variables: Modify the stack_parameters in your bootstrap_vars.yml file to use snapset images instead of base images
- Preserve MAC Addresses: Ensure MAC addresses match those in the snapset
- Deploy Stack: Deploy the Heat stack with snapset images
- Revive Cluster: Use the revive functionality to restore OpenShift cluster state
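The MAC address check in the second step can be done before deploying the stack. The sketch below assumes the MAC addresses recorded in the image tags and the ones planned for the new stack have already been collected into two dictionaries keyed by instance name; the function name and that layout are hypothetical.

```python
def mac_mismatches(snapset_macs: dict, planned_macs: dict) -> list:
    """Return instance names whose planned MAC is missing or differs from the snapset."""
    mismatched = []
    for name, mac in snapset_macs.items():
        planned = planned_macs.get(name)
        # MAC addresses are compared case-insensitively.
        if planned is None or planned.lower() != mac.lower():
            mismatched.append(name)
    return sorted(mismatched)


snapset_macs = {"controller": "fa:16:9e:81:f6:05", "master0": "fa:16:9e:81:f6:10"}
planned_macs = {"controller": "FA:16:9E:81:F6:05", "master0": "fa:16:9e:81:f6:11"}
print(mac_mismatches(snapset_macs, planned_macs))
```

An empty result means every instance in the snapset will come back with the MAC address it had when the snapshot was taken.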
When booting from snapset images, use the revive mode:
# In bootstrap_vars.yml
hotstack_revive_snapshot: true

The revive process:
- Initial Stability Check: Waits for basic cluster stability
- Uncordon Nodes: Marks nodes as schedulable again
- Extended Stability: Waits for full cluster stability (multiple rounds)
- Service Restoration: Ensures all services are operational
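One way to read "multiple rounds" of stability is to require several consecutive successful checks. The sketch below shows that interpretation only. How Hotstack actually defines a round is not described here, and the function name and the round count are assumptions.

```python
def stable_for_rounds(check_results, required_rounds: int) -> bool:
    """Return True once `required_rounds` consecutive checks have passed."""
    streak = 0
    for ok in check_results:
        # A failed check resets the streak, so only consecutive passes count.
        streak = streak + 1 if ok else 0
        if streak >= required_rounds:
            return True
    return False


print(stable_for_rounds([True, False, True, True, True], required_rounds=3))
```

Resetting the streak on failure avoids declaring the cluster stable while it is still flapping after boot.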
The snapset data follows this structure:
snapset_data:
  instances:
    controller:
      uuid: "instance-uuid"
      role: "controller"
      mac_address: "fa:16:9e:81:f6:05"
    master0:
      uuid: "instance-uuid"
      role: "ocp_master"
      mac_address: "fa:16:9e:81:f6:10"

# Deploy and create snapset
ansible-playbook -i inventory.yml create-snapset.yml \
-e @scenarios/snapshot-sno/bootstrap_vars.yml \
-e @~/my_overrides.yml \
-e @~/cloud-secrets.yaml

# List created images
openstack image list --tag hotstack-snapset
# Check specific snapset
openstack image list --tag snap_id=AbCdEf

Update your bootstrap_vars.yml file or create an override file to use snapset images:
# In bootstrap_vars.yml
stack_parameters:
  controller_params:
    image: hotstack-controller-snapshot-AbCdEf
    flavor: hotstack.small
  ocp_master_params:
    image: hotstack-master0-snapshot-AbCdEf
    flavor: hotstack.xxlarge

# Deploy using snapset images
ansible-playbook -i inventory.yml bootstrap.yml \
-e @scenarios/sno-bmh-tests/bootstrap_vars.yml \
-e @~/cloud-secrets.yaml \
-e hotstack_revive_snapshot=true

- Certificate Rotation: SnapSets must be used within 30 days due to OpenShift's certificate rotation cycle. After 30 days, the cluster may require additional certificate management procedures.
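A simple age check can flag snapsets that have passed the 30-day window. This sketch uses the image creation timestamp as an approximation, which is an assumption: the certificates are issued shortly before the snapshot is taken, so the real deadline may be slightly earlier. The function name and the ISO 8601 timestamp format are also assumptions.

```python
from datetime import datetime, timedelta, timezone

SNAPSET_MAX_AGE = timedelta(days=30)  # limit described above


def snapset_expired(created_at: str, now: datetime) -> bool:
    """Return True if an image created at `created_at` is older than 30 days.

    `created_at` is an ISO 8601 UTC timestamp such as "2024-01-01T12:00:00Z".
    """
    # Replace the trailing "Z" so fromisoformat accepts it on older Python versions.
    created = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
    return now - created > SNAPSET_MAX_AGE


now = datetime(2024, 2, 15, tzinfo=timezone.utc)
print(snapset_expired("2024-01-01T12:00:00Z", now))
```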
- Hotstack Scenarios
- OpenStack Image Management
- OpenShift Cluster Management
- Enabling OpenShift 4 Clusters to Stop and Resume Cluster VMs (Red Hat blog post explaining the bootstrap certificate issue)