So, you want to stop your OpenShift cluster? There are many reasons why you might. Maybe you have an annual disaster recovery test where you shut down a whole datacenter. Perhaps you want to do some maintenance on your infrastructure, or on the hypervisor or storage that your cluster is hosted on. It's not uncommon to need to do this, so I have collated some of the best practices I have experienced across a multitude of environments, both large and small.
Here is the process that I recommend as a best practice for stopping and starting your OpenShift cluster(s). Following this process will give you the best chance of a trouble-free maintenance window. As with all things, you should exercise care with this process on your important clusters. Try it on an unimportant environment first and see if it is a good fit for you.
Important: This process will cause an outage to any application workload running on the cluster until the cluster is fully started. The cluster itself will be unavailable until manually started. Care should be taken to run this process only on appropriate environments. It is recommended to have backups available of your environment.
Stopping your cluster
Stopping a cluster is a relatively straightforward process. The real difficulty is always related to the workload on top of the cluster. You should also ensure that you have the ability to roll things back if anything goes wrong. This means that you need to be confident in your ability to rebuild a cluster, have a mature, tested backup process in place (you're doing this anyway, right?) and have an understanding of the workload you are running.
Know your state
The first thing I like to do is to ensure that I know what state my cluster is in before I carry out this type of activity. I want to know what namespaces I have (oc projects), what state pods are in (oc get pods --all-namespaces) and what state my nodes are in (oc get nodes --show-labels). Knowing what is working and what isn't is important when it comes to knowing what 'the same as before' looks like after you have started the cluster again.
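To make 'the same as before' something you can actually compare against, it helps to write that state to disk. Here is a minimal sketch; the function name and file names are my own invention, and it assumes oc is on your PATH and logged in as a cluster admin. It uses oc get projects rather than oc projects because the output is easier to keep as a plain listing.

```shell
# Hypothetical helper: save a pre-shutdown snapshot of cluster state.
# Assumes `oc` is on the PATH and already logged in with cluster-admin rights.
snapshot_cluster_state() {
  # Target directory; defaults to a timestamped name in the current directory.
  local dir="${1:-cluster-state-$(date +%Y%m%d-%H%M%S)}"
  mkdir -p "$dir" || return 1
  # Namespaces (projects) that exist right now.
  oc get projects > "$dir/projects.txt" || return 1
  # Every pod, with -o wide so the node each pod runs on is recorded too.
  oc get pods --all-namespaces -o wide > "$dir/pods.txt" || return 1
  # Nodes with their labels, so label drift after restart is easy to spot.
  oc get nodes --show-labels > "$dir/nodes.txt" || return 1
  # Print the directory so callers can find the snapshot.
  echo "$dir"
}
```

Calling snapshot_cluster_state before-shutdown leaves three text files in that directory, which you can diff against a second snapshot taken after the cluster is back up.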
You should resolve any issues in the cluster before proceeding. For example, oc get nodes should show that all nodes are ready. Guidance around more in-depth checks can be found in the documentation.
Tip: If node configs are not set up correctly, node labels will not be applied correctly. Ensure that node configs are properly applied so that node labels are retained across the restart. Refer to the documentation for more information.
Plan for failure
Next, take a backup of your cluster. You are probably doing this on a regular basis, but if you're not, take a look at tooling like Velero (https://velero.io/), which helps you to regularly snapshot your cluster state and persistent storage and ship it to one of a number of storage targets such as S3. If you want to roll your own, you can refer to the documentation, which details what you need to capture as part of an environment backup.
Tip: Powering nodes off as opposed to a graceful shutdown of a running cluster is a bad idea, as applications and services may be left in an inconsistent state. Ensure that you have appropriate backups in place according to the documentation.
If the answer to the question 'can I put it all back if I need to?' isn't yes, then don't continue.
Stop your applications
Now that we know what our cluster looks like and we know that we can restore in the event of any issues, we can stop our application workload. Ensuring that application workload is stopped in a controlled manner helps us to avoid data loss.
Tip: Care should be taken with application workloads to ensure that they are stopped gracefully to prevent accidental data loss. Data consistency is important!
Your first option is to scale down all of your deployment configs so that no application pods are running. You may want to consider idling pods as a way to stop workload dynamically, or crafting Ansible playbooks to restore the correct number of replicas when you start the cluster later. Either way, you will need to know how many replicas you are putting back.
Your second option is to drain the workload from all worker nodes rather than stopping each application individually, in order to ensure application consistency. In this case you would need to:
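If you go with scaling down, the important part is writing the replica counts somewhere before you touch anything. The following is a rough sketch rather than a finished tool: the function name, the record file name and its format are all hypothetical, and skipping the default, kube-* and openshift* namespaces is my own assumption so that platform components are left alone.

```shell
# Hypothetical helper: record each deployment config's replica count, then
# scale application deployment configs to zero.
# Record file format, one line per deployment config: <namespace> <name> <replicas>
scale_down_dcs() {
  local record="${1:-dc-replicas.txt}"
  local ns name replicas
  oc get dc --all-namespaces --no-headers \
    -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,REPLICAS:.spec.replicas \
    > "$record" || return 1
  while read -r ns name replicas; do
    # Assumption: leave platform namespaces (router, registry, ...) untouched.
    case "$ns" in
      default|kube-*|openshift*) continue ;;
    esac
    # Nothing to do for deployment configs that are already at zero.
    [ "$replicas" = "0" ] && continue
    # stdin is redirected so oc cannot swallow the rest of the record file.
    oc scale dc "$name" --replicas=0 -n "$ns" </dev/null
  done < "$record"
}
```

Keep the record file somewhere that survives the shutdown; it is what tells you how many replicas to put back later.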
1a. Cordon all of your worker nodes to prevent new pods from starting or moving: oc adm cordon <node>. Refer to the documentation about scheduling nodes.
1b. Drain all of your worker nodes using something like: oc adm drain <node> --ignore-daemonsets --force --grace-period=30 --delete-local-data. Here --ignore-daemonsets skips pods managed by daemonsets, --force also removes pods that are not managed by a controller, --grace-period=30 gives each pod 30 seconds to terminate gracefully and --delete-local-data allows pods that use local ephemeral data to be removed.
It is recommended that you review the options for draining nodes in the documentation.
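Doing this by hand gets tedious beyond a few nodes, so here is a sketch of the cordon-then-drain sequence as a loop. The function name is made up, and the default label selector assumes your workers carry node-role.kubernetes.io/compute=true, which is the label the default OpenShift 3.11 compute node group applies; pass your own selector if your labels differ.

```shell
# Hypothetical helper: cordon every worker first, then drain them one by one.
# Assumption: workers are labelled node-role.kubernetes.io/compute=true.
drain_workers() {
  local selector="${1:-node-role.kubernetes.io/compute=true}"
  local nodes node
  # Space-separated list of matching node names.
  nodes="$(oc get nodes -l "$selector" -o jsonpath='{.items[*].metadata.name}')" || return 1
  # Cordon everything up front so drained pods cannot land on another worker.
  for node in $nodes; do
    oc adm cordon "$node" || return 1
  done
  # Same drain flags as in step 1b above.
  for node in $nodes; do
    oc adm drain "$node" --ignore-daemonsets --force \
      --grace-period=30 --delete-local-data || return 1
  done
}
```

Cordoning all workers before draining any of them is a deliberate choice: if you cordon and drain node by node, evicted pods simply restart on the workers you have not reached yet.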
Your third option is to gracefully shut down your worker nodes without draining your application workload. This approach works well if you don't have any stateful services like databases, or if you have designed your application to tolerate failure. A graceful shutdown causes your application processes to stop as part of a system shutdown. Eventually, any processes which have not stopped gracefully will be forced to terminate. Pods that were running at the time the cluster stopped will start again on the worker they were last scheduled on. This enables workload to start again in place quickly during a cluster startup, for example after a power loss.
Stop your cluster
Tip: Block storage volumes which have been dynamically provisioned through a cloud provider like AWS EBS or VMware vSphere will remain attached to any nodes where pods were running with persistent storage unless that workload is stopped.
- Gracefully shut down all worker nodes, for example: shutdown -h now. All workers need to be shut down together or cordoned, otherwise OpenShift may attempt to reschedule workload.
- Gracefully shut down all infra nodes, for example: shutdown -h now.
- Gracefully shut down all masters, for example: shutdown -h now.
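The order above (workers, then infra, then masters) is easy to get wrong under pressure, so I find it useful to generate the list of commands up front and review it. This sketch only prints a plan and does not shut anything down. It makes several assumptions that may not hold for you: the default OpenShift 3.11 role labels (compute, infra, master), node names that resolve as SSH hosts, and sudo rights on each node. The function name is hypothetical.

```shell
# Hypothetical helper: print shutdown commands in the order
# workers -> infra -> masters. Nothing is executed; review the output first.
# Assumptions: default 3.11 role labels, and node names usable as SSH hosts.
shutdown_plan() {
  local role node
  for role in compute infra master; do
    for node in $(oc get nodes -l "node-role.kubernetes.io/${role}=true" \
        -o jsonpath='{.items[*].metadata.name}'); do
      echo "ssh $node sudo shutdown -h now"
    done
  done
}
```

Generate the plan while the masters are still up, because oc stops answering once they are down. A node that carries more than one role label would appear more than once, so check the output for duplicates before using it.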
Starting your cluster
Bringing the cluster back up is much simpler than the shutdown procedure. You just have to start nodes in the right order for the best results.
Tip: Additional guidance on checking cluster health can be found in the Day 2 Operations guide.
Start your master nodes
Once they have booted, we can check that they are healthy using oc get nodes. All master nodes should be in a Ready state before continuing on to your infra nodes; nodes that have not been started yet will still show as NotReady at this point.
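Rather than re-running oc get nodes by hand, you can poll until the nodes you have just started report Ready. This is a sketch under a few assumptions: the function name, the timeout and the polling interval are my own, and the selector you pass in (for example node-role.kubernetes.io/master=true, the default 3.11 master label) has to match your own labels.

```shell
# Hypothetical helper: wait until every node matching a label selector is Ready.
# Usage: wait_for_nodes_ready <selector> [timeout_seconds] [interval_seconds]
wait_for_nodes_ready() {
  local selector="$1" timeout="${2:-600}" interval="${3:-10}"
  local waited=0 out not_ready=""
  while :; do
    # The API may not answer yet right after boot, so a failed or empty
    # listing counts as "not ready" rather than as success.
    if out="$(oc get nodes -l "$selector" --no-headers 2>/dev/null)" && [ -n "$out" ]; then
      # STATUS is column 2; "Ready,SchedulingDisabled" still counts as Ready.
      not_ready="$(printf '%s\n' "$out" | awk '$2 !~ /^Ready/ {print $1}')"
      [ -z "$not_ready" ] && return 0
    fi
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out waiting for nodes: ${not_ready:-no answer from the API}" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}
```

The same function works for the infra and worker stages by passing a different selector.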
Start your infra nodes
Once your infra nodes have booted, you can check that they are showing in a ready state, and that oc get pods --all-namespaces shows the logging, metrics, router and registry pods have started and are healthy.
Start your worker nodes
Once your worker nodes have booted, you can ensure that all nodes are showing in a ready state with oc get nodes. Refer to the health check documentation for a more in-depth set of checks.
Start your applications
Now that your cluster has started and you've proven that it is healthy, you can start your application workload. If you chose to simply shut down your worker nodes without draining workload, your applications will restart on the nodes where they were previously running. Otherwise, you will need to increase the number of replicas or uncordon nodes, depending on the approach you took.
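Both of those recovery paths are short enough to sketch. The first function assumes you saved replica counts before the shutdown in a plain text file with one line per deployment config in the form <namespace> <name> <replicas>; that format, the default file name and both function names are hypothetical. The second simply uncordons every node.

```shell
# Hypothetical helper: restore replica counts from a record file whose lines
# look like: <namespace> <name> <replicas>
restore_dc_replicas() {
  local record="${1:-dc-replicas.txt}"
  local ns name replicas
  [ -r "$record" ] || { echo "cannot read $record" >&2; return 1; }
  while read -r ns name replicas; do
    # Skip blank or incomplete lines.
    [ -n "$replicas" ] || continue
    # stdin is redirected so oc cannot swallow the rest of the record file.
    oc scale dc "$name" --replicas="$replicas" -n "$ns" </dev/null
  done < "$record"
}

# Hypothetical helper: make every node schedulable again after cordon/drain.
uncordon_all_nodes() {
  local node
  for node in $(oc get nodes -o jsonpath='{.items[*].metadata.name}'); do
    oc adm uncordon "$node"
  done
}
```

Uncordoning a node that was never cordoned is harmless, which is why the second function does not bother filtering by role.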
Finally, check that your application pods have started correctly with oc get pods --all-namespaces, and perform any checks that may be necessary on your application to prove that it is available and healthy.
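On a busy cluster that listing is long, so a filter that shows only the pods needing attention can save some scrolling. This is a small sketch with a made-up function name; it relies on the default column layout of oc get pods --all-namespaces (NAMESPACE, NAME, READY, STATUS, ...) and treats a pod as suspect if it is neither Completed nor Running with all of its containers ready.

```shell
# Hypothetical helper: list pods that are not healthy yet.
# Relies on the default columns: NAMESPACE NAME READY STATUS RESTARTS AGE
list_unhealthy_pods() {
  oc get pods --all-namespaces --no-headers | awk '
    {
      # READY looks like "1/2": ready containers / total containers.
      split($3, ready, "/")
      if ($4 == "Completed") next
      if ($4 != "Running" || ready[1] != ready[2])
        print $1 "/" $2, $3, $4
    }'
}
```

An empty result is a good sign at the platform level, but it says nothing about whether the application itself is answering requests, so the application-level checks still apply.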
- OpenShift Day 2 Operations guide: https://docs.openshift.com/container-platform/3.11/day_two_guide/environment_health_checks.html
- Red Hat verified solution for shutting down or restarting a cluster: https://access.redhat.com/solutions/3499881
- Documentation about scheduling nodes: https://docs.openshift.com/container-platform/3.11/admin_guide/manage_nodes.html#marking-nodes-as-unschedulable-or-schedulable
- Documentation about draining nodes: https://docs.openshift.com/container-platform/3.11/admin_guide/manage_nodes.html#evacuating-pods-on-nodes
- Documentation about idling pods: https://docs.openshift.com/container-platform/3.11/admin_guide/idling_applications.html
- Documentation regarding cluster backups: https://docs.openshift.com/container-platform/3.11/day_two_guide/environment_backup.html
- Documentation about how to label nodes: https://docs.openshift.com/container-platform/3.11/admin_guide/manage_nodes.html#modifying-nodes
- Documentation covering cluster health checks: https://docs.openshift.com/container-platform/3.11/day_two_guide/environment_health_checks.html
- Velero project website: https://velero.io/