
Restore Etcd of a Cluster Managed by Cluster Templates

This guide shows you how to restore a cluster's etcd from an earlier snapshot, which is useful when you need to revert a cluster to a previous state.

This guide has the following requirements:

  • The CLI tool omnictl must be installed and configured.

  • The cluster you want to restore must still exist in Omni (it must not have been deleted) and must have past backups available.

  • The cluster must be managed using cluster templates (not via the UI).
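
As a quick check that omnictl is installed and configured, list the clusters in your Omni instance; the cluster you plan to restore should appear in the output:

omnictl get clusters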

Finding the Cluster's UUID

To find the cluster's UUID, run the following command, replacing my-cluster with the name of your cluster:

omnictl get clusteruuid my-cluster

The output will look like this:

NAMESPACE   TYPE          ID              VERSION   UUID
default     ClusterUUID   my-cluster      1         bb874758-ee54-4d3b-bac3-4c8349737298

Note the UUID column, which contains the cluster's UUID.
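
To capture the UUID in a shell variable for the later steps, a minimal sketch (assuming your omnictl version supports the jsonpath output format, as talosctl does, and that the value is exposed under spec.uuid):

CLUSTER_UUID=$(omnictl get clusteruuid my-cluster -o jsonpath='{.spec.uuid}')
echo "${CLUSTER_UUID}"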

Finding the Snapshot to Restore

List the available snapshots for the cluster:

omnictl get etcdbackup -l omni.sidero.dev/cluster=my-cluster

The output will look like this:

NAMESPACE   TYPE         ID                      VERSION     CREATED AT                         SNAPSHOT
external    EtcdBackup   my-cluster-1701184522   undefined   {"nanos":0,"seconds":1701184522}   FFFFFFFF9A99FBF6.snapshot
external    EtcdBackup   my-cluster-1701184515   undefined   {"nanos":0,"seconds":1701184515}   FFFFFFFF9A99FBFD.snapshot
external    EtcdBackup   my-cluster-1701184500   undefined   {"nanos":0,"seconds":1701184500}   FFFFFFFF9A99FC0C.snapshot

The SNAPSHOT column contains the snapshot name which you will need to restore the cluster. Let's assume you want to restore the cluster to the snapshot FFFFFFFF9A99FBFD.snapshot.
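
The CREATED AT column prints raw Unix timestamps. To check when a snapshot was taken in human-readable form, convert the seconds value, for example with GNU date:

TZ=UTC date -d @1701184515
# e.g. Tue Nov 28 15:15:15 UTC 2023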

Deleting the Existing Control Plane

To restore the cluster, you first need to delete its existing control plane. This takes the cluster into a non-bootstrapped state; only then can a new control plane be created with the restored etcd.

Use the following command to delete the control plane, replacing my-cluster with the name of your cluster:

omnictl delete machineset my-cluster-control-planes
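
Before proceeding, you can confirm that the control plane machine set is gone by listing the machine sets still associated with the cluster; only the workers machine set should remain. This sketch assumes machine sets carry the same omni.sidero.dev/cluster label used elsewhere in this guide:

omnictl get machineset -l omni.sidero.dev/cluster=my-cluster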

Creating the Restore Template

Open your cluster template manifest template-manifest.yaml, adjust the list of control plane machines as needed, and add a bootstrapSpec section to the ControlPlane document, containing the cluster UUID and the snapshot name found above:

kind: Cluster
name: my-cluster
kubernetes:
  version: v1.28.2
talos:
  version: v1.5.5
---
kind: ControlPlane
machines:
  - 430d882a-51a8-48b3-ae00-90c5b0b5b0b0
  - e865efbc-25a1-4436-bcd9-0a431554e328
  - 820c2b44-568c-461e-91aa-c2fc228c0b2e
bootstrapSpec:
  clusterUUID: bb874758-ee54-4d3b-bac3-4c8349737298 # The cluster UUID we found above
  snapshot: FFFFFFFF9A99FBFD.snapshot # The snapshot name we found above
---
kind: Workers
machines:
  - 18308f52-b833-4376-a7c8-1cb9de2feafd
  - 79f8db4d-3b6b-49a7-8ac4-aa5d2287f706
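
Before syncing, you can sanity-check the edited manifest; this assumes your omnictl version provides the cluster template validate subcommand:

omnictl cluster template validate -f template-manifest.yaml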

Syncing the Template

To sync the template and check the progress of the operation, run the following commands:

omnictl cluster template sync -f template-manifest.yaml
omnictl cluster template status -f template-manifest.yaml

After the sync, your cluster will be restored to the snapshot you specified.
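
The template status command above reports sync progress; to watch overall cluster health while the control plane comes back up, you can also use the cluster status subcommand, assuming your omnictl version provides it:

omnictl cluster status my-cluster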

Restarting Kubelet on Worker Nodes

To ensure healthy cluster operation, the kubelet needs to be restarted on all worker nodes.

For this step, talosctl needs to be installed and a talosconfig needs to be configured for this cluster. You can download the talosconfig using the Web UI or by running:

omnictl talosconfig -c my-cluster

Get the IDs of the worker nodes:

omnictl get clustermachine -l omni.sidero.dev/role-worker,omni.sidero.dev/cluster=my-cluster

The output will look like this:

NAMESPACE   TYPE             ID                                     VERSION
default     ClusterMachine   26b87860-38b4-400f-af72-bc8d26ab6cd6   3
default     ClusterMachine   2f6af2ad-bebb-42a5-b6b0-2b9397acafbc   3
default     ClusterMachine   5f93376a-95f6-496c-b4b7-630a0607ac7f   3
default     ClusterMachine   c863ccdf-cdb7-4519-878e-5484a1be119a   3

Gather the IDs from this output and restart kubelet on each node using talosctl:

talosctl -n 26b87860-38b4-400f-af72-bc8d26ab6cd6 service kubelet restart
talosctl -n 2f6af2ad-bebb-42a5-b6b0-2b9397acafbc service kubelet restart
talosctl -n 5f93376a-95f6-496c-b4b7-630a0607ac7f service kubelet restart
talosctl -n c863ccdf-cdb7-4519-878e-5484a1be119a service kubelet restart
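
If the cluster has many workers, a short loop saves typing. A sketch assuming the jsonpath output format is available for omnictl get and prints one resource ID (the metadata.id value) per line:

for node in $(omnictl get clustermachine \
    -l omni.sidero.dev/role-worker,omni.sidero.dev/cluster=my-cluster \
    -o jsonpath='{.metadata.id}'); do
  talosctl -n "${node}" service kubelet restart
done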

