Setting Up the Bare-Metal Infrastructure Provider
Learn to set up the Bare-Metal Infrastructure Provider for Omni
In this tutorial, we will set up a Bare-Metal Infrastructure Provider service for our Omni instance so that it can provision bare-metal machines.
Requirements
An Omni instance (either on the Sidero Labs SaaS or self-hosted) with Admin access
Access to the Image Factory (either the public one hosted by Sidero Labs, or self-hosted)
Some bare-metal machines with:
BMC (baseboard management controller) power management capabilities via one of the following:
IPMI
Redfish
Outbound access to the Omni instance and the Image Factory
A machine/container/cluster etc. in the same subnet as the bare-metal machines, in order to run the infrastructure provider service
In this tutorial, we will assume:
Our managed Omni instance is running at my-instance.omni.siderolabs.io
We will use the public Image Factory at factory.talos.dev
Bare-Metal machines with IPMI support
An additional server with Docker installed to run the infrastructure provider service, with the IP address 172.16.0.42 within the subnet 172.16.0.0/24, reachable by the bare-metal machines
Two bare-metal servers within the subnet 172.16.0.0/24 with access to the infrastructure provider, our Omni instance, and the Image Factory
1. Creating an Omni service account
We start by setting up access for the infrastructure provider to Omni.
Navigate to the Settings - Infra Providers tab on the Omni web UI, and set up a new infra provider with the ID bare-metal

Store the displayed service account key securely for later use.
2. Starting the Provider
We will run the provider in a Docker container on our server with the IP address 172.16.0.42.
The provider requires the following ports to be accessible:
50042: HTTP and gRPC API port, customizable via --api-port
69: TFTP port used to provide iPXE binaries to PXE-booted machines
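If the provider host runs a firewall, these ports must be opened there as well. A minimal sketch, assuming firewalld is in use (adjust for your firewall of choice):
sudo firewall-cmd --permanent --add-port=50042/tcp   # HTTP/gRPC API
sudo firewall-cmd --permanent --add-port=69/udp      # TFTP
sudo firewall-cmd --reload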
Start by getting the image reference of the latest version of the provider package. At the time of writing, it is ghcr.io/siderolabs/omni-infra-provider-bare-metal:v0.1.0, and this is what we will use in this tutorial.
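One way to check for a newer release is to list the published tags of the package. A sketch, assuming skopeo is installed on your workstation:
skopeo list-tags docker://ghcr.io/siderolabs/omni-infra-provider-bare-metal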
Set the required environment variables, using the service account key from the previous step:
export OMNI_ENDPOINT=https://my-instance.omni.siderolabs.io
export OMNI_SERVICE_ACCOUNT_KEY=eyJu...LS0ifQ==
Run the following command to start the provider service:
docker run -d --name=omni-bare-metal-infra-provider \
--restart=always \
--network host \
-e OMNI_ENDPOINT \
-e OMNI_SERVICE_ACCOUNT_KEY \
ghcr.io/siderolabs/omni-infra-provider-bare-metal:v0.1.0 \
--api-advertise-address=172.16.0.42
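If you prefer to manage the service with Docker Compose instead of a plain docker run, an equivalent definition is sketched below; the file name and service name are arbitrary, and the environment variables are passed through from the shell as above:
# docker-compose.yml (sketch of the docker run command above)
services:
  omni-bare-metal-infra-provider:
    image: ghcr.io/siderolabs/omni-infra-provider-bare-metal:v0.1.0
    restart: always
    network_mode: host
    environment:
      - OMNI_ENDPOINT
      - OMNI_SERVICE_ACCOUNT_KEY
    command: --api-advertise-address=172.16.0.42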
Make sure it is running by checking its status:
docker ps | grep omni-bare-metal-infra-provider
Sample output:
7190a326decf ghcr.io/siderolabs/omni-infra-provider-bare-metal:v0.1.0 "/provider --api-adv…" About a minute ago Up About a minute omni-bare-metal-infra-provider
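You can also confirm that the provider is listening on the expected ports. A sketch, assuming the iproute2 ss utility is available on the host (the sockets appear on the host because the container uses host networking):
ss -tlnp | grep 50042    # HTTP/gRPC API
ss -ulnp | grep ':69 '   # TFTP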
Start tailing its logs in a separate shell:
docker logs -f omni-bare-metal-infra-provider
Sample output:
{"level":"info","ts":1734439242.1502001,"caller":"provider/provider.go:80","msg":"starting provider","options":{"Name":"Bare Metal","Description":"Bare metal infrastructure provider","OmniAPIEndpoint":"..."}
{"level":"info","ts":1734439242.1973493,"caller":"ipxe/handler.go:310","msg":"patch iPXE binaries","component":"ipxe_handler"}
{"level":"info","ts":1734439242.2833045,"caller":"ipxe/handler.go:316","msg":"successfully patched iPXE binaries","component":"ipxe_handler"}
{"level":"info","ts":1734439242.2870164,"caller":"provider/provider.go:221","msg":"start component","component":"COSI runtime"}
{"level":"info","ts":1734439242.28702,"caller":"provider/provider.go:221","msg":"start component","component":"TFTP server"}
{"level":"info","ts":1734439242.287044,"caller":"provider/provider.go:221","msg":"start component","component":"DHCP proxy"}
{"level":"info","ts":1734439242.2870617,"caller":"provider/provider.go:221","msg":"start component","component":"machine status poller"}
{"level":"info","ts":1734439242.2870378,"caller":"provider/provider.go:221","msg":"start component","component":"server"}
At this point, the provider has started and established a connection to our Omni instance.
3. Starting the Bare-Metal Machines
Now we can boot our bare-metal machines. Before doing so, make sure that they are configured to boot over the network via PXE on the next boot, so that the provider can boot them.
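If the machines support IPMI, setting PXE as the next boot device and power cycling them (as described next) can be done remotely with ipmitool. A sketch, where the BMC address and credentials are placeholders for your own settings:
ipmitool -I lanplus -H 172.16.0.84 -U admin -P 'MySuperSecretPw!_' chassis bootdev pxe
ipmitool -I lanplus -H 172.16.0.84 -U admin -P 'MySuperSecretPw!_' power cycle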
Power cycle the machines. When they attempt to boot via PXE, the provider logs will show that the provider has PXE-booted them, with messages similar to the lines below:
{"level":"info","ts":1734440893.9489105,"caller":"dhcp/proxy.go:140","msg":"offering boot response","component":"dhcp_proxy","source":"da:65:8c:d7:c7:81","boot_filename":"tftp://172.16.0.42/undionly.kpxe"}
{"level":"info","ts":1734440898.0781841,"caller":"tftp/tftp_server.go:103","msg":"file requested","component":"tftp_server","filename":"undionly.kpxe"}
{"level":"info","ts":1734440898.1611557,"caller":"tftp/tftp_server.go:123","msg":"file sent","component":"tftp_server","filename":"/var/lib/tftp/undionly.kpxe","bytes":88919}
{"level":"info","ts":1734440900.2585638,"caller":"dhcp/proxy.go:140","msg":"offering boot response","component":"dhcp_proxy","source":"da:65:8c:d7:c7:81","boot_filename":"tftp://172.16.0.42/undionly.kpxe"}
{"level":"info","ts":1734440900.317854,"caller":"ipxe/handler.go:97","msg":"handle iPXE request","component":"ipxe_handler","uuid":"cef9a5ee-71b7-48f1-8ce3-daf45e7be0a0","mac":"da-65-8c-d7-c7-81","arch":"i386"}
{"level":"info","ts":1734440900.3258681,"caller":"ipxe/handler.go:260","msg":"boot agent mode using image factory","component":"ipxe_handler"}
{"level":"info","ts":1734440902.685483,"caller":"server/server.go:110","msg":"request","component":"server","method":"GET","path":"/ipxe","duration":2.367639908}
4. Configuring and Accepting the Machines in Omni
At this point, the machines should be booted into Agent Mode and should have established a SideroLink connection to our Omni instance.
4.1. Verifying the Machines
Let's verify our machines:
Navigate to the Machines - Pending tab on the Omni web UI. You should see the machines pending acceptance:

Our machines have the following IDs:
33313750-3538-5a43-4a44-315430304c46
33313750-3538-5a43-4a44-315430304c47
For security reasons, the machines cannot be provisioned in Omni before they are "Accepted". We will accept these machines using the Omni API.
4.2. Optional: Providing BMC (e.g., IPMI/Redfish) Configuration Manually
Normally, when we accept a machine in Omni, the provider configures the BMC settings, such as the IPMI IP address, username, and password, automatically by asking the agent service running on the Talos machine.
Sometimes we do not want this behavior: we already have this information at hand and want to provide it manually, skipping this step.
To do this, before accepting these machines, we need to create a resource with these credentials in Omni.
Create a bmc-config.yaml file with the following contents:
metadata:
  namespace: default
  type: InfraMachineBMCConfigs.omni.sidero.dev
  id: 33313750-3538-5a43-4a44-315430304c46
spec:
  ipmi:
    address: 172.16.0.84
    port: 623
    username: admin
    password: "MySuperSecretPw!_"
Fill it in with your machine's ID and IPMI configuration.
As an Omni admin, create this resource using omnictl:
omnictl apply -f bmc-config.yaml
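You can then read the resource back to confirm it was created. A sketch, assuming your version of omnictl get accepts the full resource type name and YAML output:
omnictl get inframachinebmcconfigs.omni.sidero.dev 33313750-3538-5a43-4a44-315430304c46 -o yaml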
4.3. Accepting the Machines
As the final step, we need to accept these machines in Omni.
The following step will wipe the disks of these machines, so proceed with caution!
Simply click the "Accept" button on each machine under the Machines - Pending tab on the Omni web UI, and confirm the action:

Accepting the machine will wipe ALL of its disks. This is to ensure that the machine is in a clean state and will be fully managed by Omni from that moment on.
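If you prefer to accept machines through the Omni API rather than the web UI, acceptance is expressed as a resource as well. The sketch below assumes an InfraMachineConfig resource with an acceptance field; the exact type and field names may differ between Omni versions, so verify the schema with omnictl on your instance before applying:
# accept-machine.yaml (sketch; the schema is an assumption, check your Omni version)
metadata:
  namespace: default
  type: InfraMachineConfigs.omni.sidero.dev
  id: 33313750-3538-5a43-4a44-315430304c46
spec:
  acceptance_status: ACCEPTED
Apply it the same way as the BMC configuration:
omnictl apply -f accept-machine.yaml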
When you do this, the provider will do the following under the hood:
Ask the Talos Agent service on the machines to configure their IPMI credentials
Retrieve these credentials and store them
Wipe the disks of these machines
Power off these machines over IPMI
Additionally, Omni will create a Machine and an InfraMachineStatus resource for each machine. You can verify this by running:
omnictl get machine;
omnictl get inframachinestatus;
Output will be similar to:
NAMESPACE   TYPE      ID                                     VERSION   ADDRESS                                   CONNECTED   REBOOTS
default     Machine   cef9a5ee-71b7-48f1-8ce3-daf45e7be0a0   4         fdae:41e4:649b:9303:8379:d4b5:cacc:e0bb   true
default     Machine   d3796040-2a28-4e0f-ba1a-1944f3a41dde   4         fdae:41e4:649b:9303:8379:d4b5:cacc:e0bb   true
NAMESPACE        TYPE                 ID                                     VERSION   POWER STATE   READY TO USE
infra-provider   InfraMachineStatus   cef9a5ee-71b7-48f1-8ce3-daf45e7be0a0   3         1             true
infra-provider   InfraMachineStatus   d3796040-2a28-4e0f-ba1a-1944f3a41dde   3         1             true
5. Adding Machines to a Cluster
We can now create a cluster using these machines. For this, simply follow the guide for creating a cluster.
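As an alternative to the web UI, a cluster can also be described declaratively with an Omni cluster template and synced with omnictl. A sketch, where the cluster name, Kubernetes and Talos versions are placeholders and the machine IDs come from the examples above:
# cluster-template.yaml (sketch; name and versions are placeholders)
kind: Cluster
name: bare-metal-example
kubernetes:
  version: v1.31.0
talos:
  version: v1.9.0
---
kind: ControlPlane
machines:
  - cef9a5ee-71b7-48f1-8ce3-daf45e7be0a0
---
kind: Workers
machines:
  - d3796040-2a28-4e0f-ba1a-1944f3a41dde
Sync it with:
omnictl cluster template sync --file cluster-template.yaml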
When you add these machines to a cluster, the following will happen under the hood.
The provider will:
Power these machines on, marking their next boot to be a PXE boot
PXE boot them into Talos maintenance mode
Then Omni will proceed with the regular flow of:
Applying a configuration to the machine, causing Talos to be installed to the disk
Rebooting (possibly using kexec)
The cluster will be provisioned as normal, and will get to the Ready status.
6. Removing Machines from a Cluster
When you delete a cluster and/or remove some bare-metal machines from a cluster, the following will happen:
Omni does the regular de-allocation flow:
Remove the nodes from the cluster (leave etcd membership for control planes)
Reset the machines
Afterwards, the provider will follow with these additional steps:
PXE boot the machine into Agent Mode (to be able to wipe its disks)
Wipe its disks
Power off the machine
At this point, these machines will again be ready to be allocated to a different cluster.