Node identity
Identifying a node is a surprisingly hard problem, mostly because we need to identify the node in some circumstances solely based on somewhat unreliable and mutable hardware characteristics of the node. There are two important times when Razor needs to determine what a node is, and whether it is different from all the other nodes it knows about:
- when the generic `bootstrap.ipxe` script retrieves instructions on how to boot
- when the microkernel checks in with the Razor server
There are of course other places where a node and the server communicate, but those happen at times when the server already knows about the node and is in complete control of how the node identifies itself - in other words, there the server can encode its idea of node identity into the node's request, solving all identity worries.
The most difficult place to do node identification is from the generic
`bootstrap.ipxe` file that you place on your TFTP server. What makes it
difficult is that we can only use information available to iPXE, which is
limited. If you never modify your hardware after first booting a node,
you're in good shape. Trouble comes if you ever move hardware components
between your nodes.
Currently, Razor gets the following pieces of information from iPXE to
guide its decision about when a node is a certain known node, and when it
is a completely new node; this data is collectively called the hardware
information (`hw_info`) of the node. This information is updated every time
a node boots.
- MAC addresses of all the NICs; more precisely, the MAC addresses of the first `nic_max` network interfaces, a parameter you can pass in when you generate the `bootstrap.ipxe` file
- The asset tag, serial number, and UUID as reported by the SMBIOS
When `bootstrap.ipxe` contacts the server, it sends the above pieces of
hardware information; the setting `match_nodes_on` in `config.yaml`
determines which of these pieces is used to identify a node. By default,
only `mac` (the MAC addresses) is used. The server goes looking for a
known node whose hardware information contains at least one of these
values. If there is no such node, Razor assumes it's seeing a new node; if
there is more than one, Razor will complain and not continue booting that
node.
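
The matching rule described above can be sketched in a few lines of Python. This is only an illustration, not Razor's actual implementation: the `Node` class, the `identify` function, and the representation of `hw_info` as a dictionary of value sets are all assumptions made for this example.

```python
from dataclasses import dataclass, field


class AmbiguousNodeError(Exception):
    """Raised when more than one known node matches the reported hardware."""


@dataclass
class Node:
    # Hypothetical stand-in for a node record; not Razor's real model.
    name: str
    # Maps a kind ("mac", "asset", "serial", "uuid") to the set of values
    # last reported for this node.
    hw_info: dict = field(default_factory=dict)


def match_values(hw_info, match_nodes_on):
    """Collect the (kind, value) pairs that take part in matching."""
    return {
        (kind, value)
        for kind in match_nodes_on
        for value in hw_info.get(kind, ())
        if value  # an empty value identifies nothing
    }


def identify(known_nodes, reported, match_nodes_on=("mac",)):
    """Return the matching known node, or None if the node looks new."""
    wanted = match_values(reported, match_nodes_on)
    hits = [
        node for node in known_nodes
        if wanted & match_values(node.hw_info, match_nodes_on)
    ]
    if len(hits) > 1:
        # Several known nodes share a value: refuse to guess.
        raise AmbiguousNodeError(", ".join(node.name for node in hits))
    return hits[0] if hits else None


nodes = [
    Node("node1", {"mac": {"52:54:00:aa:00:01"}}),
    Node("node2", {"mac": {"52:54:00:aa:00:02"}}),
]
# A single shared MAC is enough to match, even if a new NIC was added.
found = identify(nodes, {"mac": {"52:54:00:aa:00:01", "52:54:00:aa:00:99"}})
```

Here `found` is the `node1` record, because one of the two reported MAC addresses is already known; a request that reported only unknown MAC addresses would return `None` and be treated as a new node.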
The `match_nodes_on` configuration should contain all the hardware
information that is unique across all machines that Razor manages (or empty
on machines where it is not set). Some vendors fill in some of these values
with nonsense; for example, they set the asset tag to `no asset tag`. In
these cases, matching on the asset tag would make Razor think all those
nodes are really just one node whose remaining hardware changes quickly.
In this situation, `match_nodes_on` must not contain `asset`, to avoid
confusing Razor.
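
One way to decide which values are safe to list is to check an inventory of your machines for repeated values. The following is a minimal sketch under the assumption that you have collected each machine's hardware information as a dictionary of lists; the function name `safe_match_keys` and the sample data are hypothetical and not part of Razor.

```python
from collections import Counter


def safe_match_keys(inventory, candidates=("mac", "asset", "serial", "uuid")):
    """Return the kinds whose non-empty values never repeat across machines."""
    safe = []
    for kind in candidates:
        counts = Counter(
            value
            for hw_info in inventory
            # set() so a value listed twice on one machine counts once
            for value in set(hw_info.get(kind, ()))
            if value  # empty values are allowed and ignored
        )
        if all(n == 1 for n in counts.values()):
            safe.append(kind)
    return safe


# Made-up inventory: both machines report the same placeholder asset tag.
inventory = [
    {"mac": ["52:54:00:aa:00:01"], "asset": ["no asset tag"], "serial": ["SN-1"]},
    {"mac": ["52:54:00:aa:00:02"], "asset": ["no asset tag"], "serial": [""]},
]
usable = safe_match_keys(inventory)  # ["mac", "serial", "uuid"]
```

In this made-up inventory `asset` is excluded because both machines report the same placeholder, while `serial` and `uuid` remain usable because their values are either unique or empty.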
If you never change existing hardware, you do not need to worry about confusing Razor. When you do change hardware, there are some things that you can do without any problems:
- Remove hardware from an existing node
- Add a piece of hardware, e.g. a new NIC, to a node, provided that NIC has not been used with Razor before
- Replace hardware with a new component, e.g. to replace a faulty network card
Things get complicated if you move hardware components between nodes that Razor knows about:
- if you move a component, e.g. a network card, from one node that Razor knows about to another known node, Razor will complain about having two nodes matching the same `hw_info`
- if you move a component from a known node to a new node, Razor will get very confused: when the new node boots, Razor will identify it as the known node; when the known node boots after that, it will appear as a completely new node.
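
Both scenarios above can be reproduced with a toy model. The sketch below assumes matching on MAC addresses only and uses a hypothetical `Registry` class with made-up node names and MAC addresses; it mimics the behavior described on this page and is not Razor's code.

```python
class Registry:
    """Toy registry that matches nodes on MAC addresses only."""

    def __init__(self):
        self.nodes = {}  # node name -> set of MACs last reported
        self._next_id = 1

    def boot(self, macs):
        """Identify the booting machine and record its current MACs."""
        macs = set(macs)
        hits = [name for name, known in self.nodes.items() if known & macs]
        if len(hits) > 1:
            raise RuntimeError(f"multiple nodes match: {sorted(hits)}")
        if hits:
            name = hits[0]
        else:
            name = f"node{self._next_id}"
            self._next_id += 1
        self.nodes[name] = macs  # hardware information is updated on every boot
        return name


# Scenario 2: a NIC moves from a known machine to a brand-new machine.
reg = Registry()
first = reg.boot({"52:54:00:aa:00:01"})                          # known machine
moved = reg.boot({"52:54:00:aa:00:01", "52:54:00:bb:00:01"})     # new machine, old NIC
after = reg.boot({"52:54:00:cc:00:01"})                          # old machine, new NIC

# Scenario 1: a NIC moves between two machines that are both known.
reg2 = Registry()
reg2.boot({"52:54:00:aa:00:01", "52:54:00:aa:00:02"})
reg2.boot({"52:54:00:bb:00:01"})
try:
    reg2.boot({"52:54:00:bb:00:01", "52:54:00:aa:00:02"})
    conflict = None
except RuntimeError as err:
    conflict = str(err)
```

In the second scenario the new machine is taken for the known one (`moved` equals `first`), and the original machine then shows up under a new name (`after` differs from `first`). In the first scenario the final boot raises an error because two known nodes match.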
To support the above two scenarios, we need a command to tell Razor that a piece of hardware is being moved; comment on this issue if this affects you.
For now, we assume that the Razor server is in control of booting the Microkernel. As such, the Microkernel command line receives a checkin URL that contains the node's internal identifier. This scheme will break down if you ever need to boot Microkernels from media, e.g. a CD. Please contact us if that affects you.