Published Apr 26, 2019
In this blog post series, we will explore the Raspberry Pi platform and attempt to get the Pacemaker High Availability cluster stack running on it. The operating system of choice will be Fedora, in order to make it possible for readers to easily replicate the setup.
This is part 1 of the series, where we discuss the motivation behind this project, the background and the required hardware, complete with the whys and hows.
In short, we want a cheap, small and portable high availability cluster setup for demos, one that people can physically interact with and whose behavior they can examine in an intuitive, easily understandable and observable form.
To understand why certain architectural decisions for this build were made, it is required to have at least a basic knowledge of High Availability (HA) clustering concepts.
Essentially, the goal of HA clusters is to eliminate single points of failure through redundancy, while automatically detecting and attempting to remedy failures as they occur (through a healthy mix of fault tolerance, resilience, failover and self-healing properties).
In practice, this is achieved by:

- running resources (the protected services) redundantly across multiple cluster nodes,
- continuously monitoring the health of the nodes and of the resources they run,
- fencing (isolating) any node that stops responding or misbehaves, and
- failing the affected resources over to the remaining healthy nodes.
The Raspberry Pi 3 Model B+ was chosen as the ideal single-board computer for this build, mostly because it supports an optional Power over Ethernet (PoE) add-on board, the PoE HAT, which we will use to eliminate the need for separate Micro-USB power supplies.
As a side note, HAT stands for Hardware Attached on Top and is the Raspberry Pi-specific term for add-on boards. Other vendors have similarly bizarre names for them; BeagleBoard, for example, calls them capes.
To greatly reduce both cost and complexity (and keep the whole setup portable), network and shared storage redundancy will be out of scope. We decided not to use any UPS HATs either, mostly due to safety concerns (the PoE HAT tends to get very hot and could damage or quickly degrade the Li-ion cells).
Next up is the fencing mechanism. Since we already employ PoE to provide power to the cluster nodes, STONITH is the obvious choice here. For this to work, however, we will require a PoE-capable network switch.
Plenty of cheap “smart” “managed” switches are available on the market, but none of them seem to support programmatic access (via the Simple Network Management Protocol (SNMP), for example) to turn power on and off for individual Ethernet ports.
The only product that supports the SNMP Interfaces Management Information Base (IF-MIB) and is small enough to be viable for this build is the Mikrotik hEX PoE.
SNMP IF-MIB is a solid choice for our power fencing solution, as it is already implemented as the widely used fence_ifmib I/O fence agent. Notice that since we power our nodes via PoE, disabling an Ethernet port on the switch also cuts the power to the node attached to it; we have effectively turned an I/O fencing agent into a power fencing one. Neat, isn’t it?
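To make this a bit more tangible, here is a rough sketch of what the fencing pieces could look like. The switch address, SNMP community string, port name and node name below are made-up placeholders; the actual configuration will be covered later in the series.

```sh
# Manually toggling PoE power via IF-MIB: setting ifAdminStatus of the switch
# interface with ifIndex 2 to down(2) cuts power to whatever hangs off that port.
snmpset -v2c -c private 10.0.0.1 IF-MIB::ifAdminStatus.2 i 2

# Sketch of the corresponding STONITH device in Pacemaker, using the
# fence_ifmib agent (all values are placeholders).
pcs stonith create fence-node1 fence_ifmib \
    ipaddr=10.0.0.1 snmp_version=2c community=private \
    port=ether2 pcmk_host_list=node1
```

Each cluster node would get its own fence device like this, pointing at the switch port that powers it.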
We will not be using any extra hardware to provide a SAN for the cluster to use as shared storage. Instead, an iSCSI target will be hosted on an extra Pi. This machine will not be part of the cluster; its reliability and redundancy are out of scope for the same reasons mentioned earlier.
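For the curious, a minimal sketch of exporting such a target with targetcli could look like the following; the backing device and the IQNs are placeholders, not the final configuration.

```sh
# On the storage Pi (not a cluster member), as root.
# Back the LUN with a spare partition (placeholder device name).
targetcli /backstores/block create name=shared_lun dev=/dev/sda3

# Create the iSCSI target and export the LUN under it (IQN is made up).
targetcli /iscsi create iqn.2019-04.com.example:shared
targetcli /iscsi/iqn.2019-04.com.example:shared/tpg1/luns \
    create /backstores/block/shared_lun

# Allow a cluster node's initiator to connect (repeat for each node).
targetcli /iscsi/iqn.2019-04.com.example:shared/tpg1/acls \
    create iqn.2019-04.com.example:node1

targetcli saveconfig
```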
As we want this demo cluster to be human-interactive, some input and output devices are needed to affect and monitor the state of the cluster.
Ethernet cables can be considered an input device in our case: manually unplugging one of the cluster nodes from the network will immediately power off that node, and the rest of the cluster should notice this and act upon it (recovering any services that were running on the killed node, for one).
Physical buttons could be added to the individual cluster nodes, connected to their General-Purpose Input/Output (GPIO) pins and configured in the OS to trigger a kernel panic. Panicking a node will effectively kill it without having to cut the power. We could then observe the cluster as it fences the node, turning its power off and on again to reboot it, in the hope that it will come back and rejoin the cluster.
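As a rough sketch of that idea, assuming a button wired to pull GPIO 17 low when pressed and the libgpiod command-line tools installed (both of which are assumptions at this point), a one-liner run as root on a node could look like this:

```sh
# Wait for a single button press (falling edge on GPIO 17), then trigger
# a kernel panic through the magic SysRq interface.
gpiomon --num-events=1 --falling-edge gpiochip0 17 \
    && echo c > /proc/sysrq-trigger
```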
As for the output devices, screens would be the obvious choice. We decided to go with a small E-Ink display for each Pi, specifically a tri-color (red/black/white) 2.13” Waveshare e-Paper HAT.
To spice things up a bit (and showcase more Pacemaker features :)), we also purchased a small thermal receipt printer, the PTP-II. The plan is to leverage Pacemaker alerts and have a log of all actions taken by the cluster, as a hard copy on a piece of paper.
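As a teaser, a Pacemaker alert agent is just an executable that receives the event details in CRM_alert_* environment variables; a minimal sketch (the script path, alert ID and printer device node are made-up placeholders) could look like this:

```sh
#!/bin/sh
# /usr/local/bin/print_alert.sh -- minimal sketch of an alert agent.
# Pacemaker passes event details in CRM_alert_* environment variables and the
# configured recipient value (here: the printer device) in CRM_alert_recipient.
printf '%s %s: %s %s\n' "$CRM_alert_timestamp" "$CRM_alert_node" \
    "$CRM_alert_kind" "$CRM_alert_desc" > "$CRM_alert_recipient"
```

Registering the agent with the cluster and pointing it at the printer would then be a matter of:

```sh
pcs alert create id=receipt-printer path=/usr/local/bin/print_alert.sh
pcs alert recipient add receipt-printer value=/dev/usb/lp0
```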
This concludes the first part of this blog post series on building an inexpensive and small high availability cluster, with the end goal of helping to explain the general concepts of HA clustering.
Stay tuned for part 2, in which we will show how the hardware physically fits together, how to install Fedora on the Raspberry Pi 3 Model B+, and the first bumps in the road to the perfect setup.
At the time of writing, the author works as a Quality Engineer at Red Hat, making sure the High Availability and Resilient Storage add-ons for Red Hat Enterprise Linux work as advertised.
Any and all opinions are personal.