Wrapic (Wireless Raspberry Pi Cluster)
Wrapic is a Wireless Raspberry Pi Cluster that can run various containerized applications on top of full Kubernetes. What makes the cluster “wireless” is that it doesn’t need to be physically connected to a router via ethernet, instead it bridges off WiFi to receive internet—this is great for situations where the router is inaccessible.
In my setup, a single 5-port PoE switch provides power to four RPi 4B's, all of which are equipped with PoE hats. One Raspberry Pi acts as a jump box, connecting to an external network over WiFi and forwarding traffic through its ethernet port; this provides the other three RPi's with an internet connection and separates the cluster onto its own private network. The jump box also acts as the Kubernetes master node, and all other RPi's are worker nodes in the cluster. The following setup documentation assumes this described setup where the master node doubles as a jump box; if a cluster without a jump box is desired, Alex Ellis' Kubernetes on Raspbian guide may be a better fit.
Contents
Most sections include a Side Notes subsection with extra information for that specific section, ranging from helpful commands to potential issues/solutions I encountered during my setup.
- Parts List
- Initial Headless Raspberry Pi Setup
- Setting up the Jump Box and Cluster Network
- Install Docker and Kubernetes w/Flannel CNI
- Install MetalLB and ingress-nginx
- Extra Configurations
- References
As a disclaimer, most of these steps have been adapted from multiple articles, guides, and documentation found online, which have been compiled into this README for easy access and a more straightforward cluster setup. Much credit goes to Alex Ellis' Kubernetes on Raspbian repository and Tim Downey's Baking a Pi Router guide.
Parts List
My cluster only includes 4 RPi 4B's, though there is no limit to the number of RPi's that can be used. If you choose not to go the PoE route, additional power cables (USB-C for the Pi 4, micro USB for the Pi 3) and a multi-port USB power supply will be needed to power the Pi's.
- 4x Raspberry Pi 4B 2GB RAM
- the 3B and 3B+ models will also suffice
- it is recommended to get at least 2GB of RAM for running full K8s
- 4x Official Raspberry Pi PoE Hats
- 5 Port PoE Gigabit Ethernet Switch
- does not need to support PoE if you are not planning to purchase PoE hats
- does not need to support gigabit ethernet though the Pi 4’s do support it
- 4x 0.5ft Ethernet Cables
- I went with 0.5ft cables to keep my setup compact
- at the very least, a Cat 5e cable is needed to support gigabit ethernet
- 4x 32GB Micro SD cards
- I’d recommend sticking to a reputable brand
- Raspberry Pi Cluster Case
- one with good ventilation and heat dissipation is recommended
Initial Headless Raspberry Pi Setup
In a headless setup, only WiFi and ssh are used to configure the RPi's, without the need for an external monitor and keyboard. This will likely be the most tedious and time-consuming part of the setup. These steps should be repeated individually for each RPi with only one RPi connected to the network at a given time; this makes it easier to find and distinguish the RPi's in step 5.
1. Install Raspberry Pi OS Lite (32-bit) with Raspberry Pi Imager
    - As an alternative, the Raspberry Pi OS (64-bit) beta may be installed instead if you plan to use arm64 Docker images or would like to use Calico as your K8s CNI; it is important to note that the 64-bit beta is the full Raspberry Pi OS which includes the desktop GUI and therefore may contain unneeded packages/bulk
    - Another great option, if an arm64 architecture is desired, is to install the officially supported 64-bit Ubuntu Server OS using the Raspberry Pi Imager
2. Create an empty `ssh` file (no extension) in the root directory of the micro SD card
3. Create a `wpa_supplicant.conf` in the `boot` folder to set up a WiFi connection

    ```
    # /boot/wpa_supplicant.conf
    ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev
    update_config=1
    country=US

    network={
        ssid="<WiFi-SSID>"
        psk="<WiFi-password>"
    }
    ```

    - The remote machine which will be used to configure and ssh into all the RPi's should be on the same network as declared in the above `wpa_supplicant.conf`
4. Insert the micro SD card back into the RPi and power it on
5. `ssh pi@raspberrypi.local` to connect to the RPi; `ping raspberrypi.local` may also be used to get the RPi's IP address to run `ssh pi@<ip-address>`
6. `sudo raspi-config` to access the RPi configuration menu for making the following recommended changes:
    - Change the password from its default `raspberry`
    - Change the hostname, which can be used for easier ssh
    - Expand the filesystem, under advanced options, allowing full use of the SD card for the OS
    - Update the operating system to the latest version
    - Change the locale
7. Reboot the RPi with `sudo reboot`
8. Set up passwordless SSH access
    - if you have previously generated RSA public/private keys, simply execute `ssh-copy-id <USERNAME>@<IP-ADDRESS or HOSTNAME>` (see the key generation sketch in the Side Notes below if you haven't)
9. `sudo apt-get update -y` to update the package repository
10. `sudo apt-get upgrade -y` to update all installed packages
11. Disable swap with the following commands; it's recommended to run the commands individually to prevent some errors with `kubectl get` later on

    ```
    sudo dphys-swapfile swapoff
    sudo dphys-swapfile uninstall
    sudo systemctl disable dphys-swapfile
    ```

12. At this point, if you want to use zsh as the default shell for your RPi check out the Install zsh w/Oh-my-zsh and Configure Plugins section, otherwise move on to the next section which sets up the jump box
Side Notes
- May need to comment out `SendEnv LANG LC_*` in `/etc/ssh/ssh_config` on the host SSH client to fix RPi locale problems
- Check if swap is disabled with `free -h` (look for "Swap:"); may also use `sudo swapon --summary` which should return nothing
- If swap is still not disabled after reboot, try editing `/etc/dphys-swapfile` and set `CONF_SWAPSIZE=0`
- Although mentioned frequently, the disable swap command below did not seem to work on RPi Buster OS to fully disable swap (the commands mentioned in step 11 should be used instead)

    ```
    sudo dphys-swapfile swapoff && sudo dphys-swapfile uninstall && sudo update-rc.d dphys-swapfile remove
    ```
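- If the remote machine doesn't have an SSH key pair yet (needed for the `ssh-copy-id` in step 8), a minimal sketch, run on the remote machine rather than the RPi, looks like the following; `ssh-copy-id` appends the public key to the RPi's `~/.ssh/authorized_keys`

    ```
    # generate a key pair (accept the default path, optionally set a passphrase)
    ssh-keygen -t rsa -b 4096

    # copy the public key to the RPi (replace the hostname/IP as needed)
    ssh-copy-id pi@raspberrypi.local

    # subsequent logins should no longer prompt for a password
    ssh pi@raspberrypi.local
    ```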
Setting up the Jump Box and Cluster Network
The following steps will set up the RPi jump box such that it acts as a DHCP server and DNS forwarder. It is assumed that at this point all RPi’s have already been configured and are connected to the switch.
Preparation
Before the jump box is set up, it's important to delete the `wpa_supplicant.conf` files on all RPi's except the jump box itself; this is because we want to force the RPi's onto our private cluster network that's separated via our switch and jump box. The jump box will maintain its WiFi connection, forwarding internet out its ethernet port and into the switch, which then feeds it to the other connected RPi's.
1. `sudo rm /etc/wpa_supplicant/wpa_supplicant.conf` to delete the `wpa_supplicant.conf`
2. `sudo reboot` for changes to take effect
    - Prior to steps 1 and 2, you could ssh into the RPi's directly from your remote machine since they were on the same WiFi network; once the worker nodes are on the private network, they can only be reached through the jump box (see the sketch below)
Jump Box Setup
1. Set up a static IP address for both ethernet and WiFi interfaces by creating a `dhcpcd.conf` in `/etc/`

    ```
    # /etc/dhcpcd.conf
    interface eth0
    static ip_address=10.0.0.1
    static domain_name_servers=<dns-ip-address>
    nolink

    interface wlan0
    static ip_address=<static-ip-address>
    static routers=<router-ip-address>
    static domain_name_servers=<dns-ip-address>
    ```

    - A sample `dhcpcd.conf` is provided here
    - Note that the static IP address for `wlan0` should be within the DHCP pool range on the router
2. `sudo apt install dnsmasq` to install dnsmasq
3. `sudo mv /etc/dnsmasq.conf /etc/dnsmasq.conf.backup` to backup the existing `dnsmasq.conf`
4. Create a new dnsmasq config file with `sudo nano /etc/dnsmasq.conf` and add the following

    ```
    # Provide a DHCP service over our eth0 adapter (ethernet port)
    interface=eth0

    # Listen on the static IP address of the RPi router
    listen-address=10.0.0.1

    # Declare DHCP range with an IP address lease time of 12 hours
    # 97 host addresses total (128 - 32 + 1)
    dhcp-range=10.0.0.32,10.0.0.128,12h

    # Assign static IPs to the kube cluster members (RPi K8s worker nodes 1 to 3)
    # This will make it easier for tunneling, certs, etc.
    # Replace b8:27:eb:00:00:0X with the Raspberry Pi's actual MAC address
    dhcp-host=b8:27:eb:00:00:01,10.0.0.50
    dhcp-host=b8:27:eb:00:00:02,10.0.0.51
    dhcp-host=b8:27:eb:00:00:03,10.0.0.52

    # Declare name-servers (using Cloudflare's)
    server=1.1.1.1
    server=1.0.0.1

    # Bind dnsmasq to the interfaces it is listening on (eth0)
    bind-interfaces

    # Never forward plain names (without a dot or domain part)
    domain-needed

    # Never forward addresses in the non-routed address spaces.
    bogus-priv

    # Use the hosts file on this machine
    expand-hosts

    # Limits name services to dnsmasq only and will not use /etc/resolv.conf
    no-resolv

    # Uncomment to debug issues
    # log-queries
    # log-dhcp
    ```

    - Note that the `listen-address` is the same as the static `ip_address` for `eth0` declared in `dhcpcd.conf`
    - If you have more or less than three worker nodes, declare or delete `dhcp-host` entries as needed, ensuring that the correct MAC addresses are used; `ifconfig eth0` can be used to find each RPi's MAC address (look next to "ether")
5. `sudo nano /etc/default/dnsmasq` and add `DNSMASQ_EXCEPT=lo` at the end of the file
    - This is needed to prevent dnsmasq from overwriting `/etc/resolv.conf` on reboot, which can crash the coredns pods when later initializing kubeadm
6. `sudo nano /etc/init.d/dnsmasq` and add `sleep 10` to the top of the file to prevent errors with booting up dnsmasq
7. `sudo reboot` to reboot the RPi for dnsmasq changes to take effect
8. ssh back into the RPi jump box and ensure that dnsmasq is running with `sudo service dnsmasq status`
9. `sudo nano /etc/sysctl.conf` and uncomment `net.ipv4.ip_forward=1` to enable NAT rules with iptables
to enable NAT rules with iptables-
Add the following
iptables
rules to enable port forwardingsudo iptables -t nat -A POSTROUTING -o wlan0 -j MASQUERADE sudo iptables -A FORWARD -i wlan0 -o eth0 -m state --state RELATED,ESTABLISHED -j ACCEPT sudo iptables -A FORWARD -i eth0 -o wlan0 -j ACCEPT
sudo apt install iptables-persistent
to install iptables-persistentsudo dpkg-reconfigure iptables-persistent
to re-save and persist ouriptables
rules across reboots
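After step 12, the forwarding setup can be sanity-checked without rebooting. This is a minimal sketch, assuming the `eth0`/`wlan0` interface names used above; `sysctl -p` reloads `/etc/sysctl.conf` so the `net.ipv4.ip_forward` change takes effect immediately.

```
# apply the uncommented net.ipv4.ip_forward=1 without a reboot
sudo sysctl -p

# confirm forwarding is enabled (should print: net.ipv4.ip_forward = 1)
sysctl net.ipv4.ip_forward

# list the NAT rules added in step 10 (look for MASQUERADE on wlan0)
sudo iptables -t nat -L POSTROUTING -n -v
```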
Side Notes
- If something goes wrong, I highly recommend checking out Tim Downey's RPi router guide as additional information is provided there
- `sudo iptables -L -n -v` to check the current `iptables` rules
- `cat /var/lib/misc/dnsmasq.leases` to check the current leases provided by dnsmasq
- `sudo service dnsmasq restart` to restart dnsmasq
- `sudo service dnsmasq stop` to stop dnsmasq (will restart on boot)
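- Once a worker node has picked up a lease from the jump box, connectivity through the NAT and DNS forwarding can be confirmed from that node; a minimal sketch, assuming the `10.0.0.50` static lease from the dnsmasq config above

    ```
    # from the jump box, hop onto a worker node on the private network
    ssh pi@10.0.0.50

    # confirm the default route points back at the jump box (10.0.0.1)
    ip route

    # test raw connectivity through the NAT rules
    ping -c 3 1.1.1.1

    # test DNS resolution through dnsmasq
    ping -c 3 raspberrypi.org
    ```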
Install Docker and Kubernetes w/Flannel CNI
The following steps will install and configure Docker and Kubernetes on all RPi's. This setup uses Flannel as the Kubernetes CNI, although Weave Net may also be used as an alternative. Flannel/Weave Net may be swapped out for the Calico CNI provided that an OS with an `arm64` architecture has been installed on all RPi's.
Worker Node Setup
These steps should be performed on all RPi’s within the cluster including the jump box/master node.
1. Install Docker

    **Install the latest version of Docker**

    ```
    curl -sSL get.docker.com | sh && sudo usermod pi -aG docker
    ```

    - Note this specific script must be used as specified in the Docker documentation

    **Install a specific version of Docker**

    ```
    export VERSION=<version> && curl -sSL get.docker.com | sh
    sudo usermod pi -aG docker
    ```

    - Where `<version>` is replaced with a specific Docker Engine version
2. `sudo nano /boot/cmdline.txt` and add the following to the end of the line; do not make a new line, and ensure that there's a space in front of `cgroup_enable=cpuset`

    ```
    cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory
    ```
3. `sudo reboot` to reboot the RPi for boot changes to take effect (do not skip this step)
4. Install Kubernetes

    **Install the latest version of K8s**

    ```
    curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add - && \
      echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list && \
      sudo apt-get update -q && \
      sudo apt-get install -qy kubeadm
    ```

    **Install a specific version of K8s**

    ```
    curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add - && \
      echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list && \
      sudo apt-get update -q && \
      sudo apt-get install -qy kubelet=<version> kubectl=<version> kubeadm=<version>
    ```

    - Where `<version>` is replaced with a specific K8s version; append `-00` to the end of the version if it's not already added (e.g. 1.19.5 => 1.19.5-00)
5. `sudo sysctl net.bridge.bridge-nf-call-iptables=1` (a sketch for persisting this setting follows this list)
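The `sysctl` in step 5 only lasts until the next reboot. An optional sketch to persist it, assuming a Debian-style `/etc/sysctl.d/` directory (which Raspberry Pi OS provides); the `99-kubernetes.conf` filename is just an arbitrary choice.

```
# persist the bridge netfilter setting across reboots
echo "net.bridge.bridge-nf-call-iptables = 1" | sudo tee /etc/sysctl.d/99-kubernetes.conf

# reload all sysctl configuration files and verify the value
sudo sysctl --system
sysctl net.bridge.bridge-nf-call-iptables
```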
Master Node Setup
The following steps should be performed only on one RPi (I used the RPi jump box). This section assumes that you're running an `armhf` architecture on your RPi's and therefore will use either Flannel or Weave Net as your cluster's CNI.
1. `sudo kubeadm config images pull -v3` to pull down the images required for the K8s master node
2. `sudo nano /etc/resolv.conf` and ensure that it does not have `nameserver 127.0.0.1`
    - If `nameserver 127.0.0.1` exists, remove it and replace it with another DNS IP address that isn't the loopback address, then double check that `DNSMASQ_EXCEPT=lo` has been added in `/etc/default/dnsmasq` to prevent dnsmasq from overwriting/adding `nameserver 127.0.0.1` to `/etc/resolv.conf` upon reboot
    - This step is crucial to prevent coredns pods from crashing upon running `kubeadm init`
3. Initialize the master node and save the `kubeadm join` command provided after the `kubeadm init` finishes; note that the init command will depend on the CNI of your choosing

    **Flannel**

    ```
    sudo kubeadm init --token-ttl=0 --pod-network-cidr=10.244.0.0/16
    ```

    **Weave Net**

    ```
    sudo kubeadm init --token-ttl=0
    ```

4. Run the following commands after `kubeadm init` finishes

    ```
    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
    ```

5. `kubectl get pods -n kube-system` to double check the status of all master node pods (each should have a status of "Running")
    - If the coredns pods are failing, see the Side Notes for this section
6. Apply the appropriate CNI config to your cluster

    **Flannel**

    ```
    kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
    ```

    **Weave Net**

    ```
    kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
    ```
7. Run the `kubeadm join` command saved in step 3 on all worker nodes; an example join command is provided below

    ```
    kubeadm join 192.168.29.229:6443 --token 2t9e17.m8jbybvnnheqwwjp \
        --discovery-token-ca-cert-hash sha256:4ca2fa33d228075da93f5cb3d8337931b32c8de280a664726fe6fc73fba89563
    ```

8. `kubectl get nodes` to check that all nodes were joined successfully
9. At this point, all RPi's should be set up and ready to run almost anything on top of K8s (see the quick smoke test sketch below); however, if you'd like to expose services within your cluster for external access, follow the next section which will install a load balancer and ingress controller
10. Optionally, you can now follow the Kubernetes Dashboard Setup section to configure the Web UI for cluster monitoring
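As a quick, optional smoke test that scheduling and networking work end to end, something like the following can be run from the master node. This is a minimal sketch and not part of the original steps; it assumes the multi-arch `nginx` and `curlimages/curl` images, which should include 32-bit ARM variants.

```
# create a test deployment and scale it across the workers
kubectl create deployment nginx-test --image=nginx
kubectl scale deployment nginx-test --replicas=3

# expose it inside the cluster and check that the pods respond
kubectl expose deployment nginx-test --port=80
kubectl get pods -o wide
kubectl run curl-test --image=curlimages/curl --rm -it --restart=Never -- curl -s http://nginx-test

# clean up
kubectl delete service nginx-test
kubectl delete deployment nginx-test
```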
Side Notes
- To uninstall K8s use the following commands

    ```
    kubeadm reset
    sudo apt-get purge kubeadm kubectl kubelet kubernetes-cni kube*
    sudo apt-get autoremove
    sudo rm -rf ~/.kube
    ```

- To uninstall Docker use the following commands

    ```
    sudo apt-get purge docker-ce docker-ce-cli containerd.io
    sudo rm -rf /var/lib/docker
    sudo rm -rf /var/lib/containerd
    ```
- If the coredns pods are stuck in `CrashLoopBackOff` and their logs are showing the error below, the referenced coredns docs recommend adding `resolvConf: /etc/resolv.conf` to `/etc/kubernetes/kubelet.conf`; however, the following steps resolved the issue since dnsmasq on the RPi jump box was overwriting `resolv.conf`

    ```
    [FATAL] plugin/loop: Loop (127.0.0.1:34536 -> :53) detected for zone ".", see coredns.io/plugins/loop#troubleshooting
    ```

    ```
    # reset kubeadm which will undo any joined nodes
    sudo kubeadm reset

    # edit resolv.conf which coredns references on startup
    # delete "nameserver 127.0.0.1" if it exists
    # add "nameserver 1.1.1.1" or any DNS resolver
    sudo nano /etc/resolv.conf

    # initialize k8s cluster again with the correct init command from step 3
    sudo kubeadm init

    # it is important at this point to make sure that
    # DNSMASQ_EXCEPT=lo was added to the end of /etc/default/dnsmasq
    # this prevents the loopback address from being added back to resolv.conf on reboot
    ```
- If `kubectl get` is throwing the error below, run the following commands to fix the issue without needing to reboot; additionally, this thread provides some guidance to help resolve the issue (running the disable swap commands individually seemed to do the trick)

    ```
    The connection to the server <ip-address>:6443 was refused - did you specify the right host or port?
    ```

    ```
    sudo -i
    swapoff -a
    exit
    strace -eopenat kubectl version
    ```
- If the Flannel pods are continuously crashing and their logs are showing the error below, this thread helped resolve the issue by running the following commands

    ```
    Error registering network: failed to configure interface flannel.1: failed to ensure address of interface flannel.1: link has incompatible addresses. Remove additional addresses and try again
    ```

    ```
    # check which node the failing flannel pod is on (check the IP)
    kubectl get pods -n kube-system -o wide

    # delete the flannel network interface (run on node found in above command)
    sudo ip link delete flannel.1

    # delete the flannel pod
    kubectl delete pod -n kube-system kube-flannel-ds-<pod-id>

    # watch the status of the pods to ensure the flannel pod is running
    kubectl get pods -n kube-system -w
    ```
- `kubectl rollout restart -n kube-system deployment/coredns` to restart coredns pods
- `kubectl logs -n kube-system pod/coredns-<pod-id>` to get the logs of a specific coredns pod
- `kubectl logs -n kube-system kube-flannel-ds-<pod-id>` to get the logs of a specific Flannel pod
- `kubectl label node <node-name> node-role.kubernetes.io/<role>=<role>` to label nodes
    - `<role>` should be the same if you're setting the role for a node currently with a role set as `<none>`
- `kubectl label node <node-name> node-role.kubernetes.io/<role>-` to remove a label
Install MetalLB and ingress-nginx
The following steps have been taken directly from MetalLB's Installation Documentation and the ingress-nginx Bare-metal Installation Documentation.
1. Create the `metallb-system` namespace with the following

    ```
    kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.5/manifests/namespace.yaml
    ```

2. Create the MetalLB deployment with the following

    ```
    kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.5/manifests/metallb.yaml
    ```

3. Create a `memberlist` secret containing the `secretkey` to encrypt communication between speakers

    ```
    kubectl create secret generic -n metallb-system memberlist --from-literal=secretkey="$(openssl rand -base64 128)"
    ```
4. `sudo nano metallb-config.yaml` to create a MetalLB config map with layer 2 mode configuration and paste the following, where addresses is an address range of your choice; this address range should not conflict with the pod network CIDR defined during `kubeadm init`

    ```
    apiVersion: v1
    kind: ConfigMap
    metadata:
      namespace: metallb-system
      name: config
    data:
      config: |
        address-pools:
        - name: default
          protocol: layer2
          addresses:
          # sample address range
          - 192.168.1.240-192.168.1.250
    ```

5. `kubectl apply -f metallb-config.yaml` to apply the configuration and start MetalLB
6. `kubectl get pods -n metallb-system` to ensure that all pods are running
7. Install ingress-nginx with the command below

    ```
    kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/static/provider/baremetal/deploy.yaml
    ```

8. Edit the `ingress-nginx-controller` service and change `spec.type` from `NodePort` to `LoadBalancer`

    ```
    kubectl edit service ingress-nginx-controller -n ingress-nginx
    ```

9. `kubectl get all -n ingress-nginx` to ensure that all ingress-nginx pods are running and all jobs (create/patch) completed successfully
10. Verify that the ingress controller is working properly by doing the following

    ```
    kubectl get service -n ingress-nginx
    # two services should be displayed: LoadBalancer and ClusterIP
    # copy the external-ip of your LoadBalancer
    # the external-ip should be within the address range assigned in metallb-config.yaml

    curl http://<lb-external-ip>
    # curl should return html displaying "404 Not Found"
    # this indicates that the ingress-nginx-controller received the request and attempted to direct it to the correct pod
    # we have confirmed that our nginx ingress controller is working
    ```

    ```
    <!-- sample response after curling the ingress-nginx LoadBalancer -->
    <html>
    <head><title>404 Not Found</title></head>
    <body>
    <center><h1>404 Not Found</h1></center>
    <hr><center>nginx</center>
    </body>
    </html>
    ```
11. Add the following `iptables` rules to forward http/https traffic from wlan0 to the LoadBalancer's external-ip (a sample Ingress sketch for routing a hostname to a service follows this list)

    ```
    sudo iptables -t nat -I PREROUTING -i wlan0 -p tcp --dport 80 -j DNAT --to <lb-external-ip>:80
    sudo iptables -t nat -I PREROUTING -i wlan0 -p tcp --dport 443 -j DNAT --to <lb-external-ip>:443

    # persist the iptables rules across reboots
    sudo dpkg-reconfigure iptables-persistent
    ```
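To route an actual hostname through the new ingress controller, an Ingress resource pointing at a cluster service is needed. This is a minimal sketch, not one of the documented steps; the `nginx-test` service and `nginx-test.example.com` hostname are hypothetical placeholders, so substitute whatever service and hostname fit your cluster.

```
# sample-ingress.yaml (apply with: kubectl apply -f sample-ingress.yaml)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-test
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
  - host: nginx-test.example.com   # hypothetical hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: nginx-test       # hypothetical service created earlier
            port:
              number: 80
```

Once applied, `curl -H "Host: nginx-test.example.com" http://<lb-external-ip>` should return the service's response instead of the default 404 page.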
Side Notes
- If the `ingress-nginx-admission-patch` job fails and does not show 1/1 completions when doing `kubectl get all -n ingress-nginx`, it may help to delete and reinstall `ingress-nginx` by doing the following

    ```
    # remove all resources related to ingress-nginx
    kubectl delete namespace ingress-nginx

    # verify that no resources from the ingress-nginx namespace exist
    kubectl get all -A

    # follow the ingress-nginx install and LoadBalancer configuration steps above again to reapply and reconfigure ingress-nginx
    ```
Extra Configurations
This section includes instructions for various installations and configurations that are optional, but may be useful for your cluster needs.
Install zsh w/Oh-my-zsh and Configure Plugins
1. `sudo apt-get install zsh` to install Z shell (zsh)
2. `chsh -s $(which zsh)` to set the default shell to zsh
3. `sudo apt-get install git wget` to install the git and wget packages
    - make sure to install `git` and not `git-all`, because git-all will replace `systemd` with `sysv`, consequently stopping both Docker and K8s processes; if you did accidentally install `git-all` see the Side Notes below
4. Install the Oh-my-zsh framework

    ```
    wget https://github.com/robbyrussell/oh-my-zsh/raw/master/tools/install.sh -O - | zsh
    cp ~/.oh-my-zsh/templates/zshrc.zsh-template ~/.zshrc
    source .zshrc
    ```

5. Install the zsh syntax highlighting plugin

    ```
    git clone https://github.com/zsh-users/zsh-syntax-highlighting.git
    mv zsh-syntax-highlighting ~/.oh-my-zsh/plugins
    echo "source ~/.oh-my-zsh/plugins/zsh-syntax-highlighting/zsh-syntax-highlighting.zsh" >> ~/.zshrc
    ```

6. Install the zsh auto-suggestions plugin

    ```
    git clone https://github.com/zsh-users/zsh-autosuggestions
    mv zsh-autosuggestions ~/.oh-my-zsh/custom/plugins
    ```

7. `sudo nano ~/.zshrc` and modify the plugin list to include the following

    ```
    plugins=(git docker kubectl zsh-autosuggestions)
    ```

8. `source .zshrc` to refresh the shell
9. Install the Powerlevel10k theme

    ```
    git clone --depth=1 https://github.com/romkatv/powerlevel10k.git ${ZSH_CUSTOM:-$HOME/.oh-my-zsh/custom}/themes/powerlevel10k
    ```

10. `sudo nano ~/.zshrc` and set the theme to powerlevel10k

    ```
    ZSH_THEME="powerlevel10k/powerlevel10k"
    ```

11. `source .zshrc` and go through the p10k setup process
Side Notes
- If `git-all` was accidentally installed, it is more than likely that PID 1 was configured to use sysv instead of systemd; check this by running `ps -p 1`. If it shows "systemd" then you should be fine (reboot first and double check again to make sure); otherwise, if "init" is displayed then follow the commands below

    ```
    # checks PID 1 which should display "init" (sysv)
    ps -p 1

    # systemd-sysv will uninstall sysv and revert back to systemd
    sudo apt-get install systemd-sysv

    # reboot RPi for changes to take effect after sysv has been removed
    sudo reboot

    # after rebooting, double check that PID 1 is "systemd"
    ps -p 1

    # K8s and Docker should both be running again if PID 1 is systemd
    # you do not need to repeat any K8s related setup again
    ```
- If some commands can no longer be found while using zsh, it likely means your `$PATH` variable got screwed up; to fix this do the following

    ```
    # temporarily switch the default shell back to bash
    chsh -s $(which bash)

    # after typing in your password, close the terminal and log back into the RPi
    # once logged back into the RPi, your terminal should be back to using bash as the default shell

    # copy the output of the echo command below
    echo $PATH

    # update .zshrc to export the correct PATH variable on start up
    nano ~/.zshrc

    # uncomment and replace...
    export PATH=$HOME/bin:/usr/local/bin:$PATH
    # with...
    export PATH=<output-from-echo-$PATH>

    # exit nano

    # switch the default shell back to zsh
    chsh -s $(which zsh)

    # close the shell and ssh back into the RPi
    ```
- The `docker` and `kubectl` Oh-my-zsh plugins add tab completion for both commands; as a bonus, the kubectl plugin also adds aliases for common kubectl commands such as `k` for `kubectl`
- If pasting to the terminal is slow, add `DISABLE_MAGIC_FUNCTIONS="true"` to the top of `~/.zshrc` and restart zsh with `exec zsh`
Kubernetes Dashboard Setup
This is a quick way to set up, run, and access the Kubernetes Dashboard remotely from another host outside the cluster network, such as the computer used to ssh into the RPi cluster. These steps have been adapted from the official Kubernetes Dashboard documentation and Oracle's Access the Kubernetes Dashboard guide.
1. Deploy the K8s dashboard from the master node with the following command

    ```
    kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0/aio/deploy/recommended.yaml
    ```

2. `kubectl proxy` to start the proxy forwarding traffic to the internal pod where the dashboard is running
3. Use the command below to get the Bearer token needed to log in to the dashboard; alternatively, a sample user with a corresponding Bearer token can be created by following this guide (a sketch of that approach follows this list)

    ```
    kubectl -n kube-system describe $(kubectl -n kube-system \
        get secret -n kube-system -o name | grep namespace) | grep token:
    ```

4. Run the following command from the remote device where the dashboard will be accessed; replace `<username>` with the username of the master node (the default is `pi`) and `<ip-address>` with the master node's ip address (may use `ifconfig wlan0` if the master node is the jump box)

    ```
    ssh -L 8001:127.0.0.1:8001 <username>@<ip-address>
    ```

5. Navigate to the following address on the remote device to access the dashboard UI

    ```
    http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
    ```

6. Select the "Token" login option, paste the value of the token received in step 3, and click "Sign in"
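For reference, a minimal sketch of the sample-user approach mentioned in step 3, adapted from the dashboard's creating-sample-user guide; the `admin-user` name is just the convention from that guide, and the binding grants full `cluster-admin` rights, so treat the resulting token accordingly.

```
# dashboard-admin-user.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard
```

After `kubectl apply -f dashboard-admin-user.yaml`, the Bearer token can be retrieved with `kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep admin-user | awk '{print $1}')`.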
Install Calico CNI
- did not work (see side notes)
- get the calico yaml: `curl https://docs.projectcalico.org/manifests/calico.yaml -O`
- open `calico.yaml` in nano and search for `192.168.0.0/16`
- uncomment and replace with:

    ```
    - name: CALICO_IPV4POOL_CIDR
      value: "10.244.0.0/16"
    ```

- `kubectl create -f https://docs.projectcalico.org/manifests/tigera-operator.yaml`
- `curl https://docs.projectcalico.org/manifests/custom-resources.yaml -O`
- modify the default IP pool CIDR to match the pod network CIDR (10.244.0.0/16): `nano custom-resources.yaml`
Side Notes
- Calico could be used, but it would require installation of an arm64 Raspbian image (currently in beta)
- Calico only supports amd64 and arm64 (as of 12/10)
Configure iTerm Window Arrangement and Profiles
- `ssh pi@routerPi.local`
- `ssh -t pi@routerPi.local 'ssh pi@workerNode1.local'`
- `ssh -t pi@routerPi.local 'ssh pi@workerNode2Pi.local'`
- `ssh -t pi@routerPi.local 'ssh pi@workerNode3Pi.local'`
Install Prometheus and Grafana
This section deploys Prometheus and Grafana to your cluster and exposes them externally through an Ingress. While there are many ways to deploy Prometheus and Grafana to K8s, the kube-prometheus project makes this significantly easier without needing Helm or writing any yamls; unfortunately, kube-prometheus does not currently use Docker images that have support for armhf and will therefore fail to properly deploy on an armhf RPi cluster. Fortunately, Carlos Eduardo's cluster-monitoring project has ported the kube-prometheus project to armhf, which is what will be used in the following steps to deploy Prometheus and Grafana.
1. `git clone https://github.com/carlosedp/cluster-monitoring && cd cluster-monitoring`
2. `sudo apt-get update -y && sudo apt-get install -y golang` to install go, which is needed for some of the make commands
    - `sudo apt-get install -y build-essential` if make is not installed
3. `ifconfig wlan0` on the RPi jump box to get the cluster's external ip address (used in the next step)
4. Configure the yaml files that will deploy Prometheus and Grafana to your cluster; follow the appropriate section if you'd like to record and display temperature metrics

    **With Temperature Metrics**

    ```
    # edit vars.jsonnet to enable temp metrics and configure the ingress ip address
    nano vars.jsonnet
    # set "enabled" to true for "armExporter" under "modules"
    # set "suffixDomain" to the ip address found in step 3 and append ".nip.io" to the end of it
    # e.g. 192.168.0.1 => 192.168.0.1.nip.io

    make vendor
    make
    make deploy
    ```

    **Without Temperature Metrics**

    ```
    # replace <ip-address> with the ip address found in step 3
    make change_suffix suffix=<ip-address>.nip.io
    make deploy
    ```
    - If an error occurs while applying the manifests, rerun `make deploy` or `kubectl apply -f manifests/`
5. If you haven't already, add the following `iptables` rules to forward http and https traffic from wlan0 to the LoadBalancer's external-ip, which can be found by running `kubectl get ingress -n ingress-nginx`

    ```
    sudo iptables -t nat -I PREROUTING -i wlan0 -p tcp --dport 80 -j DNAT --to <lb-external-ip>:80
    sudo iptables -t nat -I PREROUTING -i wlan0 -p tcp --dport 443 -j DNAT --to <lb-external-ip>:443
    sudo dpkg-reconfigure iptables-persistent
    ```
6. `kubectl get ingress -n monitoring` to get the external URLs (the ".nip.io" addresses) to access Prometheus, Alertmanager, and Grafana from outside the cluster (see the quick check sketch below)
7. `kubectl get pods -n monitoring` to ensure that all pods are running properly
8. Check out Jeff Geerling's RPi Cluster Episode 4 video as he walks through a very similar setup process on camera and shows how to configure the custom Grafana dashboard that comes with the cluster-monitoring project (around the 17:19 minute mark)
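A quick way to confirm the monitoring ingresses are reachable from outside the cluster is sketched below; the exact `.nip.io` hostnames depend on the suffix configured in step 4, so substitute whatever `kubectl get ingress -n monitoring` prints.

```
# list the monitoring ingresses and note the HOSTS column
kubectl get ingress -n monitoring

# from the remote machine, request one of the printed hosts
# (replace the placeholder with the actual Grafana host from the output)
curl -kIL http://<grafana-host-from-output>
# an HTTP 200 (or a redirect to the Grafana login page) indicates the
# ingress controller and DNAT rules are routing traffic correctly
```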
Side Notes
- If the prometheus-adapter pod is constantly crashing and throwing the error below, it may help to delete the entire monitoring namespace that was created by the cluster-monitoring project and redeploy kube-prometheus again with `make deploy`; I'd also recommend deleting all the kube-proxy pods in the `kube-system` namespace to manually restart them before redeploying kube-prometheus again

    ```
    communicating with server failed: Get \"https://10.96.0.1:443/version?timeout=32s\": dial tcp 10.96.0.1:443: i/o timeout
    ```

- If the kube-state-metrics pod is constantly crashing, I found that deleting the kube-flannel-ds pod on the same node seemed to resolve the issue; `sudo ip link delete flannel.1` should be run first before deleting the flannel pod off the same node
Install EFK Stack (Elasticsearch, Fluent Bit, Kibana)
This section deploys Fluent Bit to the RPi K8s cluster and installs Elasticsearch and Kibana on a reachable host outside the cluster. Unfortunately, the ELK stack currently has Docker images that support only `arm64` and `amd64` architectures (theoretically, the full ELK stack could be run on an RPi K8s cluster providing that each node runs on an arm64 architecture); while custom `armhf` Docker images could be built for the ELK stack, it's a lot easier to just run ELK on another host outside the cluster network that has a 64-bit architecture. One major benefit to running both Elasticsearch and Kibana outside of the cluster is that it will not add extra CPU/memory load to the cluster; this is especially important if you have plans to run other resource-intensive services.

Fluent Bit was chosen over Fluentd because it is a more lightweight solution that requires fewer resources to run optimally within the cluster; it is important to note that while Fluentd has Docker images that support `armhf`, the Fluentd Kubernetes DaemonSet images needed for Fluentd to push logs to Elasticsearch do not have `armhf` support. Richard Youngkin talks about this in his guide, K8s Application Monitoring on a RPi Cluster, and has already created a Docker image that can be used to run Fluentd on `armhf` with Elasticsearch. The following steps have been adapted from Fluent Bit's Kubernetes Guide and the installation documentation for Elasticsearch and Kibana on MacOS with `brew`.
1. On an external host outside the cluster that is on the same LAN as the RPi jump box (e.g. the laptop used to ssh into the cluster), execute the following commands to install Elasticsearch and Kibana (for MacOS only); for Linux and Windows, follow Installing Elasticsearch and Installing Kibana to download both packages as a zip

    ```
    # MacOS only
    # add the elastic repository to brew
    brew tap elastic/tap

    # install elasticsearch
    brew install elastic/tap/elasticsearch-full

    # install kibana
    brew install elastic/tap/kibana-full
    ```

2. Get the ip address of the external host which Elasticsearch and Kibana will be run on (use `ifconfig`); this is important so that we can bind `localhost` to an actual ip address which Fluent Bit can access from within the cluster in later steps
3. Execute the `configure.sh` script (pass in the ip address found in the previous step) located in this repository to configure Elasticsearch and Kibana if they were installed via `brew` for MacOS; for Linux and Windows, navigate to the `config/` folder in the unzipped Elasticsearch and Kibana packages to make the following changes

    **Using configure.sh**

    ```
    # use this script if Elasticsearch and Kibana were installed via brew
    ./configure.sh <ip-address>

    # for Linux/Windows, use the manual method below
    ```

    **Elasticsearch**

    ```
    # in config/elasticsearch.yml, add the following under the "Network" section
    # replace <ip-address> with the ip found in step 2
    network.bind_host: <ip-address>
    http.port: 9200
    transport.host: localhost
    transport.tcp.port: 9300
    ```

    **Kibana**

    ```
    # in config/kibana.yml, uncomment and modify the following
    # replace <ip-address> with the ip found in step 2
    elasticsearch.hosts: ["http://<ip-address>:9200"]
    ```
4. In two separate terminal windows run `elasticsearch` and `kibana` using the following commands

    ```
    # for MacOS (installed by brew)
    # in first window run
    elasticsearch
    # in second window run
    kibana

    # for Linux
    # navigate to the unzipped elasticsearch package
    # in first window run
    ./bin/elasticsearch
    # navigate to the unzipped kibana package
    # in second window run
    ./bin/kibana

    # for Windows
    # navigate to the unzipped elasticsearch package
    # in first window run
    .\bin\elasticsearch.bat
    # navigate to the unzipped kibana package
    # in second window run
    .\bin\kibana.bat
    ```

5. `curl http://<ip-address>:9200`, where `<ip-address>` is the ip found in step 2, to check if Elasticsearch is running properly
6. Navigate to `<ip-address>:5601/status` to check if Kibana is running properly
7. Run the following commands to create a `logging` namespace and configure the cluster for Fluent Bit

    ```
    kubectl create namespace logging
    kubectl apply -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/fluent-bit-service-account.yaml
    kubectl apply -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/fluent-bit-role.yaml
    kubectl apply -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/fluent-bit-role-binding.yaml
    kubectl apply -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/output/elasticsearch/fluent-bit-configmap.yaml
    ```

8. Configure the Fluent Bit DaemonSet so it points at the external Elasticsearch host, then apply it to all nodes within the cluster (see the sketch after this list for the kind of edit involved)

    ```
    # retrieve the Fluent Bit DaemonSet yaml and save it to a file
    curl https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/output/elasticsearch/fluent-bit-ds.yaml -o fluent-bit-ds.yaml

    # edit the Fluent Bit DaemonSet yaml to point it at the external Elasticsearch host
    nano fluent-bit-ds.yaml

    # apply the edited DaemonSet to the cluster
    kubectl apply -f fluent-bit-ds.yaml
    ```

9. `kubectl get pods -n logging` to ensure that all Fluent Bit pods are running properly and are forwarding logs to Elasticsearch
10. You should now start to see logs in Kibana after an index is created under the "Discover" page
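For reference, a sketch of the kind of edit step 8 refers to; it assumes the upstream `fluent-bit-ds.yaml` from the fluent-bit-kubernetes-logging repo, which (at the time of writing) reads the Elasticsearch endpoint from `FLUENT_ELASTICSEARCH_HOST`/`FLUENT_ELASTICSEARCH_PORT` environment variables. Replace the host value with the ip address found in step 2.

```
# excerpt from the container spec in fluent-bit-ds.yaml
# point Fluent Bit at the external Elasticsearch host instead of an in-cluster service
env:
  - name: FLUENT_ELASTICSEARCH_HOST
    value: "<ip-address>"   # ip of the external host running Elasticsearch (from step 2)
  - name: FLUENT_ELASTICSEARCH_PORT
    value: "9200"
```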
Side Notes
- `brew services start elastic/tap/elasticsearch-full` to run Elasticsearch as a service on boot (MacOS only)
- `brew services start elastic/tap/kibana-full` to run Kibana as a service on boot (MacOS only)
References
- Disabling Swap
- Alex Ellis' K8s On Raspbian Repo
- Tim Downey’s RPi Router Guide
- Richard Youngkin’s RPi K8s Cluster Guide
- Sean Duffy’s RPi K8s Cluster Guide
- Install zsh On Linux
- Remote Kubernetes Dashboard
- How To Expose Your Services With Kubernetes
- Carlos Eduardo’s Cluster Monitoring Repo
- Jeff Geerling’s RPi Cluster Episode 4
TODO
- need to fix headless RPi setup section such that only the master node/jump box has a wpa_supplicant created for it; all other nodes should be accessed via sshing into the master node first and then into the respective worker node
- simplify steps for fixing PATH var
- review if any steps can be simplified by using another package (e.g. sed, awk, etc.)
- ~add Weave Net instructions to docs~
- add iTerm2 profile config instructions
- set up ansible playbooks:
- RPi router configuration
- RPi disable swap and SSH key setup
- RPi kubernetes setup
- disable SSH password access (keys only)
- set up a reverse SSH tunnel to allow for direct SSH into worker nodes in the internal cluster network from MBP without needing to SSH from the Pi router
- add a check to the configure script to see if the configuration has already been added to either the elasticsearch or kibana config files; if the config has already been added, simply modify the ip currently in the file