K8s - Deploying "hitless" and HA Load-balancing
In this blog, we will discuss how to deploy hitless and HA load-balancing (Stateful High Availability) with LoxiLB for a bare-metal Kubernetes deployment. Most readers will be aware that almost all workloads in Kubernetes are abstracted as services of different types. Kubernetes Services provide a way of abstracting access to a group of pods as a network service. The group of pods backing each service is usually defined using a label selector. There are three main types of services:
ClusterIP
NodePort
LoadBalancer
The most commonly used k8s service is ClusterIP, which is used for internal communication inside a Kubernetes cluster, while the LoadBalancer service is used when one needs to allow access to a Kubernetes service from the outside world. This post is dedicated to the LoadBalancer service implementation. The entity or application implementing service type LoadBalancer should ideally have the following characteristics:
Implement the K8s load-balancer spec and hook up with the Kubernetes API server
Allocate a global IP (or an IP from a predefined range) for the particular service (see the Service sketch just after this list)
Support a wide range of endpoint selection methods, especially WRR, RR or sticky
Support high availability with state-synchronization capability for hitless and fast failovers
Support some form of routing protocol (mostly BGP) to import and export the services
Additionally, support scalable and independent endpoint/node health checks
Log all LB connections and related state for audit purposes
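To make the first two points concrete: an implementation watches Services of type LoadBalancer through the API server and writes the allocated IP back into the Service status. Below is a rough sketch of how such a Service object looks once an external IP has been assigned; the name, ports and selector are purely illustrative, and the IP simply matches the external CIDR used later in this blog.
apiVersion: v1
kind: Service
metadata:
  name: example-svc            # illustrative name
spec:
  type: LoadBalancer
  selector:
    app: example               # illustrative label selector
  ports:
  - port: 80
    targetPort: 8080
status:
  loadBalancer:
    ingress:
    - ip: 123.123.123.2        # written back by the LB implementation from its external range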
Before we start, let’s quickly wrap our heads around the need for state synchronization, LB clustering and fast failovers. State, in load-balancing terms, means active session state: for example TCP states, conntrack entries, 5-tuple information, statistics, etc. Even before the advent of HTTP/2 and WebSocket, we needed to preserve long-lived TCP connections for various applications. Some apps might simply take a long time to reconnect, while others might trigger a traffic-retransmit frenzy. For mission-critical applications like 5G or RAN, the side effects, as we all know, could be extremely adverse. Load balancers deployed in clusters need distributed intelligence about such states so that, in case of a node failure, any other node in the LB pool can take over session forwarding without disruption to, or knowledge of, the user, the application or the serving endpoints.
We will deploy the following topology as a backdrop for this blog:

We will install a Kubernetes cluster comprising four nodes and, to provide hitless failover, we will deploy loxilb as a cluster of at least two nodes. Each loxilb instance will use keepalived to determine a particular node’s state. The idea behind loxilb HA clustering is to map each k8s LB service to a pre-defined keepalived instance and use that instance’s virtual IP address as its reachable point. For the sake of simplicity, we will have only one instance in this blog. External service IP addresses, with their next-hop rewritten to the virtual IP of the instance, will be advertised to the BGP peers of loxilb.
Under LoxiLB's hood
loxilb is based on eBPF and utilizes a wide variety of eBPF maps to maintain LB session states. It further uses eBPF programs hooked to kprobes to monitor changes to its eBPF maps and performs the necessary cluster-wide map synchronization with some help from Golang’s rpc package (not gRPC). The choice of Golang's rpc vs. gRPC is a topic for another day.
loxilb works in three NAT modes (a hedged loxicmd example follows the list):
0 - default (only DNAT)
1 - onearm (source IP is changed to the load balancer’s interface IP)
2 - fullNAT (source IP is changed to the virtual IP)
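When loxilb is driven manually (outside of the CCM flow shown later), the NAT mode is chosen per LB rule. The commands below are only a sketch from our recollection of loxicmd usage; the IPs, ports and endpoint are placeholders, and the flag names should be verified against loxicmd create lb --help for your version.
# Sketch: create a fullNAT-mode rule for an illustrative service IP and endpoint
$ sudo docker exec -it loxilb loxicmd create lb 123.123.123.2 --tcp=55001:5001 --endpoints=192.168.59.211:1 --mode=fullnat
# List the configured rules (the same command is used later in this blog)
$ sudo docker exec -it loxilb loxicmd get lb -o wide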
loxilb offers hitless clustering by distributing, or categorizing, services across different cluster instances. Each instance can have its own MASTER node for its share of the service traffic, so traffic for different services is spread across the loxilb nodes, ensuring optimal distribution. This can be achieved in two cases:
1) When we preserve the source address (default mode).
2) When we use some floating IP address as source IP (fullNAT mode).
Install k8s
The first step is to install k8s. For this blog, we have used this sample Vagrantfile to automate the process. loxilb uses its own custom cloud provider (installed later in this blog) which provides service LB management. The installation steps are as follows:
$ sudo apt install vagrant virtualbox
# Set up disk size
$ vagrant plugin install vagrant-disksize
# Run with the Vagrantfile
$ vagrant up --no-provision
$ vagrant provision
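Once provisioning completes, it is worth confirming that all the nodes have registered with the cluster before moving on; the node name below comes from our Vagrantfile and may differ in your setup.
$ vagrant ssh k8slx-01
vagrant@node1:~$ sudo kubectl get nodes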
Install LoxiLB
Download an Ubuntu 20.04 VM ISO image and spawn two VMs to run the LoxiLB cluster.
Add two host network adapters to each VM: one to communicate with the Kubernetes environment (internal) and one for the external router. (Note: in the advanced network settings, change promiscuous mode from "Deny" to "Allow All".)
Once the VM is up and running, we need to run the following steps:
# Install necessary packages in loxilb VMs
$ sudo apt update
$ sudo apt install -y net-tools bridge-utils docker.io ssh docker-compose
$ sudo service ssh restart
loxilb in cluster mode runs in tandem with goBGP and keepalived. So, the first step is to create the goBGP and keepalived configs on the system. You can download the configuration files used for this topology from here. We further use docker-compose to complete the loxilb setup. Note: please adjust all configuration files as per your setup.
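For orientation, a minimal keepalived instance definition along these lines is sketched below; the interface name, router ID, priorities and notify script are assumptions for illustration only, and the downloadable configs above are the ones actually used for this topology.
vrrp_instance default {
    state BACKUP                       # both nodes start as BACKUP; priority decides the MASTER
    interface eth1                     # assumed: the interface facing the cluster network
    virtual_router_id 51               # assumed: must match on both loxilb nodes
    priority 150                       # assumed: use a lower value (e.g. 100) on the second node
    advert_int 1
    virtual_ipaddress {
        11.11.11.11/24                 # the instance VIP referenced throughout this blog
    }
    notify /etc/keepalived/notify.sh   # hypothetical hook that records the node state
}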
After downloading the files on each VM, the loxilb docker container can be spawned as follows:
$ cd llb1
$ sudo docker-compose up -d
Repeat the same steps to spawn the second loxilb node. After the whole setup is up and running, we can verify the node state on each loxilb node with the following:
$ sudo docker exec -it loxilb cat /etc/shared/keepalive.state
INSTANCE default is in MASTER state vip 11.11.11.11
Install LoxiLB CCM
loxilb provides a cloud-provider plugin for K8s, and we need to install it to make service LB management come into action. You can get the yaml file from here, change it if needed and apply it on one of the K8s nodes.
$ wget https://github.com/loxilb-io/loxi-ccm/raw/master/manifests/loxi-ccm-k3s.yaml
You may need to make some changes: find apiServerURL and replace the IP addresses with the loxilb docker IPs (the ones facing the Kubernetes network). externalCIDR is the range from which external service IPs are allocated, setBGP advertises the services over goBGP, and setLBMode selects the NAT mode described earlier (2 = fullNAT):
data:
  loxiccmConfigs: |
    apiServerURL:
    - "http://12.12.12.1:11111"
    - "http://14.14.14.1:11111"
    externalCIDR: "123.123.123.0/24"
    setBGP: true
    setLBMode: 2
Now, simply apply it:
$ sudo kubectl apply -f loxi-ccm-k3s.yaml
$ sudo kubectl get pods -n kube-system | grep loxi
loxi-cloud-controller-manager-55xqt 1/1 Running 0 21s
loxi-cloud-controller-manager-rvrgz 1/1 Running 0 21s
External Router configuration
We are using the bird BGP daemon on an external router. Its configuration can be found here. Save the configuration as /etc/bird/bird.conf and restart the bird service. Any other BGP router could be used in its place.
$ sudo apt-get install bird2 --yes
## Change bird.conf as needed
$ sudo systemctl restart bird
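For reference, the relevant fragment of a bird2 configuration for peering with loxilb looks roughly like the sketch below; the loxilb ASN and peer address are assumptions (we assume bird peers with the keepalived VIP), so treat the downloadable configuration above as the source of truth.
# Fragment of /etc/bird/bird.conf (sketch only)
protocol bgp loxilb1 {
    local as 65001;                 # the external router's ASN, as seen in the gobgp output below
    neighbor 11.11.11.11 as 64512;  # assumed VIP and ASN for loxilb; match your gobgp config
    ipv4 {
        import all;                 # learn the advertised external service routes
        export all;
    };
}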
Kubernetes Nodes Networking
In Kubernetes, networking can be set up in different ways. Internal networks can be configured as L2, L3 or VxLAN, and nodes may or may not run routing protocols such as BGP to advertise routes. In this blog, we have used Calico in Direct/NoEncapMode (unencapsulated) as the default CNI. The Calico BGP configuration can be downloaded here. To apply the Calico BGP configuration:
$ sudo calicoctl apply -f calico-bgp-config.yaml
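The applied configuration essentially tells Calico about the loxilb nodes as global BGP peers. A rough sketch of one such peer definition is shown below; the ASN is a placeholder and the peer IP is simply one of the global peers visible in the calicoctl output that follows, so the linked calico-bgp-config.yaml remains the authoritative version.
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: loxilb-peer-1          # illustrative name
spec:
  peerIP: 192.168.59.101       # a loxilb node's k8s-facing address in our topology
  asNumber: 64513              # placeholder ASN; use the value from your gobgp configuration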
Verify BGP connections

At this point, the BGP topology/connections created are as follows:
LoxiLB Node(s):
$ sudo docker exec -it loxilb gobgp neigh
Peer              AS      Up/Down   State   |#Received  Accepted
11.11.11.1        65001   01:07:18  Establ  |        4         4
192.168.59.211    64512   01:07:22  Establ  |        1         1
192.168.59.212    64512   01:07:22  Establ  |        1         1
192.168.59.213    64512   01:07:22  Establ  |        1         1
192.168.59.214    64512   01:07:22  Establ  |        1         1
Calico Node(s):
$ vagrant ssh k8slx-01
...
vagrant@node1:~$ sudo calicoctl node status
Calico process is running.
IPv4 BGP status
+----------------+-------------------+-------+------------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+----------------+-------------------+-------+------------+-------------+
| 192.168.59.212 | node-to-node mesh | up | 2023-01-04 | Established |
| 192.168.59.213 | node-to-node mesh | up | 2023-01-04 | Established |
| 192.168.59.214 | node-to-node mesh | up | 2023-01-04 | Established |
| 192.168.59.101 | global | up | 07:33:22 | Established |
| 192.168.59.111 | global | up | 07:17:56 | Established |
+----------------+-------------------+-------+------------+-------------+
Verifying Service
Let's create some worker services in k8s. We have used this iperf.yaml to create a service which runs the popular iperf tool. Apply it on a K8s node:
vagrant@node1:~$ sudo kubectl apply -f iperf.yaml
service/iperf-service created
pod/iperf1 created
pod/iperf2 created
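The linked iperf.yaml is the authoritative manifest; roughly speaking, it pairs a LoadBalancer Service with iperf server pods selected by label, along the lines of the sketch below (the image, label and target port are illustrative placeholders, while the service port matches the output further down).
apiVersion: v1
kind: Service
metadata:
  name: iperf-service
spec:
  type: LoadBalancer
  selector:
    app: iperf                    # illustrative label used to select the iperf pods
  ports:
  - port: 55001                   # the service port seen in the kubectl output below
    targetPort: 5001              # assumed iperf server port
---
apiVersion: v1
kind: Pod
metadata:
  name: iperf1
  labels:
    app: iperf
spec:
  containers:
  - name: iperf
    image: example/iperf:latest   # placeholder image; see the linked iperf.yaml for the actual one
    command: ["iperf", "-s"]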
Check that the external LB service has been created:
$ sudo kubectl get svc
NAME            TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)           AGE
iperf-service   LoadBalancer   10.233.5.183   123.123.123.2   55001:32573/TCP   24s
kubernetes      ClusterIP      10.233.0.1     <none>          443/TCP           1d
This output confirms that the service of type LoadBalancer, "123.123.123.2:55001", has been created. We can verify the created service on any loxilb node:
$ sudo docker exec -it loxilb loxicmd get lb -o wide
Last but not least, we can send traffic to the newly created iperf service by running an iperf client from the external host:
$ sudo iperf -c 123.123.123.2 -t 5 -i 1 -p 55001
Quick Demo of hitless failover
Watch this video to see the hitless failover of an iperf service. In the video, two instances of loxilb run as a cluster: one loxilb instance acquires the MASTER state and the other becomes the BACKUP. A client accesses the service using the service IP and port, the connection is established and traffic starts flowing. The MASTER loxilb does the connection tracking, and the session state is synced to the BACKUP instance. You will notice that when the MASTER loxilb instance is stopped, the BACKUP instance becomes the new MASTER and the connection between the client and the service stays intact.
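If you want to try the failover yourself rather than only watching the video, a rough sequence under this setup is to keep the iperf test from the previous section running, stop the loxilb container on the current MASTER node, and then check the keepalived state on the other node; the iperf session should continue uninterrupted.
# On the current MASTER loxilb node: stop the loxilb container
$ sudo docker stop loxilb
# On the other loxilb node: confirm it has taken over as MASTER
$ sudo docker exec -it loxilb cat /etc/shared/keepalive.state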