K8s - Deploying "hitless" and HA Load-balancing
In this blog, we will discuss how to deploy hitless and HA load-balancing (Stateful High Availability) with LoxiLB for a bare-metal Kubernetes deployment. Most readers will be aware that almost all workloads in Kubernetes are abstracted as services of different types. Kubernetes Services provide a way of abstracting access to a group of pods as a network service. The group of pods backing each service is usually defined using a label selector. There are three main types of services:
ClusterIP
NodePort
LoadBalancer
The most commonly used k8s service is ClusterIP, which is used for internal communication inside a Kubernetes cluster, while the LoadBalancer service is used when one needs to allow access to a Kubernetes service from the outside world. This post is dedicated to the LoadBalancer service implementation. The entity or application implementing service type LoadBalancer should ideally have the following characteristics:
Implement the K8s load-balancer spec and hook up with the Kubernetes API server
Allocate a global IP (or an IP from a predefined range) for the particular service (see the Service sketch just after this list)
Support a wide range of endpoint selection methods, especially WRR, RR or sticky
Support high availability with state-synchronization capability for hitless and fast failovers
Support some form of routing protocol (mostly BGP) to import and export the services
Additionally, support scalable and independent endpoint/node health checks
Log all LB connections and related state for audit purposes
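To make the first two points concrete: an implementation watches Services of type LoadBalancer through the API server and writes the allocated IP back into the Service status. Below is a rough sketch of how such a Service object looks once an external IP has been assigned; the name, ports and selector are purely illustrative, and the IP simply matches the external CIDR used later in this blog.
apiVersion: v1
kind: Service
metadata:
  name: example-svc            # illustrative name
spec:
  type: LoadBalancer
  selector:
    app: example               # illustrative label selector
  ports:
  - port: 80
    targetPort: 8080
status:
  loadBalancer:
    ingress:
    - ip: 123.123.123.2        # written back by the LB implementation from its external range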
Before we start, let’s quickly wrap our heads around the need for state synchronization, LB clustering and fast failovers. State, in load-balancing terms, means active session state: for example TCP states, conntrack entries, 5-tuple information, statistics, etc. Even before the advent of HTTP/2 and WebSocket, we needed to preserve long-lived TCP connections for various applications. Some apps might simply take a long time to reconnect, while others might trigger a traffic-retransmit frenzy. For mission-critical applications like 5G or RAN, the side effects, as we all know, could be extremely adverse. Load balancers deployed in clusters need distributed intelligence about such states so that, in case of a node failure, any other node in the LB pool can take over session forwarding without disruption to, or knowledge of, the user, the application or the serving endpoints.
We will deploy the following topology as a backdrop for this blog:

We will install a Kubernetes cluster comprising four nodes and, to provide hitless failover, we will deploy loxilb as a cluster of at least two nodes. Each loxilb instance will use keepalived to determine a particular node’s state. The idea behind loxilb HA clustering is to map each k8s LB service to a pre-defined keepalived instance and use that instance’s virtual IP address as its reachable point. For the sake of simplicity, we will have only one instance in this blog. External service IP addresses, with their next-hop rewritten to the virtual IP of the instance, will be advertised to the BGP peers of loxilb.
Under LoxiLB's hood
loxilb is based on eBPF and utilizes a wide variety of eBPF maps to maintain LB session states. It further uses eBPF programs hooked to kprobes to monitor changes to its eBPF maps and performs the necessary cluster-wide map synchronization with some help from Golang’s rpc package (not gRPC). The choice of Golang's rpc vs. gRPC is a topic for another day.
loxilb works in three NAT modes (a hedged loxicmd example follows the list):
0 - default (only DNAT)
1 - onearm (source IP is changed to the load balancer’s interface IP)
2 - fullNAT (source IP is changed to the virtual IP)
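When loxilb is driven manually (outside of the CCM flow shown later), the NAT mode is chosen per LB rule. The commands below are only a sketch from our recollection of loxicmd usage; the IPs, ports and endpoint are placeholders, and the flag names should be verified against loxicmd create lb --help for your version.
# Sketch: create a fullNAT-mode rule for an illustrative service IP and endpoint
$ sudo docker exec -it loxilb loxicmd create lb 123.123.123.2 --tcp=55001:5001 --endpoints=192.168.59.211:1 --mode=fullnat
# List the configured rules (the same command is used later in this blog)
$ sudo docker exec -it loxilb loxicmd get lb -o wide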
loxilb offers hitless clustering by distributing, or categorizing, services across different cluster instances. Each instance can have its own MASTER node for its share of the service traffic, so traffic for different services is spread across the loxilb nodes, ensuring optimal distribution. This can be achieved in two cases:
1) When we preserve the source address (default mode).
2) When we use some floating IP address as source IP (fullNAT mode).
Install k8s
The first step is to install k8s. For this blog, we have used this sample Vagrantfile to automate the process. loxilb uses its own custom cloud provider (installed later in this blog) which provides service LB management. The installation steps are as follows:
$ sudo apt install vagrant virtualbox
# Set up disk size
$ vagrant plugin install vagrant-disksize
# Run with the Vagrantfile
$ vagrant up --no-provision
$ vagrant provision
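Once provisioning completes, it is worth confirming that all the nodes have registered with the cluster before moving on; the node name below comes from our Vagrantfile and may differ in your setup.
$ vagrant ssh k8slx-01
vagrant@node1:~$ sudo kubectl get nodes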
Install LoxiLB
Download an Ubuntu 20.04 VM ISO image and spawn two VMs to run the LoxiLB cluster.
Add two host network adapters to each VM: one to communicate with the Kubernetes environment (internal) and one for the external router. (Note: in the advanced network settings, change promiscuous mode from "Deny" to "Allow All".)
Once the VM is up and running, we need to run the following steps:
# Install necessary packages in loxilb VMs
$ sudo apt update
$ sudo apt install -y net-tools bridge-utils docker.io ssh docker-compose
$ sudo service ssh restart
loxilb in cluster mode runs in tandem with goBGP and keepalived. So, the first step is to create the goBGP and keepalived configs on the system. You can download the configuration files used for this topology from here. We further use docker-compose to complete the loxilb setup. Note: please adjust all configuration files as per your setup.
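For orientation, a minimal keepalived instance definition along these lines is sketched below; the interface name, router ID, priorities and notify script are assumptions for illustration only, and the downloadable configs above are the ones actually used for this topology.
vrrp_instance default {
    state BACKUP                       # both nodes start as BACKUP; priority decides the MASTER
    interface eth1                     # assumed: the interface facing the cluster network
    virtual_router_id 51               # assumed: must match on both loxilb nodes
    priority 150                       # assumed: use a lower value (e.g. 100) on the second node
    advert_int 1
    virtual_ipaddress {
        11.11.11.11/24                 # the instance VIP referenced throughout this blog
    }
    notify /etc/keepalived/notify.sh   # hypothetical hook that records the node state
}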
After downloading the files on each VM, the loxilb docker container can be spawned as follows:
$ cd llb1
$ sudo docker-compose up -d
Repeat the same steps to spawn the second loxilb node. After the whole setup is up and running, we can verify the node state on each loxilb node with the following:
$ sudo docker exec -it loxilb cat /etc/shared/keepalive.state
INSTANCE default is in MASTER state vip 11.11.11.11
Install LoxiLB CCM
loxilb provides a cloud-provider plugin for K8s, and we need to install it to make service LB management come into action. You can get the yaml file from here, change it if needed and apply it on one of the K8s nodes.
$ wget https://github.com/loxilb-io/loxi-ccm/raw/master/manifests/loxi-ccm-k3s.yaml
You may need to make some changes: find apiServerURL and replace the IP addresses with the loxilb docker IPs (the ones facing the Kubernetes network). externalCIDR is the range from which external service IPs are allocated, setBGP advertises the services over goBGP, and setLBMode selects the NAT mode described earlier (2 = fullNAT):
data:
  loxiccmConfigs: |
    apiServerURL:
    - "http://12.12.12.1:11111"
    - "http://14.14.14.1:11111"
    externalCIDR: "123.123.123.0/24"
    setBGP: true
    setLBMode: 2
Now, simply apply it:
$ sudo kubectl apply -f loxi-ccm-k3s.yaml
$ sudo kubectl get pods -n kube-system | grep loxi
loxi-cloud-controller-manager-55xqt 1/1 Running 0 21s
loxi-cloud-controller-manager-rvrgz 1/1 Running 0 21s
External Router configuration
We are using the bird BGP daemon on an external router. Its configuration can be found here. Save the configuration as /etc/bird/bird.conf and restart the bird service. Any other BGP router could be used in its place.
$ sudo apt-get install bird2 --yes
## Change bird.conf as needed
$ sudo systemctl restart bird
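For reference, the relevant fragment of a bird2 configuration for peering with loxilb looks roughly like the sketch below; the loxilb ASN and peer address are assumptions (we assume bird peers with the keepalived VIP), so treat the downloadable configuration above as the source of truth.
# Fragment of /etc/bird/bird.conf (sketch only)
protocol bgp loxilb1 {
    local as 65001;                 # the external router's ASN, as seen in the gobgp output below
    neighbor 11.11.11.11 as 64512;  # assumed VIP and ASN for loxilb; match your gobgp config
    ipv4 {
        import all;                 # learn the advertised external service routes
        export all;
    };
}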
Kubernetes Nodes Networking
In Kubernetes, networking can be set up in different ways. Internal networks can be configured as L2, L3 or VxLAN, and nodes may or may not run routing protocols such as BGP to advertise routes. In this blog, we have used Calico in Direct/NoEncapMode (unencapsulated) as the default CNI. The Calico BGP configuration can be downloaded here. To apply the Calico BGP configuration:
$ sudo calicoctl apply -f calico-bgp-config.yaml
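The applied configuration essentially tells Calico about the loxilb nodes as global BGP peers. A rough sketch of one such peer definition is shown below; the ASN is a placeholder and the peer IP is simply one of the global peers visible in the calicoctl output that follows, so the linked calico-bgp-config.yaml remains the authoritative version.
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: loxilb-peer-1          # illustrative name
spec:
  peerIP: 192.168.59.101       # a loxilb node's k8s-facing address in our topology
  asNumber: 64513              # placeholder ASN; use the value from your gobgp configuration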
Verify BGP connections

At this point, the BGP topology/connections created are as follows:
LoxiLB Node(s):
$ sudo docker exec -it loxilb gobgp neigh
Peer              AS      Up/Down   State   |#Received  Accepted
11.11.11.1        65001   01:07:18  Establ  |        4         4
192.168.59.211    64512   01:07:22  Establ  |        1         1
192.168.59.212    64512   01:07:22  Establ  |        1         1
192.168.59.213    64512   01:07:22  Establ  |        1         1
192.168.59.214    64512   01:07:22  Establ  |        1         1
Calico Node(s):
$ vagrant ssh k8slx-01
...
vagrant@node1:~$ sudo calicoctl node status
Calico process is running.
IPv4 BGP status
+----------------+-------------------+-------+------------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+----------------+-------------------+-------+------------+-------------+
| 192.168.59.212 | node-to-node mesh | up | 2023-01-04 | Established |
| 192.168.59.213 | node-to-node mesh | up | 2023-01-04 | Established |
| 192.168.59.214 | node-to-node mesh | up | 2023-01-04 | Established |
| 192.168.59.101 | global | up | 07:33:22 | Established |
| 192.168.59.111 | global | up | 07:17:56 | Established |
+----------------+-------------------+-------+------------+-------------+
Verifying Service
Let's create some worker services in k8s. We have used this iperf.yaml to create a service which runs the popular iperf tool. Apply it on a K8s node:
vagrant@node1:~$ sudo kubectl apply -f iperf.yaml
service/iperf-service created
pod/iperf1 created
pod/iperf2 created
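The linked iperf.yaml is the authoritative manifest; roughly speaking, it pairs a LoadBalancer Service with iperf server pods selected by label, along the lines of the sketch below (the image, label and target port are illustrative placeholders, while the service port matches the output further down).
apiVersion: v1
kind: Service
metadata:
  name: iperf-service
spec:
  type: LoadBalancer
  selector:
    app: iperf                    # illustrative label used to select the iperf pods
  ports:
  - port: 55001                   # the service port seen in the kubectl output below
    targetPort: 5001              # assumed iperf server port
---
apiVersion: v1
kind: Pod
metadata:
  name: iperf1
  labels:
    app: iperf
spec:
  containers:
  - name: iperf
    image: example/iperf:latest   # placeholder image; see the linked iperf.yaml for the actual one
    command: ["iperf", "-s"]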
Check that the external LB service has been created:
$ sudo kubectl get svc
NAME            TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)           AGE
iperf-service   LoadBalancer   10.233.5.183   123.123.123.2   55001:32573/TCP   24s
kubernetes      ClusterIP      10.233.0.1     <none>          443/TCP           1d
This output confirms that the service of type LoadBalancer, "123.123.123.2:55001", has been created. We can verify the created service on any loxilb node:
$ sudo docker exec -it loxilb loxicmd get lb -o wide
Last but not least, we can send traffic to the newly created iperf service by running an iperf client from the external host:
$ sudo iperf -c 123.123.123.2 -t 5 -i 1 -p 55001
Quick Demo of hitless failover
Watch this video to see the hitless failover of an iperf service. In the video, two instances of loxilb run as a cluster: one loxilb instance acquires the MASTER state and the other becomes the BACKUP. A client accesses the service using the service IP and port, the connection is established and traffic starts flowing. The MASTER loxilb does the connection tracking, and the session state is synced to the BACKUP instance. You will notice that when the MASTER loxilb instance is stopped, the BACKUP instance becomes the new MASTER and the connection between the client and the service stays intact.
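If you want to try the failover yourself rather than only watching the video, a rough sequence under this setup is to keep the iperf test from the previous section running, stop the loxilb container on the current MASTER node, and then check the keepalived state on the other node; the iperf session should continue uninterrupted.
# On the current MASTER loxilb node: stop the loxilb container
$ sudo docker stop loxilb
# On the other loxilb node: confirm it has taken over as MASTER
$ sudo docker exec -it loxilb cat /etc/shared/keepalive.state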