Set Up Highly Available HAProxy Servers for ECS Cluster
- 1. Introduction
- 2. Prerequisites
- 3. Configure a Virtual IP Address
- 4. Install and Configure HAProxy
- 5. Install and Configure Keepalived
- 6. Start Up the Keepalived Service and Test Failover
- 7. Conclusion
1. Introduction
- The keepalived daemon can be used to monitor services or systems and to automatically fail over to a standby if a failure occurs. It implements VRRP (Virtual Router Redundancy Protocol), which lets a group of hosts share a virtual IP address: one host holds the address at a time, and another host takes it over if the holder fails.
- In this guide, we will demonstrate how to use keepalived to set up high availability for your load balancers. We will configure a floating IP address that can be moved between two capable load balancers. If the primary load balancer goes down, the floating IP will be moved to the second load balancer automatically, allowing Data Service to resume.
2. Prerequisites
The above figure shows two HAProxy servers, which are connected to an externally facing network (10.0.0.0/24) as 10.0.0.11 and 10.0.0.12 and to an internal network (192.168.1.0/24) as 192.168.1.11 and 192.168.1.12. One HAProxy server (10.0.0.11) acts as the Keepalived master server and holds the virtual IP address 10.0.0.10, and the other (10.0.0.12) acts as the Keepalived backup server. Three ECS servers, ecssvr1 (192.168.1.21), ecssvr2 (192.168.1.22), and ecssvr3 (192.168.1.23), are accessible on the internal network. The IP address 10.0.0.10 is in the private address range 10.0.0.0/24, which cannot be routed on the Internet.
In order to complete this guide, you will need to build two HAProxy servers and reserve a VIP (Virtual IP Address). On each HAProxy server, you will need a non-root user configured with sudo access.
3. Configure a Virtual IP Address
On most cloud platforms, security features such as port security and security groups require that packets sent or received on a VM port carry the fixed IP/MAC address of that port. This rule prevents ARP spoofing, but it also means that traffic to and from the VIP is dropped, so the VIP cannot contact other VMs.
For example, on the OpenStack platform, you define an allowed address pair (a specific MAC address, IP address, or both) to allow network traffic to pass through a port regardless of the subnet. When you define allowed address pairs, you are able to use protocols like VRRP (Virtual Router Redundancy Protocol) that float an IP address between two VM instances to enable fast data plane failover. See Creating Multi-Master BareMetal Cluster on Platform9 Managed OpenStack VMs.
Please contact your system administrator for assistance in configuring VIP.
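If you manage the OpenStack ports yourself, the allowed address pair from the previous paragraph can be added with the `openstack` CLI. The following is a minimal sketch, assuming the CLI is installed and authenticated; the helper name `allow_vip_cmds` and the port names in the example are hypothetical placeholders. The function only prints the commands so that you can review them before running anything.

```shell
# Print (do not run) the command that allows the VIP on each given port.
# Usage: allow_vip_cmds VIP PORT...
allow_vip_cmds() (
  vip=$1
  shift
  for port in "$@"; do
    echo "openstack port set --allowed-address ip-address=${vip} ${port}"
  done
)

# Example (port names are placeholders for the ports of the two HAProxy VMs):
# allow_vip_cmds 10.0.0.10 haproxy1-port haproxy2-port
```

The VIP must be allowed on the ports of both HAProxy servers, because either server may hold the address after a failover.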
4. Install and Configure HAProxy
- Next, we will set up the HAProxy load balancers. Each one sits in front of the ECS servers and splits requests between the three ECS servers. The two load balancers are fully redundant: only one of them receives traffic at any given time.
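To illustrate what splitting requests between the three ECS servers means under round-robin balancing, here is a minimal sketch. It is a simplification: HAProxy's own scheduler also takes server weights into account and skips servers that fail their health checks. `pick_backend` is a hypothetical helper for illustration, not an HAProxy command.

```shell
# Return the backend that serves connection number N (0-based), cycling
# through the server list in order. At least one server must be given.
# Usage: pick_backend N SERVER...
pick_backend() (
  n=$1
  shift
  idx=$(( n % $# ))   # position in the list, wrapping around at the end
  shift "$idx"
  echo "$1"
)

# Example: connections 0, 1, 2, 3 go to .21, .22, .23 and then .21 again.
# for i in 0 1 2 3; do
#   pick_backend "$i" 192.168.1.21 192.168.1.22 192.168.1.23
# done
```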
4.1 Install HAProxy
- The first step we need to take on our load balancers will be to install the haproxy package.
sudo yum install -y haproxy
sudo systemctl enable haproxy
4.2 Configure HAProxy
- The first item we need to modify when dealing with HAProxy is the /etc/haproxy/haproxy.cfg file.
sudo tee /etc/haproxy/haproxy.cfg > /dev/null << EOF
global
    log 127.0.0.1 local2
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    user haproxy
    group haproxy
    daemon

defaults
    mode tcp
    log global
    option tcplog
    option dontlognull
    option redispatch
    retries 3
    maxconn 5000
    timeout connect 5s
    timeout client 50s
    timeout server 50s

# Statistics page and health endpoint on port 8081
listen stats
    bind *:8081
    mode http
    stats enable
    stats refresh 30s
    stats uri /stats
    monitor-uri /healthz

# Port 80, balanced across the three ECS servers
frontend fe_k8s_80
    bind *:80
    default_backend be_k8s_80

backend be_k8s_80
    balance roundrobin
    mode tcp
    server feng-ws1.sme-feng.athens.cloudera.com 192.168.1.21:80 check
    server feng-ws2.sme-feng.athens.cloudera.com 192.168.1.22:80 check
    server feng-ws3.sme-feng.athens.cloudera.com 192.168.1.23:80 check

# Port 443, passed through in TCP mode
frontend fe_k8s_443
    bind *:443
    default_backend be_k8s_443

backend be_k8s_443
    balance roundrobin
    mode tcp
    server feng-ws1.sme-feng.athens.cloudera.com 192.168.1.21:443 check
    server feng-ws2.sme-feng.athens.cloudera.com 192.168.1.22:443 check
    server feng-ws3.sme-feng.athens.cloudera.com 192.168.1.23:443 check
EOF
- The command above writes the whole file in one step. If you edit the file in a text editor instead, save and close it when you are finished.
- Note: This file is exactly the same on both the primary haproxy server and the secondary haproxy server.
- Check that the configuration changes we made represent valid HAProxy syntax by typing:
sudo haproxy -f /etc/haproxy/haproxy.cfg -c
- If no errors were reported, enable and restart the service on both HAProxy servers by typing:
sudo systemctl enable haproxy
sudo systemctl restart haproxy
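The check-then-restart sequence above can be combined so that a broken configuration never reaches a running load balancer. This is a minimal sketch that uses the same `haproxy -c` and `systemctl restart` commands as this section; the function name `restart_if_valid` is an assumption made for illustration.

```shell
# Restart HAProxy only if the configuration file passes the syntax check.
# Usage: restart_if_valid [CONFIG_PATH]
restart_if_valid() (
  cfg=${1:-/etc/haproxy/haproxy.cfg}
  if sudo haproxy -c -f "$cfg"; then
    sudo systemctl restart haproxy
  else
    echo "Invalid configuration in $cfg; HAProxy was not restarted" >&2
    exit 1   # leaves the function with status 1 (the body runs in a subshell)
  fi
)

# Example:
# restart_if_valid /etc/haproxy/haproxy.cfg
```

Run it on each HAProxy server in turn rather than on both at once, so that one load balancer keeps serving traffic while the other restarts.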
5. Install and Configure Keepalived
5.1 Install Keepalived
Our infrastructure is not highly available yet because we have no way of redirecting traffic if our active load balancer experiences problems. In order to rectify this, we will install the keepalived daemon on our load balancer servers. This is the component that will provide failover capabilities if our active load balancer becomes unavailable.
Install the daemon by typing:
sudo yum install -y keepalived
sudo systemctl enable keepalived
- The daemon should be installed on both of the load balancer systems.
5.2 Create the Virtual IP Transition Scripts
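- Next, we need to create a health-check script that keepalived runs periodically. When the local haproxy daemon stops unexpectedly, the script stops keepalived, which releases the virtual IP address so that the other HAProxy server can claim it.
sudo tee /etc/keepalived/check_haproxy.sh > /dev/null << EOF
#!/bin/bash
# List haproxy processes; -w skips check_haproxy.sh, [h] skips the grep itself.
HAPROXY_STATUS=\$(ps ax | grep -w '[h]aproxy')
if [ -z "\${HAPROXY_STATUS}" ]; then
    echo "HAProxy is not running"
    # Stopping keepalived releases the VIP so the peer can take over.
    killall keepalived
else
    echo "HAProxy is running"
fi
EOF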
sudo chmod +x /etc/keepalived/check_haproxy.sh
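The check script relies on two details of its `grep` call that are easy to miss. The sketch below isolates that match so it can be tried against sample `ps ax` lines; `haproxy_in_ps` is a hypothetical helper name, and the sample process lines in the usage comments are made up for illustration.

```shell
# Read `ps ax`-style lines on stdin; succeed if a line names a haproxy process.
#  -w      matches "haproxy" only as a whole word, so "check_haproxy.sh" is
#          ignored (the underscore counts as a word character).
#  [h]...  keeps the pattern from matching the grep process itself, whose
#          command line contains "[h]aproxy" rather than "haproxy".
haproxy_in_ps() {
  grep -qw '[h]aproxy'
}

# Examples:
# echo '1234 ? Ss 0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg' | haproxy_in_ps   # succeeds
# echo '2222 ? S  0:00 /bin/bash /etc/keepalived/check_haproxy.sh'   | haproxy_in_ps   # fails
```

Note that any other process whose command line contains haproxy as a whole word, such as an editor opened on /etc/haproxy/haproxy.cfg, also counts as a match, so the check can report a false positive in that case.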
5.3 Configure Keepalived for the Primary Load Balancer
- Next, on the load balancer server that you wish to use as your primary server, create the main keepalived configuration file. The daemon looks for a file called keepalived.conf inside of the /etc/keepalived directory.
sudo tee /etc/keepalived/keepalived.conf > /dev/null << EOF
! Configuration File for keepalived
global_defs {
    script_user root
    enable_script_security
}
vrrp_script check_haproxy {
    script "/etc/keepalived/check_haproxy.sh"
    interval 2
}
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.0.0.10
    }
    track_script {
        check_haproxy
    }
}
EOF
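- Note: Both load balancers are configured with state BACKUP rather than one MASTER and one BACKUP. The nopreempt option requires the initial state to be BACKUP, and it prevents a recovered higher-priority server from taking the VIP back from the server that currently holds it, which avoids unnecessary switching.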
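The effect of combining state BACKUP with nopreempt can be modeled in a few lines. This is a simplified sketch of the decision, not keepalived's implementation; `next_holder` is a hypothetical name, the node names haproxy1 and haproxy2 are used for illustration, and the priorities 100 and 90 are the values this guide uses for the primary and secondary servers.

```shell
# Decide which node holds the VIP under nopreempt.
# Usage: next_holder CURRENT_HOLDER NAME:PRIORITY...
#   CURRENT_HOLDER is the node holding the VIP now ("" if none).
#   The NAME:PRIORITY list contains only nodes whose keepalived is running.
next_holder() (
  current=$1
  shift
  best_name=""
  best_prio=-1
  for node in "$@"; do
    name=${node%%:*}
    prio=${node##*:}
    # nopreempt: a running holder keeps the VIP regardless of priority.
    if [ "$name" = "$current" ]; then
      echo "$name"
      exit 0
    fi
    if [ "$prio" -gt "$best_prio" ]; then
      best_name=$name
      best_prio=$prio
    fi
  done
  # No running holder: the highest-priority running node takes the VIP.
  echo "$best_name"
)

# Examples:
# next_holder ""       haproxy1:100 haproxy2:90   # haproxy1 (initial election)
# next_holder haproxy1 haproxy2:90                # haproxy2 (primary is down)
# next_holder haproxy2 haproxy1:100 haproxy2:90   # haproxy2 (primary is back, no switch)
```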
5.4 Configure Keepalived for the Secondary Load Balancer
- Next, we will create the corresponding configuration file on our secondary load balancer. Create the file /etc/keepalived/keepalived.conf on your secondary server:
sudo tee /etc/keepalived/keepalived.conf > /dev/null << EOF
! Configuration File for keepalived
global_defs {
    script_user root
    enable_script_security
}
vrrp_script check_haproxy {
    script "/etc/keepalived/check_haproxy.sh"
    interval 2
}
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 90
    advert_int 1
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.0.0.10
    }
    track_script {
        check_haproxy
    }
}
EOF
- This file is nearly identical to the primary server's configuration. The only item that we need to change is:
- priority: This should be set to a lower value than the primary server. We will use the value 90 in this guide.
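Because the two keepalived configurations differ only in priority, both can be produced from one template. The sketch below is one way to do that, assuming the same interface (eth0), VIP (10.0.0.10), and settings as in this guide; the function name `render_keepalived_conf` and its optional arguments are assumptions made for illustration.

```shell
# Print a keepalived.conf for this guide's setup; only the priority must vary.
# Usage: render_keepalived_conf PRIORITY [INTERFACE] [VIP]
render_keepalived_conf() (
  priority=$1
  interface=${2:-eth0}
  vip=${3:-10.0.0.10}
  cat << EOF
! Configuration File for keepalived
global_defs {
    script_user root
    enable_script_security
}
vrrp_script check_haproxy {
    script "/etc/keepalived/check_haproxy.sh"
    interval 2
}
vrrp_instance VI_1 {
    state BACKUP
    interface ${interface}
    virtual_router_id 51
    priority ${priority}
    advert_int 1
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        ${vip}
    }
    track_script {
        check_haproxy
    }
}
EOF
)

# Examples:
# render_keepalived_conf 100 | sudo tee /etc/keepalived/keepalived.conf > /dev/null   # primary
# render_keepalived_conf 90  | sudo tee /etc/keepalived/keepalived.conf > /dev/null   # secondary
```

Generating both files from one function keeps shared values such as virtual_router_id and auth_pass from drifting apart between the two servers.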
6. Start Up the Keepalived Service and Test Failover
6.1 VIP was assigned to the Primary Load Balancer
- The keepalived daemon and all of its companion scripts should now be completely configured. We can start the service on both of our load balancers by typing:
sudo systemctl start keepalived
Each daemon will monitor the local HAProxy process and listen for VRRP advertisements from the remote keepalived process.
Your primary load balancer, which should currently hold the VIP, will direct requests to each of the ECS servers in turn.
[centos@haproxy1 ~]$ sudo systemctl status haproxy
haproxy.service - HAProxy Load Balancer
Loaded: loaded (/usr/lib/systemd/system/haproxy.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2022-01-13 08:35:07 UTC; 2min 36s ago
[centos@haproxy1 ~]$ sudo systemctl status keepalived
keepalived.service - LVS and VRRP High Availability Monitor
Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2022-01-13 08:35:07 UTC; 2min 41s ago
[centos@haproxy1 ~]$ ip a
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether fa:16:3e:91:13:76 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.11/22 brd 10.0.0.255 scope global dynamic eth0
valid_lft 68159sec preferred_lft 68159sec
inet 10.0.0.10/32 scope global eth0
valid_lft forever preferred_lft forever
[centos@haproxy2 ~]$ sudo systemctl status haproxy
haproxy.service - HAProxy Load Balancer
Loaded: loaded (/usr/lib/systemd/system/haproxy.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2022-01-13 08:51:31 UTC; 22s ago
[centos@haproxy2 ~]$ sudo systemctl status keepalived
keepalived.service - LVS and VRRP High Availability Monitor
Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2022-01-13 08:51:32 UTC; 22s ago
[centos@haproxy2 ~]$ ip a
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether fa:16:3e:77:e9:c5 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.12/22 brd 10.0.0.255 scope global dynamic eth0
valid_lft 69492sec preferred_lft 69492sec
[centos@haproxy2 ~]$ ping 10.0.0.10
PING 10.0.0.10 (10.0.0.10) 56(84) bytes of data.
64 bytes from 10.0.0.10: icmp_seq=1 ttl=64 time=1.25 ms
64 bytes from 10.0.0.10: icmp_seq=2 ttl=64 time=0.389 ms
64 bytes from 10.0.0.10: icmp_seq=3 ttl=64 time=0.330 ms
64 bytes from 10.0.0.10: icmp_seq=4 ttl=64 time=0.327 ms
64 bytes from 10.0.0.10: icmp_seq=5 ttl=64 time=0.327 ms
^C
--- 10.0.0.10 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4001ms
rtt min/avg/max/mdev = 0.327/0.525/1.253/0.364 ms
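Instead of reading the `ip a` output by eye, the presence of the VIP can be tested with a small filter. This is a minimal sketch; `holds_vip` is a hypothetical helper name, and it assumes the address appears in the `inet ADDRESS/PREFIX` form shown in the output above.

```shell
# Read `ip addr` output on stdin; succeed if the given address is configured.
# Matching "inet ADDRESS/" as a fixed string avoids partial matches such as
# 10.0.0.10 inside 10.0.0.100 or 110.0.0.10.
# Usage: ip -4 addr show dev eth0 | holds_vip 10.0.0.10
holds_vip() {
  grep -Fq "inet $1/"
}

# Example, run on each load balancer:
# if ip -4 addr show dev eth0 | holds_vip 10.0.0.10; then
#   echo "this node holds the VIP"
# else
#   echo "this node does not hold the VIP"
# fi
```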
6.2 VIP was taken over by the Secondary Load Balancer
We can test failover by stopping HAProxy on our primary load balancer:
sudo systemctl stop haproxy
The health-check script detected that HAProxy was no longer running and stopped keepalived, so both services are now down on the primary load balancer. Through keepalived, the secondary server determined that the primary was no longer active. It then transitioned to the active state and claimed the virtual IP.
[centos@haproxy1 ~]$ sudo systemctl status haproxy
haproxy.service - HAProxy Load Balancer
Loaded: loaded (/usr/lib/systemd/system/haproxy.service; enabled; vendor preset: disabled)
Active: inactive (dead) since Thu 2022-01-13 08:55:53 UTC; 12s ago
[centos@haproxy1 ~]$ sudo systemctl status keepalived
keepalived.service - LVS and VRRP High Availability Monitor
Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
Active: inactive (dead) since Thu 2022-01-13 08:55:54 UTC; 13s ago
[centos@haproxy1 ~]$ ip a
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether fa:16:3e:91:13:76 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.11/22 brd 10.0.0.255 scope global dynamic eth0
valid_lft 67826sec preferred_lft 67826sec
[centos@haproxy1 ~]$ ping 10.0.0.10
PING 10.0.0.10 (10.0.0.10) 56(84) bytes of data.
64 bytes from 10.0.0.10: icmp_seq=1 ttl=64 time=0.366 ms
64 bytes from 10.0.0.10: icmp_seq=2 ttl=64 time=0.422 ms
64 bytes from 10.0.0.10: icmp_seq=3 ttl=64 time=0.449 ms
64 bytes from 10.0.0.10: icmp_seq=4 ttl=64 time=0.411 ms
64 bytes from 10.0.0.10: icmp_seq=5 ttl=64 time=0.365 ms
^C
--- 10.0.0.10 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4000ms
rtt min/avg/max/mdev = 0.365/0.402/0.449/0.039 ms
[centos@haproxy2 ~]$ ping 10.0.0.10
PING 10.0.0.10 (10.0.0.10) 56(84) bytes of data.
64 bytes from 10.0.0.10: icmp_seq=1 ttl=64 time=1.25 ms
64 bytes from 10.0.0.10: icmp_seq=2 ttl=64 time=0.389 ms
64 bytes from 10.0.0.10: icmp_seq=3 ttl=64 time=0.330 ms
64 bytes from 10.0.0.10: icmp_seq=4 ttl=64 time=0.327 ms
64 bytes from 10.0.0.10: icmp_seq=5 ttl=64 time=0.327 ms
^C
--- 10.0.0.10 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4001ms
rtt min/avg/max/mdev = 0.327/0.525/1.253/0.364 ms
[centos@haproxy2 ~]$ sudo systemctl status haproxy
haproxy.service - HAProxy Load Balancer
Loaded: loaded (/usr/lib/systemd/system/haproxy.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2022-01-13 08:51:31 UTC; 6min ago
[centos@haproxy2 ~]$ sudo systemctl status keepalived
keepalived.service - LVS and VRRP High Availability Monitor
Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2022-01-13 08:51:32 UTC; 6min ago
[centos@haproxy2 ~]$ ip a
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether fa:16:3e:77:e9:c5 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.12/22 brd 10.0.0.255 scope global dynamic eth0
valid_lft 69131sec preferred_lft 69131sec
inet 10.0.0.10/32 scope global eth0
valid_lft forever preferred_lft forever
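To see how long the VIP is unreachable during a failover, you can probe it repeatedly from a third machine while you stop HAProxy on the active load balancer. The following is a minimal sketch; the function name `count_failures` and the `PROBE_INTERVAL` variable are assumptions of this example, not part of keepalived or HAProxy.

```shell
# Run a probe command ATTEMPTS times and print how many attempts failed.
# PROBE_INTERVAL is the pause between attempts in seconds (default 1).
# Usage: count_failures ATTEMPTS COMMAND [ARG...]
count_failures() (
  attempts=$1
  shift
  failures=0
  i=0
  while [ "$i" -lt "$attempts" ]; do
    "$@" > /dev/null 2>&1 || failures=$((failures + 1))
    i=$((i + 1))
    sleep "${PROBE_INTERVAL:-1}"
  done
  echo "$failures"
)

# Example: send one ping per second for 30 seconds and count the misses.
# count_failures 30 ping -c 1 -W 1 10.0.0.10
```

With a one-second interval, the printed number is a rough estimate of the outage in seconds. A ping only shows that some host answers on the VIP; probing the service port itself would be a stricter test.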
6.3 VIP was regained by the Primary Load Balancer
- We can start HAProxy and Keepalived on the primary load balancer again:
sudo systemctl start haproxy
sudo systemctl start keepalived
- At this point the secondary load balancer still holds the VIP. Because both servers start in the BACKUP state with nopreempt set, the primary does not take the VIP back even though it has the higher priority.
[centos@haproxy1 ~]$ ip a
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether fa:16:3e:91:13:76 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.11/22 brd 10.0.0.255 scope global dynamic eth0
valid_lft 67591sec preferred_lft 67591sec
[centos@haproxy2 ~]$ ip a
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether fa:16:3e:77:e9:c5 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.12/22 brd 10.0.0.255 scope global dynamic eth0
valid_lft 69131sec preferred_lft 69131sec
inet 10.0.0.10/32 scope global eth0
valid_lft forever preferred_lft forever
- We can now stop HAProxy on the secondary load balancer:
sudo systemctl stop haproxy
- The primary load balancer will regain control of the virtual IP address within a few seconds (the health check runs every 2 seconds), and the switch will be largely transparent to the user.
[centos@haproxy1 ~]$ sudo systemctl status haproxy
haproxy.service - HAProxy Load Balancer
Loaded: loaded (/usr/lib/systemd/system/haproxy.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2022-01-13 09:00:02 UTC; 3min 30s ago
[centos@haproxy1 ~]$ sudo systemctl status keepalived
keepalived.service - LVS and VRRP High Availability Monitor
Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2022-01-13 09:00:02 UTC; 3min 32s ago
[centos@haproxy1 ~]$ ip a
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether fa:16:3e:91:13:76 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.11/22 brd 10.0.0.255 scope global dynamic eth0
valid_lft 67380sec preferred_lft 67380sec
inet 10.0.0.10/32 scope global eth0
valid_lft forever preferred_lft forever
[centos@haproxy2 ~]$ sudo systemctl status haproxy
haproxy.service - HAProxy Load Balancer
Loaded: loaded (/usr/lib/systemd/system/haproxy.service; enabled; vendor preset: disabled)
Active: inactive (dead) since Thu 2022-01-13 09:03:03 UTC; 1min 32s ago
[centos@haproxy2 ~]$ sudo systemctl status keepalived
keepalived.service - LVS and VRRP High Availability Monitor
Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
Active: inactive (dead) since Thu 2022-01-13 09:03:06 UTC; 1min 30s ago
[centos@haproxy2 ~]$ ip a
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether fa:16:3e:77:e9:c5 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.12/22 brd 10.0.0.255 scope global dynamic eth0
valid_lft 69492sec preferred_lft 69492sec
[centos@haproxy2 ~]$ ping 10.0.0.10
PING 10.0.0.10 (10.0.0.10) 56(84) bytes of data.
64 bytes from 10.0.0.10: icmp_seq=1 ttl=64 time=0.440 ms
64 bytes from 10.0.0.10: icmp_seq=2 ttl=64 time=0.291 ms
64 bytes from 10.0.0.10: icmp_seq=3 ttl=64 time=0.396 ms
64 bytes from 10.0.0.10: icmp_seq=4 ttl=64 time=0.417 ms
64 bytes from 10.0.0.10: icmp_seq=5 ttl=64 time=0.360 ms
^C
--- 10.0.0.10 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4000ms
rtt min/avg/max/mdev = 0.291/0.380/0.440/0.057 ms
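After this test, HAProxy and keepalived are both stopped on the secondary load balancer, so the pair has no standby until they are started again. The sketch below brings a node back in the same order used for the primary earlier in this section; the function name `rejoin_pair` is an assumption made for illustration.

```shell
# Bring a load balancer back into the pair: HAProxy first, then keepalived.
# keepalived is started only once HAProxy is active, because check_haproxy.sh
# would otherwise stop keepalived again on its next run.
rejoin_pair() (
  sudo systemctl start haproxy || exit 1
  if sudo systemctl is-active --quiet haproxy; then
    sudo systemctl start keepalived
  else
    echo "HAProxy is not active; keepalived was not started" >&2
    exit 1
  fi
)

# Example, run on the node that was stopped during the test:
# rejoin_pair
```

Because of nopreempt, rejoining a node does not move the VIP; the node simply becomes the standby again.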
7. Conclusion
- In this guide, we walked through the complete process of setting up a highly available, load-balanced infrastructure. This configuration works well because the active HAProxy server distributes the load across the pool of ECS servers on the backend.
- The virtual IP and keepalived configuration eliminates the single point of failure at the load balancing layer, allowing your service to continue functioning even when the primary load balancer completely fails. This configuration is fairly flexible and can be adapted to your own application environment by setting up your preferred web stack behind the HAProxy servers.