Keepalived and Virtual IPs: High Availability Without a Load Balancer

One floating address, two machines, and a failover that nobody notices

By Smarc, 02-04-2024

There is a particular flavour of homelab anxiety that arrives the moment a single box becomes load-bearing. Your reverse proxy, your DNS resolver, your little MQTT broker that the whole house now depends on — all of it pinned to one IP address on one machine that will, one day, want a kernel update at the worst possible time. The cloud answer to this is a managed load balancer, and it is a fine answer if you enjoy paying monthly for a TCP forwarder. The self-hosted answer, and a remarkably good one, is keepalived.

Keepalived implements VRRP (the Virtual Router Redundancy Protocol), the boring, battle-tested mechanism that lets two or more machines share a single floating IP. One node holds the address; if it falls over, another grabs it within a few seconds. Your clients keep talking to the same IP and never know anything happened. No load balancer, no DNS round-robin with its glacial TTLs, no application-layer cleverness. Just an IP that refuses to die. I’ve used this to make reverse-proxy reboots invisible and to front a pair of K3s control-plane nodes, and in both cases the failover was faster than anyone noticed.

How VRRP actually works

Each participating node runs keepalived and joins a virtual router identified by a number (the VRID). Within that group, nodes elect a MASTER based on a priority value (highest wins). The MASTER claims the virtual IP (VIP) and sends VRRP advertisements to a multicast group, once a second by default. The BACKUP nodes sit quietly and listen.

When the BACKUP stops hearing those advertisements for roughly three advertisement intervals (because the MASTER crashed, lost its link, or someone tripped over a cable), it promotes itself, claims the VIP, and fires off a gratuitous ARP so the switch updates its MAC table immediately. That gratuitous ARP is the magic: it tells the network “the IP you knew lives at this MAC now,” and traffic redirects without waiting for ARP caches to expire. Without it, clients would keep sending frames to the dead node’s MAC until their caches timed out, potentially minutes. With it, hosts refresh their ARP entries and the switch relearns the port in milliseconds.
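How long a BACKUP waits before promoting itself follows from simple arithmetic. The sketch below works it through, assuming the VRRPv2 timer rules from RFC 3768 (master-down interval = 3 × advert_int + skew, with skew = (256 − priority) / 256 seconds). The function name master_down_ms is made up for illustration, and real keepalived timers may differ slightly by release and VRRP version.

```shell
#!/bin/sh
# Estimate how long a backup waits before declaring the master dead.
# Assumes VRRPv2 timers (RFC 3768). Integer maths, result in milliseconds.
master_down_ms() {
    priority=$1      # this backup's priority (1-254)
    advert_int=$2    # advertisement interval in whole seconds
    skew_ms=$(( (256 - priority) * 1000 / 256 ))
    echo $(( 3 * advert_int * 1000 + skew_ms ))
}

master_down_ms 100 1   # prints 3609: about 3.6 s at priority 100, advert_int 1
```

A higher-priority backup gets a smaller skew, which is how several backups avoid promoting themselves at the same moment.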

One detail worth internalising: VRRP is a failover protocol, not a load-balancing one. At any instant exactly one node owns the VIP and serves everything. The others are warm spares. That constraint shapes everything else about how you use it.

A minimal two-node setup

Say you have two nginx boxes, 10.0.0.11 and 10.0.0.12, and you want them to share 10.0.0.10. On the primary, /etc/keepalived/keepalived.conf:

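vrrp_instance VI_1 {
    state MASTER              # initial state; priority decides the election
    interface eth0            # NIC that carries VRRP and the VIP
    virtual_router_id 51      # VRID: must match on every node in the group
    priority 150              # highest priority wins
    advert_int 1              # advertisement interval in seconds

    authentication {
        auth_type PASS
        auth_pass changeme    # must match on every node; only the first 8 characters are used
    }

    virtual_ipaddress {
        10.0.0.10/24
    }
}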

On the secondary, the file is identical except state BACKUP and priority 100. The lower priority means it only takes over when the primary is gone. Two things must match across both nodes or you’ll get a split brain: the virtual_router_id and the auth_pass. Get the VRID wrong and each node thinks it’s alone; get the password wrong and they refuse to talk. Start the service on both:

$ sudo systemctl enable --now keepalived
$ ip addr show eth0 | grep 10.0.0.10
    inet 10.0.0.10/24 scope global secondary eth0

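That secondary flag marks the VIP as an extra address alongside the interface’s primary one in the same subnet: keepalived has added it on the master. Stop keepalived on the master (sudo systemctl stop keepalived) and watch the journal on the backup: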

$ journalctl -u keepalived -f
Keepalived_vrrp[1421]: (VI_1) Backup received priority 0 advertisement
Keepalived_vrrp[1421]: (VI_1) Entering MASTER STATE
Keepalived_vrrp[1421]: (VI_1) setting VIPs.
Keepalived_vrrp[1421]: Sending gratuitous ARP on eth0 for 10.0.0.10

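Under a second. Your ping 10.0.0.10 might drop a single packet. Note the priority 0 advertisement in that log: a master only sends it when keepalived shuts down cleanly, which is why the handover is so quick. After a hard failure such as a pulled power cable, the backup first waits out about three missed advertisements, so expect three to four seconds at advert_int 1.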
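Since a mismatched VRID or password is easy to introduce when copying the file by hand, a quick comparison before the first failover test can save some head-scratching. This is a minimal sketch, assuming you have copied both nodes’ configs to one machine; the function names and the file names in the usage line are hypothetical.

```shell
#!/bin/sh
# Print a keepalived.conf with comments, blank lines, and the two settings
# that are expected to differ between nodes (state, priority) removed.
shared_settings() {
    grep -Ev '^[[:space:]]*(#|!|$)' "$1" |
        grep -Ev '^[[:space:]]*(state|priority)[[:space:]]'
}

# Succeeds only if the two files agree on everything else.
configs_match() {
    [ "$(shared_settings "$1")" = "$(shared_settings "$2")" ]
}
```

Usage might look like `configs_match primary.conf secondary.conf && echo "only state and priority differ"`. It is a plain text comparison, so differing whitespace or inline comments will also count as a mismatch.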

Track the service, not just the host

The naive setup above fails over only when the whole host dies. But the far more common failure is the service dying while the host stays up — nginx segfaults, the VIP stays put, and you’re serving connection-refused with a healthy-looking master. This is the single biggest mistake people make with keepalived: they test by pulling a power cable, it works beautifully, and then a crashed daemon leaves them down for hours because the box is still pinging. Fix this with a vrrp_script that polls the thing you actually care about:

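vrrp_script chk_nginx {
    script "/usr/bin/killall -0 nginx"   # exits 0 while an nginx process exists
    interval 2                            # run the check every 2 seconds
    weight -60                            # subtract 60 from the priority while failing
    fall 2                                # 2 consecutive failures mark the check down
    rise 2                                # 2 consecutive successes mark it up again
}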

vrrp_instance VI_1 {
    # ... as above ...
    track_script {
        chk_nginx
    }
}

killall -0 doesn’t kill anything — it just tests whether the process exists, returning non-zero if not. When the check fails twice, keepalived subtracts from the master’s priority. Here’s the trap: get the weight wrong and nothing happens. Subtract 40 from the master’s 150 and you get 110, which is still above the backup’s 100 — so the master keeps the VIP despite the dead service. Set weight -60 and the master drops to 90, the backup’s 100 wins, and failover happens on a dead daemon, not just a dead box. Always make the weight large enough to actually demote below your backups, and test it by killing the service, not the machine.
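The weight arithmetic is worth checking before you rely on it. The sketch below replays the two cases from the paragraph above; the helper names effective_priority and would_fail_over are invented for illustration and are not part of keepalived.

```shell
#!/bin/sh
# Priority a node advertises while a tracked script with a negative weight is failing.
effective_priority() {
    echo $(( $1 + $2 ))    # $1 = configured priority, $2 = weight (e.g. -60)
}

# Succeeds if the failing master would drop strictly below the backup.
would_fail_over() {
    [ "$(effective_priority "$1" "$2")" -lt "$3" ]   # $3 = backup priority
}

would_fail_over 150 -40 100 || echo "weight -40: master stays at 110, no failover"
would_fail_over 150 -60 100 && echo "weight -60: master drops to 90, backup takes over"
```

Plug in your own priorities and weight; if the second form does not print, a dead service will not move the VIP.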

For anything more than a liveness check, point the script at a real health endpoint — curl -fsS http://localhost/healthz catches a process that’s alive but wedged, which a bare killall -0 never will.
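As a rough sketch of such a check, the script below wraps the curl call with a timeout so a wedged server counts as a failure instead of hanging the check. The script path and function name are hypothetical, and the one-second timeout is an assumption chosen to stay under the interval 2 used earlier.

```shell
#!/bin/sh
# Hypothetical /etc/keepalived/check_http.sh
# Exit 0 if the endpoint answers with an HTTP status below 400 in time.
check_http() {
    # -f: treat HTTP errors (>= 400) as failure; -sS: quiet, but still print errors;
    # --max-time: give up rather than hang if the server is wedged.
    curl -fsS --max-time 1 -o /dev/null "$1"
}

# The URL to probe is passed as the first argument.
if [ "$#" -gt 0 ]; then
    check_http "$1"
fi
```

The vrrp_script would then point at something like "/etc/keepalived/check_http.sh http://localhost/healthz" in place of the killall line.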

Troubleshooting: split brain and the silent failures

Both nodes claim MASTER. This is a split brain, and it’s the classic keepalived failure. Both nodes ARP for the VIP, the switch flaps between two MACs, and connections break intermittently in a way that’s maddening to diagnose. It almost always means the nodes can’t see each other’s VRRP advertisements. Verify with tcpdump -i eth0 vrrp on both — you should see the other node’s advertisements. If you don’t, suspect a firewall dropping the VRRP protocol (IP protocol 112, not a TCP/UDP port), a switch filtering multicast, or the two nodes being on different layer-2 segments.
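A quick way to confirm a split brain is to ask each node whether it currently holds the VIP. This is a minimal sketch using the example address and interface from this article; the helper name holds_vip is made up, and it simply searches the one-line output of ip -o -4 addr.

```shell
#!/bin/sh
# Succeeds if the given address appears in `ip -o -4 addr` output read from stdin.
# The trailing "/" stops 10.0.0.10 from matching 10.0.0.100.
holds_vip() {
    grep -qF " inet $1/"
}

# Run on each node; if more than one reports "holds", you have a split brain.
if command -v ip >/dev/null 2>&1; then
    if ip -o -4 addr show dev eth0 | holds_vip 10.0.0.10; then
        echo "$(uname -n): holds 10.0.0.10"
    else
        echo "$(uname -n): does not hold 10.0.0.10"
    fi
fi
```

Exactly one node should report that it holds the address; zero or two both mean something is wrong.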

Firewall eating VRRP. VRRP isn’t TCP or UDP; it’s its own IP protocol number (112), sent to 224.0.0.18, the IANA-reserved VRRP multicast group in the link-local range. Rules written for “allow port X” won’t cover it, because there is no port: the protocol number is the thing your firewall has to permit. On ufw you need to allow that multicast destination and explicitly permit protocol 112, or you’ll get a permanent split brain that looks for all the world like a config error. This is the single most common cause of “it worked on my flat test network and broke on the real one,” because home routers and managed switches are far more likely to filter multicast than a simple unmanaged switch.
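How you express this in ufw’s own rule files varies by version, so the sketch below shows the underlying rule in plain iptables syntax instead. It only prints the commands for review and does not apply them; the function name vrrp_rules is hypothetical, and 224.0.0.18 is the standard VRRP multicast group.

```shell
#!/bin/sh
# Print (not apply) iptables commands that admit VRRP on one interface.
# VRRP is IP protocol 112, addressed to the multicast group 224.0.0.18.
vrrp_rules() {
    iface=$1
    echo "iptables -I INPUT  -i $iface -p 112 -d 224.0.0.18 -j ACCEPT"
    echo "iptables -I OUTPUT -o $iface -p 112 -d 224.0.0.18 -j ACCEPT"
}

vrrp_rules eth0
```

Run the printed commands as root on both nodes once you are happy with them. They are not persistent across reboots, so fold the equivalent into whatever manages your firewall.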

Failover works but clients hang anyway. If the VIP moves but existing connections don’t recover, remember VRRP only moves the address — it does not migrate TCP state. Long-lived connections (SSH, a database session) will drop and must reconnect. That’s expected; design clients to retry.

Preemption surprises. By default a recovered higher-priority node takes the VIP back, causing a second blip when the master returns. If that bothers you, set nopreempt on the instances (both must be state BACKUP for it to apply cleanly) so whoever holds the VIP keeps it until they fail.

Notifications and doing something on failover

Failover being invisible is the goal, but silent failover is a different thing — you want to know it happened, even if clients didn’t notice, because a failover means one of your nodes is unwell. Keepalived runs a script whenever the state changes, which is the hook for both alerting and any side-effects you need:

vrrp_instance VI_1 {
    # ...
    notify_master "/etc/keepalived/on_master.sh"
    notify_backup "/etc/keepalived/on_backup.sh"
    notify_fault  "/etc/keepalived/on_fault.sh"
}

on_master.sh might fire a webhook to your chat, restart a service that only the active node should run, or update a DNS record. The notify_fault case is the one people forget: it fires when the node enters the FAULT state, for example because a tracked interface went down or a tracked script with no weight failed. A weighted check like chk_nginx only lowers the priority, so a dead nginx shows up as notify_backup on the old master, not as a fault. Wire at least a notification into all three and you turn keepalived from a silent safety net into something that tells you when it caught you.
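As a starting point, one small script can serve all three hooks. This is a sketch under assumptions: the path /etc/keepalived/notify.sh is hypothetical, the new state is passed as an argument you add yourself in the config (for example notify_master "/etc/keepalived/notify.sh MASTER"), and it only writes to syslog where a real setup would more likely call a webhook.

```shell
#!/bin/sh
# Hypothetical /etc/keepalived/notify.sh; the new state arrives as $1.
format_event() {
    printf '%s: VRRP instance VI_1 is now %s' "$(uname -n)" "$1"
}

record_event() {
    msg=$(format_event "$1")
    # Send to syslog if logger is available, otherwise fall back to stderr.
    logger -t keepalived-notify "$msg" 2>/dev/null || printf '%s\n' "$msg" >&2
}

if [ "$#" -gt 0 ]; then
    record_event "$1"
fi
```

Keeping the message formatting in its own function makes it easy to reuse the same text for a chat webhook later.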

Unicast, when multicast isn’t an option

The multicast requirement is keepalived’s biggest deployment constraint, but modern versions support unicast VRRP, which sidesteps a lot of it. Instead of multicasting to the group, each node lists its peers explicitly:

vrrp_instance VI_1 {
    # ... state, priority, VIP as before ...
    unicast_src_ip 10.0.0.11
    unicast_peer {
        10.0.0.12
    }
}

This works across some environments that filter multicast, and it’s more predictable on managed switches that treat multicast oddly. It doesn’t rescue you on cloud providers that block VRRP entirely — they’re blocking the protocol, not just the delivery method — but on a stubborn on-prem switch, unicast is often the difference between “works” and “permanent split brain.” The trade-off is that you now maintain a peer list, so it scales worse than multicast for large groups; for a two- or three-node homelab that’s a non-issue.

The sharp edges

VRRP uses multicast, so it wants a flat layer-2 segment. It will not traverse a router without help, and some cloud networks and managed switches block multicast or VRRP outright (AWS and most VPS providers among them), which is precisely why they sell you load balancers. And the VIP gives you availability, not capacity: only one node serves traffic at a time. If you need to spread load rather than just survive a failure, keepalived is the wrong layer; you want a real load balancer in front. On a bare-metal Kubernetes cluster that’s often MetalLB, which solves the “LoadBalancer service with no cloud” problem directly. Its layer-2 mode works on a similar principle to VRRP: one node answers ARP for the service IP, and another takes over if it fails.

Where keepalived shines is fronting a control plane. If you’re building a genuinely highly-available K3s cluster with three server nodes, a floating VIP in front of the API server is the standard way to give kubectl and your agents one stable address — it pairs naturally with the multi-server setup I describe in adding nodes to a K3s cluster.

Verdict

For a homelab or a small on-prem fleet where you control the network and just want a service that survives a reboot, keepalived is close to perfect: a few dozen lines of config, no extra infrastructure, no recurring bill, and failover within a few seconds. Track the service and not just the host, size your vrrp_script weight so it actually demotes, and make sure VRRP can cross your network before you trust it. If you’re on a cloud VPS that filters multicast, or you genuinely need to balance load across nodes rather than just survive one dying, this isn’t your tool. But for “make this one IP immortal,” nothing beats it for effort-to-payoff.

Written by Smarc

Founder and editor of vo.rs. A lifelong tinkerer who self-hosts far more than is sensible, hardens Linux boxes for fun, and prods the latest AI tools to see what they can really do. The how-to guides here are the notes Smarc wishes had existed the first time round.

Tagged: #networking #self-hosting #linux
