Healthchecks.io (Self-Hosted): Making Sure Your Cron Jobs Actually Ran
Dead-man's-switch monitoring for the backups and scripts you never think about

Here is a story that has happened to almost everyone who has ever written a cron job. You set up a nightly backup. You test it once, it works, you feel responsible and adult. Eight months later you actually need that backup, and you discover it stopped running in March because of a full disk, an expired token, or a typo you made while “tidying up.” The cron job didn’t fail loudly. It failed silently, which is the worst way for anything to fail, and nobody told you because there was nobody to tell.
The problem with monitoring cron jobs is that you can’t watch for the thing going wrong — you have to watch for the thing not happening. That’s a dead man’s switch, and Healthchecks is the cleanest implementation of one I’ve found. There’s a hosted version at healthchecks.io, but the project is open source, and self-hosting it is both easy and faintly appropriate: monitoring your own infrastructure on your own infrastructure has a pleasing symmetry to it.
1 The pattern: a ping you expect
The idea is almost embarrassingly simple. You create a “check,” which gives you a unique URL. Your cron job, at the end of its run, makes an HTTP request to that URL. Healthchecks knows roughly how often it should hear from you, and if a ping doesn’t arrive on schedule, that silence is what triggers the alert.
So instead of trying to detect failure, you detect the absence of success. A backup that crashes never sends its ping, the expected window passes, and Healthchecks emails you to say “I haven’t heard from your backup in 25 hours.” It’s the opposite of normal monitoring and it’s exactly right for this job.
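The decision behind that email can be pictured as a single comparison. The sketch below only illustrates the idea and is not the project's actual code: the function name `is_overdue` and its arguments are made up for this example, and the one hour of slack is an assumption chosen to match the 25-hour message above.

```shell
#!/bin/sh
# Illustrative sketch of a dead man's switch check (not Healthchecks' real code).
# last_ping and now are Unix timestamps; period and slack are in seconds.
is_overdue() {
    last_ping=$1   # when the most recent ping arrived
    period=$2      # how often a ping is expected
    slack=$3       # extra time allowed before alerting
    now=$4         # current time
    # Overdue once the current time passes last ping + period + slack.
    [ "$now" -gt $((last_ping + period + slack)) ]
}

# A daily backup with one hour of slack, last heard from at t=0,
# checked one second after the 25-hour mark.
if is_overdue 0 86400 3600 90001; then
    echo "alert: no ping for more than 25 hours"
fi
```

Nothing in the comparison looks at why the ping is missing. A crash, a full disk and a deleted crontab entry all look the same from here, which is the point.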
2 Adding it to a script
Wiring it in is a one-liner. The crudest version just curls the URL at the end of the crontab line:
30 2 * * * /usr/local/bin/backup.sh && curl -fsS -m 10 --retry 3 https://hc.example.com/ping/your-uuid-here
The && is doing real work: the ping only fires if backup.sh exits successfully. If the script fails, no ping, and you get told.
But the better pattern signals start, success and failure explicitly, which gives you timing data and immediate failure alerts rather than waiting for the silence window:
#!/usr/bin/env bash
URL="https://hc.example.com/ping/your-uuid-here"
curl -fsS -m 10 "$URL/start"       # mark the run as started
if /usr/local/bin/backup.sh; then
    curl -fsS -m 10 "$URL"         # success
else
    curl -fsS -m 10 "$URL/fail"    # explicit failure, alert now
fi
That’s the whole integration. Anything that can make an HTTP request (a shell script, a Python job, a Kubernetes CronJob, a Windows task) can report to Healthchecks.
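If several jobs need the same treatment, the start/success/fail pattern can be folded into one helper. This is a minimal sketch, assuming a POSIX shell and `curl`; `hc_run` is a hypothetical name, not something Healthchecks ships. It relies on two features of the ping API: appending the exit status to the ping URL (0 counts as success, anything else as failure) and sending the job's output as the request body so it shows up in the check's log.

```shell
#!/bin/sh
# Sketch of a reusable wrapper; hc_run is a hypothetical helper name.
# Usage: hc_run <ping-url> <command> [args...]
hc_run() {
    url=$1
    shift
    # Signal the start so the run can be timed; a failed ping must not stop the job.
    curl -fsS -m 10 --retry 3 -o /dev/null "$url/start" || true
    # Run the job, capturing its combined output and its exit status.
    status=0
    output=$("$@" 2>&1) || status=$?
    # Report the exit status in the URL and attach the output as the request body.
    curl -fsS -m 10 --retry 3 -o /dev/null --data-raw "$output" "$url/$status" || true
    # Hand the job's own exit status back to the caller.
    return "$status"
}
```

A crontab entry could then source the file that defines the helper and call something like `hc_run https://hc.example.com/ping/your-uuid-here /usr/local/bin/backup.sh`. For jobs with very chatty output, consider trimming it before sending, since the whole body is passed on the command line.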
3 Self-hosting it
The application is a Django project, so the moving parts are the app and a database. Postgres is the sensible choice for anything you care about:
services:
  healthchecks:
    image: healthchecks/healthchecks:latest
    environment:
      ALLOWED_HOSTS: hc.example.com
      DB: postgres
      DB_HOST: db
      DB_NAME: healthchecks
      DB_USER: hc
      DB_PASSWORD: changeme
      SITE_ROOT: https://hc.example.com
      DEFAULT_FROM_EMAIL: "[email protected]"  # set this to your real sender address
      EMAIL_HOST: smtp.example.com
    ports:
      - "8000:8000"
    depends_on: [db]
  db:
    image: postgres:16
    environment:
      POSTGRES_DB: healthchecks
      POSTGRES_USER: hc
      POSTGRES_PASSWORD: changeme
    volumes:
      - hc_db:/var/lib/postgresql/data
volumes:
  hc_db:
The one bit of configuration you must not skip is email. Get the SMTP settings right and test them, because an alerting system that can’t deliver alerts is an elaborate way of feeling falsely safe. Healthchecks also speaks to a long list of other channels: Slack, Telegram, webhooks, Pushover, and others, so you can route the “your backup is missing” message wherever you’ll actually see it.
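Once the stack is up, two one-off commands are worth running: creating a login and sending a test email. The sketch below assumes the Compose service is named `healthchecks` as in the file above, and that `manage.py` lives at `/opt/healthchecks/manage.py` inside the image, so check your image if that path differs. `hc_manage` is a made-up helper name; `createsuperuser` and `sendtestemail` are standard Django management commands.

```shell
#!/bin/sh
# Hypothetical helper: run a Django management command inside the app container.
# Assumes the Compose service is called "healthchecks" and that manage.py
# sits at /opt/healthchecks/manage.py in the image.
hc_manage() {
    docker compose exec healthchecks /opt/healthchecks/manage.py "$@"
}

# Typical first-run steps (shown as comments so nothing runs by accident):
#   docker compose up -d
#   hc_manage createsuperuser
#   hc_manage sendtestemail you@example.com
```

If the test message never arrives, fix that before trusting any check: the container logs are usually the quickest place to see what the SMTP server objected to.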
4 The features that make it more than curl
Beyond the basic schedule, two things earn their keep. Grace periods let you say “this job should run hourly, but don’t panic until it’s 15 minutes late,” which stops a slightly slow run from paging you. And cron expressions mean you can describe a complicated schedule — “weekdays at 6am” — and Healthchecks understands exactly when to expect the next ping, rather than guessing from a simple interval.
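Both settings can also be supplied when a check is created through the management API instead of the web UI. The sketch below builds the JSON for a "weekdays at 6am" check with a 15-minute grace period. The helper names `hc_check_payload` and `hc_create_check`, the check name and the API key are all placeholders, and the `/api/v3/checks/` path is an assumption about your Healthchecks version (older releases expose `/api/v1/checks/`).

```shell
#!/bin/sh
# Build the JSON body for a cron-scheduled check.
# Arguments: name, cron expression, timezone, grace period in seconds.
# No JSON escaping is done, so keep the name free of quotes and backslashes.
hc_check_payload() {
    printf '{"name": "%s", "schedule": "%s", "tz": "%s", "grace": %s}' \
        "$1" "$2" "$3" "$4"
}

# Create the check. Arguments: base URL, API key, then the payload arguments.
hc_create_check() {
    base=$1
    key=$2
    shift 2
    curl -fsS -m 10 "$base/api/v3/checks/" \
        -H "X-Api-Key: $key" \
        -H "Content-Type: application/json" \
        --data "$(hc_check_payload "$@")"
}

# Example (not run here): weekdays at 06:00 UTC, alert once 900 seconds late.
#   hc_create_check https://hc.example.com your-api-key \
#       "weekday-report" "0 6 * * 1-5" "UTC" 900
```

The grace period is given in seconds, so 900 is the 15 minutes from the example above.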
There’s also a tidy dashboard showing every check’s status at a glance, the last few pings, and, for jobs that send a start ping, how long each run took. The first time you see a job that’s quietly drifted from “30 seconds” to “11 minutes” over a few months, you’ll understand why having the history matters.
5 The verdict
This is one of those rare tools where the effort-to-payoff ratio is almost insulting. An afternoon to self-host it, a one-line change per cron job, and you have closed off an entire category of silent, catastrophic failure — the backup that wasn’t, the certificate renewal that didn’t, the sync that stopped. If you run literally anything on a schedule that you’d be upset to discover had stopped, you want this. I added it to my backups years ago and the single most reassuring sound in my homelab is the one I never hear: the alert that doesn’t fire, because everything actually ran. Set it up before you need it, because by the time you need it, it’s already too late.




