Grafana and Prometheus: A Monitoring Stack That Scales Down
The dashboard pairing that powers data centres, running happily on a single Pi

There is a particular flavour of homelab anxiety that strikes at 3am: is the NAS still alive? Did the disk fill up while I slept? Is that container restarting in a loop, quietly burning through SD-card writes? You can answer these questions by SSHing in and squinting at df and top, or you can answer them with a graph. I am firmly in the graph camp, and for the better part of a decade the graph has come from Grafana fed by Prometheus.
The interesting thing about this pairing is that it’s the same stack large companies use to watch thousands of nodes, yet it scales all the way down to a single Raspberry Pi watching itself. You don’t outgrow it, and you don’t have to be Google to justify it.
1 Two tools, two jobs
People say “Grafana and Prometheus” as if they were one product, but they do quite different things and it pays to keep them straight in your head.
Prometheus is the database and the collector. It does one trick extremely well: every fifteen seconds (or whatever interval you set) it reaches out to a list of HTTP endpoints, scrapes a page of plain-text metrics, and stores them as time series. It’s a pull model — Prometheus goes and fetches; the things being monitored don’t push anything. Each metric is a number with labels, like node_filesystem_avail_bytes{mountpoint="/"}, and Prometheus keeps the history.
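To make “a page of plain-text metrics” concrete, here is a rough Python sketch that parses a few lines in the Prometheus text exposition format. The sample page and its values are invented for illustration, and the parser is deliberately simplified: it ignores timestamps and does not unescape label values, so treat it as a reading aid rather than a replacement for a real client library.

```python
import re

# metric_name{label="value",...} value   (the label block is optional)
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')

# Invented sample of what a scrape of node_exporter might return.
SAMPLE_PAGE = """\
# HELP node_filesystem_avail_bytes Filesystem space available in bytes.
# TYPE node_filesystem_avail_bytes gauge
node_filesystem_avail_bytes{device="/dev/sda1",fstype="ext4",mountpoint="/"} 2.5e+10
node_load1 0.42
"""


def parse_sample(line):
    """Return (name, labels, value) for a sample line, or None for comments and blanks."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    match = SAMPLE_RE.match(line)
    if match is None:
        raise ValueError(f"not a sample line: {line!r}")
    name, raw_labels, raw_value = match.groups()
    labels = dict(LABEL_RE.findall(raw_labels or ""))
    return name, labels, float(raw_value)


# Keep only the real samples; HELP/TYPE comment lines are skipped.
samples = [s for s in map(parse_sample, SAMPLE_PAGE.splitlines()) if s is not None]
```

Each tuple in samples is one point in one time series: the name plus the label set identifies the series, and Prometheus attaches the scrape time itself.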
Grafana is the eyes. It stores no metrics of its own, only its dashboards and settings; it queries Prometheus (and dozens of other data sources) and draws the pretty dashboards. The split matters because you can restart Grafana, break a dashboard, or upgrade it recklessly without losing a single data point: your history lives safely in Prometheus.
2 The thing that produces the numbers
Prometheus scrapes endpoints, but something has to expose those endpoints. That something is an exporter. The one you’ll install first is node_exporter, which turns a Linux box’s CPU, memory, disk, network and load into Prometheus metrics. For Docker stats there’s cAdvisor; for a hundred other things there’s an exporter on GitHub maintained by someone who also couldn’t sleep at 3am.
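If the word “exporter” sounds more mysterious than it is, the sketch below shows roughly what one does, using only the Python standard library. The metric name demo_load1 and port 9101 are made up for this example, and os.getloadavg() is Unix-only. For anything real you would more likely reach for node_exporter or the official prometheus_client package; this is only meant to show that an exporter is a small HTTP server returning text.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(load1):
    """Render one gauge in the Prometheus text exposition format."""
    return (
        "# HELP demo_load1 One-minute load average (hypothetical demo metric).\n"
        "# TYPE demo_load1 gauge\n"
        f"demo_load1 {load1}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Prometheus scrapes /metrics by default; everything else is a 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(os.getloadavg()[0]).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=9101):
    """Block forever, answering scrapes on the given (assumed) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() and adding the host and port as another target in prometheus.yml would be enough for Prometheus to start collecting the value.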
Here’s a minimal stack that monitors the host it runs on:

    services:
      prometheus:
        image: prom/prometheus:latest
        volumes:
          - ./prometheus.yml:/etc/prometheus/prometheus.yml
          - prom_data:/prometheus
        ports:
          - "9090:9090"
      node-exporter:
        image: prom/node-exporter:latest
        pid: host
        volumes:
          - /:/host:ro,rslave
        command:
          - '--path.rootfs=/host'
      grafana:
        image: grafana/grafana:latest
        volumes:
          - graf_data:/var/lib/grafana
        ports:
          - "3000:3000"
    volumes:
      prom_data:
      graf_data:

And the prometheus.yml that tells it what to scrape:

    global:
      scrape_interval: 15s
    scrape_configs:
      - job_name: 'node'
        static_configs:
          - targets: ['node-exporter:9100']

That’s the whole thing. Bring it up, open Grafana on port 3000, add Prometheus (http://prometheus:9090) as a data source, and you have monitoring.
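Before wiring up Grafana, it can be reassuring to confirm that Prometheus is actually scraping its targets. The sketch below asks the Prometheus HTTP API for the built-in up metric, which is 1 for every target whose last scrape succeeded. It assumes the stack is reachable at http://localhost:9090, as the published port above suggests; the function names are my own.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    query = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query}"


def targets_up(response_body):
    """Map each instance label to True/False from an instant query for `up`."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    # Each result carries its labels and a [timestamp, "value"] pair.
    return {
        series["metric"].get("instance", "?"): series["value"][1] == "1"
        for series in payload["data"]["result"]
    }


def check(base_url="http://localhost:9090"):
    """Query a running Prometheus (assumed address) and report which targets are up."""
    with urllib.request.urlopen(build_query_url(base_url, "up"), timeout=5) as resp:
        return targets_up(resp.read())
```

With the stack running, check() should return something like a one-entry dictionary for node-exporter:9100; a False value there means the scrape is failing and the Targets page in the Prometheus UI is the next place to look.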
3 Dashboards you don’t have to build
The fear with Grafana is that you’ll spend a weekend dragging panels around. You won’t, because the community has already done it. Grafana’s dashboard library has thousands of pre-built dashboards you import by pasting a numeric ID. The famous Node Exporter Full dashboard (ID 1860) gives you a wall of beautifully laid-out graphs for everything node_exporter produces. Import it, point it at your data source, done. You can build your own later when you have opinions, but you’ll start with something that looks like a NASA control room for zero effort.
4 Alerts, because graphs you don’t look at are useless
A dashboard only helps if you happen to be staring at it. The real payoff is alerting: get told before the disk fills. Prometheus alert rules are written in PromQL, a query language that’s genuinely pleasant once it clicks. A rule that fires when the root filesystem drops below 10% free looks like this:

    groups:
      - name: disk
        rules:
          - alert: DiskAlmostFull
            expr: node_filesystem_avail_bytes{mountpoint="/"}
              / node_filesystem_size_bytes{mountpoint="/"} < 0.10
            for: 10m
            labels:
              severity: warning
            annotations:
              summary: "Root filesystem under 10% free"

The for: 10m clause is the unsung hero here: it stops a momentary blip from waking you. Save the rule in its own file, mount it into the Prometheus container, and list it under rule_files in prometheus.yml so that Prometheus loads it. Pair Prometheus with Alertmanager to route those alerts to email, a webhook, or your phone, and the 3am anxiety finally has somewhere to go that isn’t your own imagination.
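To see why the for clause matters, here is a simplified Python model of how an alert moves between inactive, pending, and firing. It is a sketch of the behaviour as I understand it, not Prometheus code: it looks at a single series, takes the free-space ratio as already computed, and the function name is hypothetical.

```python
def alert_states(samples, threshold=0.10, hold_seconds=600):
    """Return the alert state after each evaluation.

    samples is a list of (unix_seconds, free_ratio) pairs, one per rule
    evaluation; hold_seconds plays the role of `for: 10m`.
    """
    states = []
    active_since = None  # when the condition first became true, if it still is
    for ts, free_ratio in samples:
        if free_ratio < threshold:
            if active_since is None:
                active_since = ts
            # Only fire once the condition has held for the whole window.
            held_long_enough = ts - active_since >= hold_seconds
            states.append("firing" if held_long_enough else "pending")
        else:
            # Any evaluation above the threshold resets the clock.
            active_since = None
            states.append("inactive")
    return states


# A one-minute dip never gets past "pending"...
blip = alert_states([(0, 0.05), (60, 0.20), (120, 0.05)])
# ...while ten solid minutes under the threshold fires.
sustained = alert_states([(0, 0.05), (300, 0.05), (600, 0.05)])
```

Here blip comes out as pending, inactive, pending, and sustained as pending, pending, firing, which is the difference between sleeping through a log rotation and being woken for a genuinely full disk.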
5 The honest costs
It’s not all free. Prometheus stores everything locally and is not built for years of retention — by default it keeps fifteen days, and long-term storage means bolting on extra machinery. PromQL has a learning curve; your first few queries will be cargo-culted from Stack Overflow. And the pull model means Prometheus needs network reach to everything it watches, which gets fiddly across firewalls and NAT.
For a homelab none of that bites hard. Fifteen days of history is plenty, the queries you need are mostly already written, and everything lives on one network anyway.
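For a feel of what fifteen days costs on disk, the Prometheus storage documentation gives a rule of thumb: retention time multiplied by ingested samples per second multiplied by bytes per sample, with a sample averaging roughly 1 to 2 bytes after compression. The Python sketch below applies it; the figure of 1,000 active series is my own guess for a single host running node_exporter, not a measurement.

```python
def estimate_disk_bytes(series, scrape_interval_s=15, retention_days=15, bytes_per_sample=2.0):
    """Rough on-disk size: retention seconds * samples per second * bytes per sample."""
    retention_seconds = retention_days * 86400
    # Each series yields one sample per scrape interval.
    return series * retention_seconds * bytes_per_sample / scrape_interval_s


# Hypothetical single host with about 1,000 active series.
one_host = estimate_disk_bytes(1000)      # 172_800_000.0 bytes, roughly 173 MB
one_host_mb = one_host / 1_000_000
```

Even with the pessimistic 2 bytes per sample, that is well under a gigabyte per host, which is why the default retention rarely troubles a homelab.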
6 The verdict
If you run more than one always-on machine and you’ve ever been surprised by an outage you could have seen coming, this stack is worth the afternoon. It’s the rare piece of “enterprise” software that genuinely scales down: light enough for a Pi, capable enough that you never need to replace it. I’ve run it for years, it sips resources, and it has turned more than one would-be 3am disaster into a graph I noticed at a civilised hour. That’s exactly what I want from monitoring.




