Loki: Log Aggregation for People Who Can't Afford Splunk
Grep-able logs from every box, indexed by labels instead of by every word

There are two kinds of homelabber: the ones who SSH into each box and run journalctl when something breaks, and the ones who got tired of doing that around the fourth machine. I crossed that line a while ago. Once you have a handful of hosts and a stack of Docker containers, “which log, on which box, from which container?” becomes a small archaeological dig every single time, usually conducted in a hurry while something is on fire.
The grown-up answer to this is log aggregation: ship every log line to one place, and search them all at once. The grown-up price for that has historically been Splunk, which is superb and costs roughly the GDP of a small island once your volume gets serious. Loki, from the Grafana people, is the answer for the rest of us.
1 Loki’s clever, slightly weird idea
The reason traditional log systems are expensive is that they index everything. Every word in every line gets put into a full-text index so you can search for it later, and that index is enormous — often bigger than the logs themselves. It’s powerful and it’s why your wallet hurts.
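To see why that index balloons, here is a toy inverted index in Python. It is only a sketch of the general full-text technique, not how Splunk or any particular product is built, and the three sample log lines are made up for illustration.

```python
from collections import defaultdict


def build_inverted_index(lines):
    """Map every lowercased word to the set of line numbers containing it."""
    index = defaultdict(set)
    for line_no, line in enumerate(lines):
        for word in line.lower().split():
            index[word].add(line_no)
    return index


# Hypothetical sample lines, purely for illustration.
lines = [
    "GET /index.html 200",
    "GET /missing 404",
    "upstream error timeout",
]

index = build_inverted_index(lines)

# One posting per (word, line) pair: the index grows with every word logged.
postings = sum(len(line_numbers) for line_numbers in index.values())
print(len(index), "distinct words,", postings, "postings for", len(lines), "lines")
```

Lookups are instant (`index["get"]` is already the answer), but every new word in every new line adds to the index, which is where the storage bill comes from.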
Loki does something deliberately different and a bit cheeky: it doesn’t index the log content at all. It indexes only a small set of labels (which host, which container, which job), much the way Prometheus labels its metrics. The log text itself is just compressed into chunks and written to object storage, or to local disk on a single-box install. When you search, you first narrow down by labels to a small set of streams, then Loki brute-force greps through only those chunks.
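The idea fits in a few lines of Python. This is a toy model under loud assumptions: `ToyLogStore` and its methods are names I made up, the "chunks" are plain lists rather than compressed objects, and real Loki is far more involved. It only shows the shape of "select by label, then grep".

```python
from collections import defaultdict


class ToyLogStore:
    """Toy model: index the labels only, brute-force scan the text."""

    def __init__(self):
        # The whole "index": one entry per unique label set (a stream).
        # Each list stands in for that stream's chunks of raw lines.
        self.streams = defaultdict(list)

    def push(self, labels, line):
        self.streams[frozenset(labels.items())].append(line)

    def query(self, selector, needle):
        """Narrow by labels first, then scan only the matching streams."""
        wanted = set(selector.items())
        hits = []
        scanned = 0
        for stream_labels, lines in self.streams.items():
            if not wanted <= stream_labels:
                continue  # stream skipped without reading a single line
            for line in lines:
                scanned += 1
                if needle in line:
                    hits.append(line)
        return hits, scanned


store = ToyLogStore()
store.push({"host": "nas", "container": "caddy"}, "GET /index.html 200")
store.push({"host": "nas", "container": "caddy"}, "upstream error: timeout")
store.push({"host": "pi", "container": "postgres"}, "error: deadlock detected")

hits, scanned = store.query({"container": "caddy"}, "error")
print(hits, "after scanning", scanned, "lines")
```

With a label selector the scan touches only the caddy lines; with an empty selector (`store.query({}, "error")`) it has to read everything, which is the slow case the trade-off below is about.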
The trade-off is explicit: cheap storage and cheap ingestion, in exchange for searches that are fast if you label well and slow if you ask it to grep across everything. For a homelab, where “everything” is modest, this is a brilliant bargain.
2 The three pieces
A Loki setup has three moving parts, and it helps to name them:
- Loki itself — the server that stores chunks and answers queries.
- Promtail (or, increasingly, the Grafana Alloy agent) — the thing that runs on each host, tails log files, attaches labels, and ships lines to Loki.
- Grafana — the same Grafana you already run for metrics, which gets a “Logs” view and a query language called LogQL.
That last point is the killer feature: logs and metrics live in the same Grafana, so you can spot a spike on a graph and pivot straight to the log lines from that exact minute. No context switch, no second tool.
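To make the agent's job concrete, here is roughly what one batch of lines looks like on its way to Loki. The JSON shape (a `streams` list, each with a `stream` label map and `values` pairs of nanosecond timestamp and line) follows Loki's documented push endpoint, `/loki/api/v1/push`. The helper name `build_push_payload`, the labels, and the log lines are my own illustrations, and as far as I know real agents usually send a compressed binary encoding of the same structure rather than JSON.

```python
import json
import time


def build_push_payload(labels, lines, now_ns=None):
    """Build a JSON-serialisable body for one stream of log lines.

    Timestamps are Unix nanoseconds encoded as strings. Each line gets a
    timestamp one nanosecond later than the previous one so order is kept.
    """
    if now_ns is None:
        now_ns = time.time_ns()
    values = [[str(now_ns + offset), line] for offset, line in enumerate(lines)]
    return {"streams": [{"stream": dict(labels), "values": values}]}


# Hypothetical labels and lines; the fixed timestamp keeps the output stable.
payload = build_push_payload(
    {"host": "nas", "container": "caddy"},
    ["GET / 200", "GET /missing 404"],
    now_ns=1_700_000_000_000_000_000,
)
body = json.dumps(payload)
print(body)
```

Note that the labels are attached once per stream, not once per line. That is the same structure the index is built on, and it is why the choice of labels matters so much later.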
3 A minimal stack
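Loki ships sensible single-binary defaults now, so you don’t need to understand its internal microservices. Here’s a compose file that runs Loki plus a Promtail that scrapes Docker container logs. It expects a loki-config.yml and a promtail-config.yml next to it; the config bundled in the image at /etc/loki/local-config.yaml is a reasonable starting point for the former: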
services:
  loki:
    image: grafana/loki:latest
    command: -config.file=/etc/loki/config.yml
    volumes:
      - ./loki-config.yml:/etc/loki/config.yml
      - loki_data:/loki
    ports:
      - "3100:3100"
  promtail:
    image: grafana/promtail:latest
    command: -config.file=/etc/promtail/config.yml
    volumes:
      - ./promtail-config.yml:/etc/promtail/config.yml
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
volumes:
  loki_data:
Promtail’s config is where you decide what labels exist, and this is the part that actually matters, because your labels are your whole search index:
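clients:
  # Where to ship lines; "loki" is the compose service name from above
  - url: http://loki:3100/loki/api/v1/push
scrape_configs:
  - job_name: docker
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
    relabel_configs:
      # Docker reports names with a leading slash ("/caddy"); keep the rest
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: 'container'
      # stdout or stderr
      - source_labels: ['__meta_docker_container_log_stream']
        target_label: 'stream'
4 LogQL, and the cardinality trap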
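In Grafana you add Loki as a data source and start querying. LogQL looks pleasantly like a hybrid of grep and PromQL. Here’s a query to find server errors from a specific container; the time range (the last hour, say) comes from Grafana’s time picker rather than from the query itself: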
{container="caddy"} |= "error" | json | status >= 500
That {container="caddy"} part picks the streams by label; |= "error" keeps only lines containing that text; | json parses each line as JSON; and status >= 500 filters on the parsed field. You can even turn logs into metrics on the fly, using count_over_time to graph error rates, which is genuinely magic the first time you do it.
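If the pipeline syntax feels opaque, it may help to see the same four stages spelled out in Python. This is an emulation for intuition only, not Loki's implementation: `run_pipeline` is a made-up name, the sample lines are invented, and it simply drops lines that fail to parse, which is a simplification of how Loki reports parse errors.

```python
import json


def run_pipeline(streams, selector, needle, min_status):
    """Emulate: {selector} |= needle | json | status >= min_status."""
    results = []
    for labels, lines in streams:
        # Stage 1: stream selector, matched against labels only.
        if any(labels.get(name) != value for name, value in selector.items()):
            continue
        for line in lines:
            # Stage 2: line filter, a case-sensitive substring test.
            if needle not in line:
                continue
            # Stage 3: parse the line as JSON (unparseable lines are dropped
            # in this sketch).
            try:
                fields = json.loads(line)
            except json.JSONDecodeError:
                continue
            if not isinstance(fields, dict):
                continue
            # Stage 4: filter on a field extracted by the parser.
            status = fields.get("status")
            if isinstance(status, (int, float)) and status >= min_status:
                results.append(fields)
    return results


# Hypothetical streams: (labels, raw lines).
streams = [
    ({"container": "caddy"}, [
        '{"msg": "upstream error", "status": 502}',
        '{"msg": "client error", "status": 404}',
        '{"msg": "ok", "status": 200}',
        "plain text error, not JSON",
    ]),
    ({"container": "postgres"}, ['{"msg": "error: deadlock", "status": 500}']),
]

matches = run_pipeline(streams, {"container": "caddy"}, "error", 500)
print(matches)
```

The ordering is the point: the cheap label check runs first, the substring test second, and the comparatively expensive JSON parsing only on lines that survived both.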
The one rule you must internalise: never put high-cardinality values in labels. Putting a user ID, a request ID, or a timestamp into a label will explode the number of streams and bring Loki to its knees — it’s the single most common way people wreck their installation. Labels are for the handful of dimensions you slice by; everything else stays in the log line, where the grep can find it.
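A quick simulation shows how fast this goes wrong. The numbers are invented (10,000 log entries across three hosts and twenty containers, each carrying a unique request ID), and `count_streams` is a hypothetical helper, but the arithmetic is the real issue: every distinct combination of label values is its own stream.

```python
def count_streams(entries, label_names):
    """Count distinct label-value combinations, i.e. the number of streams."""
    return len({tuple(entry[name] for name in label_names) for entry in entries})


# Simulated log entries; host, container, and request_id values are made up.
entries = [
    {
        "host": f"host{i % 3}",
        "container": f"c{i % 20}",
        "request_id": f"req-{i}",
    }
    for i in range(10_000)
]

sane = count_streams(entries, ["host", "container"])
exploded = count_streams(entries, ["host", "container", "request_id"])

print("host + container:", sane, "streams")
print("host + container + request_id:", exploded, "streams")
```

With sensible labels the stream count is capped by the number of host and container pairs. Add the request ID and the count tracks the number of requests, one tiny stream per request, which is exactly the shape Loki's design is not built for.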
5 Where it falls short
Loki is not Splunk, and it’s honest about that. If your daily job is needle-in-a-haystack full-text search across terabytes with no idea which service produced the line, Loki’s brute-force model will feel slow and you’d genuinely be better served by something with a full index. Its query language, while improving, still has rough edges. And early Loki setups had a reputation for fiddly configuration that scared people off — that’s much better now, but the folklore lingers.
6 The verdict
For a self-hoster who already runs Grafana and Prometheus, Loki is close to a no-brainer. It’s cheap to run, it puts every log from every box behind one search bar, and it lives in the dashboard you already have open. The discipline it demands — keep labels low-cardinality — is the same discipline that keeps Prometheus healthy, so you probably already have the right instincts. I added it to my stack expecting a weekend of pain and got a working “search all my logs” box by lunchtime. The first time an outage took two minutes to diagnose instead of twenty, it had paid for itself.




