Minio: S3-Compatible Object Storage in Your Cluster

Your own S3 endpoint, on hardware you control

Smarc

03-03-2026

Half the software I self-host now expects an S3 bucket to throw things into. Backup tools, container registries, CI artifact stores, Loki for logs, photo libraries — they’ve all standardised on the S3 API as the lingua franca of “somewhere to put blobs”. You can hand all of them an AWS account and a credit card, or you can run Minio and have a real S3-compatible endpoint inside your own cluster, on your own disks, with no egress bills and no surprise invoices. I’ve run it for years and it has quietly become load-bearing infrastructure.

What Minio is and isn’t

Minio is a single Go binary that speaks the Amazon S3 API. Point any S3 client at it — the AWS CLI, rclone, an SDK, an app’s built-in S3 backend — and it works, because the API surface really is compatible. It does buckets, objects, presigned URLs, multipart uploads, bucket policies, and IAM-style access keys. For most applications that “support S3”, Minio is a drop-in target and they cannot tell the difference.

It is not a distributed filesystem and not a POSIX mount. Objects are objects: write the whole thing, read the whole thing, no partial in-place edits. That’s the S3 model, and if you fight it you’ll have a bad time. Used as intended — a place for immutable blobs — it’s rock solid.

The licensing thing you should know up front

Before you build anything on it, know the licence, because it trips people up. Minio has been released under the GNU AGPLv3 since 2021 — server, client, and gateway. For a homelab or internal cluster this changes nothing: AGPL only bites when you offer the modified software to third parties over a network, and self-hosting an unmodified binary for your own use is squarely fine. But if you’re evaluating it for a product you’ll ship or a service you’ll sell to others, read the licence properly or talk to Minio about a commercial one. I flag this not to scare you off — the AGPL is a perfectly reasonable licence — but because “I didn’t know” is a bad conversation to have with a lawyer later. For everything in this post, which is personal self-hosting, you’re clear.

A single-node start with Docker

You do not need a cluster to begin. A single container with one volume gets you a working endpoint and a web console:

services:
  minio:
    image: quay.io/minio/minio:latest
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: admin
      MINIO_ROOT_PASSWORD: "change-me-please-12345"
    ports:
      - "9000:9000"   # S3 API
      - "9001:9001"   # web console
    volumes:
      - minio-data:/data
    restart: unless-stopped

volumes:
  minio-data:

Port 9000 is the S3 API; 9001 is the browser console where you create buckets and access keys by hand. Once it’s up, the official mc client is the fast way to drive it from a terminal:

# register an alias for your server
mc alias set home http://localhost:9000 admin change-me-please-12345

# create a bucket and copy a file in
mc mb home/backups
mc cp ./db-dump.sql.gz home/backups/

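# create a service account (its own access key and secret) under the admin user;
# without an attached policy it inherits that user's permissions, so scope it before use
mc admin user svcacct add home admin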

That last command matters: never hand an application your root credentials. Create a service account scoped to the buckets it needs, exactly as you would with real IAM. The service account gets its own access key and secret, and you can attach a policy that only permits, say, s3:PutObject on the one bucket a backup job writes to. When that key inevitably leaks into a log file or a git commit six months from now, the blast radius is one bucket, not your whole storage layer.
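
As a rough sketch of what “scoped to the buckets it needs” looks like, the Python below builds a write-only policy document for a single bucket. The function name and the backups bucket are only illustrative; the JSON itself follows the standard S3 policy grammar. Recent mc releases accept a policy file when creating a service account, but check mc admin user svcacct add --help on your version for the exact flag rather than taking my word for it.

```python
import json


def write_only_policy(bucket: str) -> dict:
    """Return an S3-style policy that allows uploads to one bucket and nothing else."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                # Object-level actions apply to keys inside the bucket, hence the /*
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            }
        ],
    }


if __name__ == "__main__":
    # "backups" matches the bucket created with `mc mb home/backups` above
    print(json.dumps(write_only_policy("backups"), indent=2))
```

A key carrying only this policy can upload but cannot list, read back, or delete. Many backup tools need at least listing and reading as well, so treat this as a starting point and add actions only when the tool demonstrably needs them.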

While we’re on credentials: put the root user and password in a secret, not in a plaintext compose file that ends up in a repo. The example above hard-codes them for clarity, but in anything real you inject MINIO_ROOT_USER/MINIO_ROOT_PASSWORD from a .env file that’s gitignored, or from a Kubernetes secret. This is the same discipline you’d apply to any service holding data you care about — the object store is not special, and treating it as less sensitive than a database is a mistake, because increasingly it is your database’s backup target.

Going for real: erasure coding

The single-node toy above keeps your data on one disk with no redundancy. For anything you care about, Minio’s actual strength is erasure coding: it splits each object into data and parity shards across multiple drives, so it can survive disk and node failures without RAID underneath. You need several drives (the total has to divide into equal-sized erasure sets), and you give it all of them at once:

# distributed mode across 4 nodes, 4 drives each, expressed as an erasure set
minio server http://minio{1...4}/data{1...4} --console-address ":9001"

That {1...4} syntax is Minio-native. It uses three dots, so it is not the shell’s two-dot brace expansion and reaches Minio unexpanded. Here it expands to sixteen drives across four hosts, and Minio computes the parity layout so you can lose drives (up to half, depending on the configured parity) and still read every object. This is the configuration I actually run, because the whole point of self-hosting storage is not waking up to a single dead disk having eaten your backups.
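
To make the expansion concrete, here is a small Python sketch that turns the same {a...b} notation into the full endpoint list. It is my own illustration of what the pattern means, not Minio’s parser, and it only handles plain numeric ranges.

```python
import itertools
import re

# Matches Minio-style numeric ranges such as {1...4} (three dots)
RANGE = re.compile(r"\{(\d+)\.\.\.(\d+)\}")


def expand_ellipsis(pattern: str) -> list[str]:
    """Expand every {a...b} range in the pattern into concrete strings."""
    # With two capture groups, split() yields: text, start, end, text, start, end, ...
    parts = RANGE.split(pattern)
    literals = parts[0::3]
    ranges = [
        range(int(start), int(end) + 1)
        for start, end in zip(parts[1::3], parts[2::3])
    ]
    expanded = []
    for combo in itertools.product(*ranges):
        pieces = [literals[0]]
        for number, literal in zip(combo, literals[1:]):
            pieces.append(str(number))
            pieces.append(literal)
        expanded.append("".join(pieces))
    return expanded


endpoints = expand_ellipsis("http://minio{1...4}/data{1...4}")
print(len(endpoints))   # 16
print(endpoints[0])     # http://minio1/data1
print(endpoints[-1])    # http://minio4/data4
```

Four hosts times four drives gives the sixteen endpoints Minio pools together.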

It’s worth understanding why erasure coding beats RAID for this job, because it’s not obvious. Conventional RAID protects against whole-disk failure but generally does nothing about silent bit rot: a flipped bit on an otherwise healthy disk sails straight through. Minio checksums every shard with a hash (HighwayHash) and verifies it on read, so it detects corruption and reconstructs the bad shard from parity automatically. This is bitrot protection, and for long-lived backup blobs it matters enormously. The whole point of a backup is that it’s still correct a year later when you’re panicking, and a subtly corrupted archive that restores without error but produces garbage is arguably worse than one that’s obviously gone. Minio’s read-time verification is the thing that lets you actually trust the data.
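
The detect-then-repair idea can be shown in a few lines of Python. This is a toy under loud assumptions: SHA-256 from the standard library stands in for HighwayHash, and a single XOR parity shard stands in for Minio’s real erasure code, so it can repair exactly one bad shard. The function names are mine.

```python
import hashlib


def split_with_parity(data: bytes, k: int) -> list[bytearray]:
    """Split data into k equal shards and append one XOR parity shard."""
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\0")
    shards = [bytearray(padded[i * shard_len:(i + 1) * shard_len]) for i in range(k)]
    parity = bytearray(shard_len)
    for shard in shards:
        for i, byte in enumerate(shard):
            parity[i] ^= byte
    return shards + [parity]


def digest(shard: bytearray) -> str:
    """Checksum stored next to each shard (SHA-256 here, not HighwayHash)."""
    return hashlib.sha256(shard).hexdigest()


def read_back(shards: list[bytearray], checksums: list[str], length: int) -> bytes:
    """Verify every shard on read, rebuild one bad shard from the rest, return the data."""
    bad = [i for i, shard in enumerate(shards) if digest(shard) != checksums[i]]
    if len(bad) > 1:
        raise IOError("more shards are corrupt than one parity shard can repair")
    if bad:
        rebuilt = bytearray(len(shards[0]))
        for i, shard in enumerate(shards):
            if i != bad[0]:
                for j, byte in enumerate(shard):
                    rebuilt[j] ^= byte
        shards[bad[0]] = rebuilt  # heal the damaged shard in place
    return b"".join(bytes(s) for s in shards[:-1])[:length]


original = b"nightly database dump, must still be correct next year"
shards = split_with_parity(original, k=3)
checksums = [digest(s) for s in shards]

shards[1][0] ^= 0x01  # simulate bit rot: flip one bit on a "healthy" drive
recovered = read_back(shards, checksums, len(original))
print(recovered == original)  # True
```

Without the per-shard checksum the flipped bit would be returned as if it were good data; with it, the read notices the mismatch and rebuilds the shard before handing anything back.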

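A related decision is how much parity to give it. Parity is set per erasure set, and the ceiling is half the drives: at that setting you can lose half of them and still read everything, but usable capacity is halved too. Older releases defaulted to that even split; recent ones, as far as I can tell from the current docs, default to less (EC:4 on a set of eight to sixteen drives), so check what your version actually picked. You can change parity per deployment or per object if you’d rather trade resilience for space or the reverse, but for the write-once-read-rarely backup workload most homelabbers run, I wouldn’t dial it down. Storage is cheap; a failed restore at the worst possible moment is not.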
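
The trade-off is plain arithmetic, sketched below in Python. The drive size and parity values are example inputs of mine, not a claim about what any particular release defaults to, and the capacity figure ignores metadata overhead.

```python
def erasure_set_summary(drives: int, parity: int, drive_tb: float) -> dict:
    """Rough capacity and read tolerance for one erasure set."""
    if not 0 <= parity <= drives // 2:
        raise ValueError("parity must be between 0 and half the drives in the set")
    data = drives - parity
    return {
        "data_drives": data,
        "parity_drives": parity,
        "raw_tb": drives * drive_tb,
        "usable_tb": data * drive_tb,
        # Objects stay readable as long as no more than `parity` shards are missing
        "drives_lost_still_readable": parity,
    }


# Sixteen hypothetical 4 TB drives at two different parity settings
for parity in (8, 4):
    print(parity, erasure_set_summary(16, parity, 4.0))
```

At half parity sixteen 4 TB drives give 32 TB usable and survive eight lost drives for reads; at a parity of four the same drives give 48 TB but only survive four.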

On Kubernetes

In a cluster I deploy Minio as a StatefulSet so each pod gets stable identity and its own PersistentVolumeClaim. The official Helm chart handles the wiring, but the shape is simple: N replicas, one PVC each, a headless service for the peers to find each other, and a regular service for clients. The thing to get right is that the PVCs must be backed by local or block storage you trust — putting Minio’s data on a network filesystem that’s itself doing replication is doubling your overhead and confusing the failure model. Let Minio own the redundancy.

One operational note that has burned people: do not arbitrarily resize your erasure set after the fact, and don’t mix wildly different drive sizes. Minio sizes its parity layout at startup, and bolting on mismatched capacity later is not the seamless expand-a-RAID experience you might expect. Plan the layout, then grow by adding whole new server pools rather than fiddling with the existing set.

The storage backend matters more than the Helm chart does. This is the same decision you face for any stateful workload in a cluster, and I’ve written about persistent storage for Kubernetes with Longhorn versus OpenEBS at length, but for Minio specifically the answer is usually “the simplest local-path provisioner you can get away with”. Minio wants to own the redundancy itself, so wrapping its PVCs in a replicated storage layer multiplies the overheads: with half the drives given to parity, erasure coding already stores two bytes for every byte of data, and a storage layer keeping two or three replicas underneath turns that into four or six. Give it plain local volumes and let it do the maths it’s good at.

Troubleshooting: the failures you’ll actually hit

Minio is stable in steady state, but a few things go wrong during setup and they all look alarming until you know the cause.

“The specified bucket does not exist” from an app that clearly created one. Nine times out of ten this is a path-style versus virtual-host-style addressing mismatch. Minio serves buckets as http://host:9000/bucketname (path style); a lot of AWS SDKs default to virtual-host style (http://bucketname.host), which needs DNS you don’t have. Set the client’s forcePathStyle: true (or the equivalent flag) and it resolves instantly.
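
If the two addressing styles are hard to picture, this Python sketch prints the URL each one produces for the same object. The hostname minio.lan and the function name are made up for the example.

```python
from urllib.parse import urlsplit


def object_urls(endpoint: str, bucket: str, key: str) -> dict[str, str]:
    """Show where a client would send a request under each addressing style."""
    parts = urlsplit(endpoint)
    return {
        # Path style: the bucket is the first path segment
        "path": f"{parts.scheme}://{parts.netloc}/{bucket}/{key}",
        # Virtual-host style: the bucket becomes a subdomain that must resolve in DNS
        "virtual_host": f"{parts.scheme}://{bucket}.{parts.netloc}/{key}",
    }


urls = object_urls("http://minio.lan:9000", "backups", "db-dump.sql.gz")
print(urls["path"])          # http://minio.lan:9000/backups/db-dump.sql.gz
print(urls["virtual_host"])  # http://backups.minio.lan:9000/db-dump.sql.gz
```

Unless something on your network resolves backups.minio.lan, the second form never reaches the server, which is why forcing path style fixes it.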

TLS errors when you point a strict client at it. Minio can run plain HTTP, and for internal traffic on a trusted network that’s a defensible choice. But some SDKs refuse non-TLS S3 endpoints outright. Either give Minio a certificate (it looks for public.crt and private.key in its certs directory, ~/.minio/certs by default, so mounting them at /certs only works if you also start the server with --certs-dir /certs), or terminate TLS at your ingress and let Minio speak HTTP behind it. The latter is the pattern I use, and the same one described in my MetalLB and ingress writeup.

A pod stuck in CrashLoopBackOff complaining about drive count. In distributed mode Minio is fussy about the number of drives it can see at startup — if one PVC failed to bind, the erasure-set maths doesn’t add up and it refuses to start rather than silently running degraded. Check that every StatefulSet replica actually got its volume: kubectl get pvc -n minio. A single Pending claim will take the whole thing down, which feels harsh but is genuinely the safer behaviour.

mc says access denied on a key that should work. Service-account policies are strict by design. If a scoped key can list but not write, its policy is missing s3:PutObject; if it can’t even list, it’s missing s3:ListBucket on the bucket resource. Read the policy JSON, not the docs — the actual attached policy is the source of truth.
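
When reading the policy JSON, the detail that usually bites is the resource: object actions such as s3:PutObject match keys under bucket/*, while s3:ListBucket matches the bucket ARN itself. The Python below is a deliberately simplified checker of my own for spotting that mismatch. It only looks at Allow statements and glob-matches actions and resources; it ignores Deny statements, conditions, and everything else a real policy engine evaluates.

```python
from fnmatch import fnmatchcase


def allows(policy: dict, action: str, resource: str) -> bool:
    """Return True if any Allow statement covers this action on this resource."""
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        resources = statement.get("Resource", [])
        # Policy JSON permits either a single string or a list for both fields
        actions = [actions] if isinstance(actions, str) else actions
        resources = [resources] if isinstance(resources, str) else resources
        if any(fnmatchcase(action, a) for a in actions) and any(
            fnmatchcase(resource, r) for r in resources
        ):
            return True
    return False


policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:PutObject"],
            "Resource": ["arn:aws:s3:::backups/*"],
        }
    ],
}

print(allows(policy, "s3:PutObject", "arn:aws:s3:::backups/db-dump.sql.gz"))  # True
print(allows(policy, "s3:ListBucket", "arn:aws:s3:::backups"))                # False
```

Note that adding s3:ListBucket to the existing statement would still not be enough here, because arn:aws:s3:::backups/* does not match the bare bucket ARN; the bucket itself has to appear in the resource list.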

Is it worth it?

If you self-host anything that wants an S3 bucket — and increasingly everything does — yes, emphatically. One small Minio instance turns “this app needs AWS” into “this app needs a URL and an access key I generate locally”, which is a huge reduction in cloud dependency, cost, and data-residency hand-wringing. The single-node Docker setup takes ten minutes and immediately makes a dozen tools happier.

The caveats are real, though. Run it single-node and it’s a single point of failure; you must go erasure-coded and multi-drive before trusting it with data you can’t lose. And remember it’s object storage, not a filesystem — don’t try to mount it as a home directory. For backups, artifacts, logs, registry layers, and anything that fits the write-once-read-many blob pattern, Minio has been one of the best returns on a weekend’s setup in my whole stack — and one of the very few pieces I’d genuinely miss if I had to give it up and go back to renting a bucket from someone else.

Written by Smarc

Founder and editor of vo.rs. A lifelong tinkerer who self-hosts far more than is sensible, hardens Linux boxes for fun, and prods the latest AI tools to see what they can really do. The how-to guides here are the notes Smarc wishes had existed the first time round.

Tagged: #minio #s3 #object-storage #self-hosting #kubernetes
