mTLS: Mutual TLS Between Services Without a Service Mesh

Both ends prove who they are, and you don't need Istio to do it

Smarc, 11-12-2025

The incident that made me care about mutual TLS was embarrassingly mundane. A metrics scraper on one of my hosts had been quietly hitting an internal API for months. When I finally audited what could actually reach that API, the answer was: anything on the same network segment. The service checked no identity at all. It answered a request the way a shop assistant answers a stranger — politely, completely, and with no idea who it was talking to. Nothing had gone wrong yet. But “nothing has gone wrong yet” is not a security posture, it’s a countdown.

Ordinary TLS doesn’t help here, and understanding why is the whole point. Ordinary TLS proves the server’s identity to the client. Your browser checks the certificate, sees a name it trusts, and gets on with the conversation. The server, meanwhile, has no idea who the client is — it’ll talk to anyone who completes the handshake. For traffic between services inside your own infrastructure that relationship is backwards. You usually care far more about which client is calling than the client cares about the server. Mutual TLS fixes exactly that: both ends present certificates, both ends verify, and an unauthenticated caller never gets past the handshake, let alone to your application code.

The usual advice at this point is “deploy a service mesh.” If you’re already running Kubernetes at scale, with Istio or Linkerd earning their keep on traffic shifting and observability, then fine — mTLS comes along for free and you should use it. But for a handful of services on a few VMs, dragging in a mesh purely to get client-certificate authentication is using a crane to hang a picture. The sidecar-per-pod overhead, the control plane, the debugging surface — all of it, for something you can do with a private CA and a few lines of proxy config. Let me show you the small version.

Why a private CA is the whole trick

mTLS is built on a private certificate authority. Every service gets a certificate signed by your CA, and every service is configured to trust only that CA. A certificate signed by anyone else — including the entire public web PKI, Let’s Encrypt included — is rejected outright. This is the inversion that makes it secure: there is no path for an outsider to obtain a certificate your services will accept, because the only key that can sign an acceptable one is sitting on a machine you control. Compare that to a shared secret or an API token, which leaks the moment it appears in one log file or one screenshot.
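For a service that terminates TLS in-process rather than behind a proxy, the "trust only that CA" rule can be sketched with Python's standard ssl module. The function name private_ca_context and its parameters are illustrative assumptions, not something from a particular tool; the file arguments are optional only so the sketch can be exercised without certificates on disk.

```python
import ssl
from typing import Optional


def private_ca_context(
    server_side: bool,
    ca_file: Optional[str] = None,
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Build a TLS context that trusts the private CA and nothing else."""
    protocol = ssl.PROTOCOL_TLS_SERVER if server_side else ssl.PROTOCOL_TLS_CLIENT
    # A bare SSLContext starts with an empty trust store. Unlike
    # ssl.create_default_context(), it does not load the system's public roots.
    ctx = ssl.SSLContext(protocol)
    # Both ends must be shown a certificate that chains to a trusted CA.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file is not None:
        ctx.load_verify_locations(cafile=ca_file)  # the private root, e.g. ca.crt
    if cert_file is not None:
        ctx.load_cert_chain(cert_file, key_file)  # this service's own identity
    return ctx
```

The design choice that matters is what is left out: no public roots are ever loaded, so a certificate from any other issuer has nothing to chain to.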

I reach for step-ca or cfssl in anything approaching production, for reasons I’ll get to. But for a small setup — and to actually understand what’s happening rather than cargo-culting a tool — plain openssl is enough. First the CA itself, then a certificate for a service:

# Root CA key and self-signed cert (10 years)
openssl req -x509 -newkey rsa:4096 -nodes \
  -keyout ca.key -out ca.crt -days 3650 \
  -subj "/CN=mylab-internal-ca"

# A server cert request for the api service
openssl req -newkey rsa:2048 -nodes \
  -keyout api.key -out api.csr \
  -subj "/CN=api.mylab.local"

# Sign it with the CA, valid 1 year
openssl x509 -req -in api.csr \
  -CA ca.crt -CAkey ca.key -CAcreateserial \
  -out api.crt -days 365

Repeat the last two steps for each client, changing the CN. That CN — or better, a Subject Alternative Name, which modern verifiers increasingly insist on — becomes the identity you can authorise on later. Guard ca.key like it’s the master key to the building, because it is: anyone who holds it can mint a certificate every one of your services will trust. It should never leave the CA host, and ideally lives on something with restricted access rather than sitting in the same directory as everything else.

Enforcing it at the edge with nginx

The cleanest place to require client certificates is usually the reverse proxy that already fronts the service. If you’ve ever set up automated server certificates — the way cert-manager keeps public TLS renewing itself — this is the same machinery pointed inward, with one extra directive that flips verification from optional to mandatory. nginx does mTLS with two lines: point it at the CA bundle and turn client verification on.

# http context: pull the CN out of the verified subject DN.
# nginx exposes the whole DN as $ssl_client_s_dn; it has no built-in
# per-field CN variable, so a map does the extraction.
map $ssl_client_s_dn $ssl_client_cn {
    default "";
    ~(?:^|,)CN=(?<cn>[^,]+) $cn;
}

server {
    listen 443 ssl;
    server_name api.mylab.local;

    ssl_certificate     /etc/ssl/api.crt;
    ssl_certificate_key /etc/ssl/api.key;

    # Require and verify the client certificate
    ssl_client_certificate /etc/ssl/ca.crt;
    ssl_verify_client      on;
    ssl_verify_depth       1;

    location / {
        # Pass the verified client identity upstream
        proxy_set_header X-Client-CN $ssl_client_cn;
        proxy_pass http://127.0.0.1:8080;
    }
}

With ssl_verify_client on, nginx refuses any request that arrives without a certificate signed by your CA: it answers with a 400 itself and never proxies the request, so not a single byte reaches the application behind it. That’s the part I like: the app doesn’t need to know anything about certificates. nginx exposes the verified subject as $ssl_client_s_dn, and the map block pulls the Common Name out of it and hands it to your backend as a header, so the application can layer authorisation (“is metrics.mylab.local allowed to call /admin?”) on top of the authentication nginx has already done. Keep the two ideas separate in your head: authentication is “who are you”, authorisation is “what are you allowed to do”, and mTLS gives you a trustworthy answer to the first so you can build the second.
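On the application side, the authorisation layer can be as small as a lookup keyed on the X-Client-CN header the proxy sets. What follows is a minimal sketch: the ALLOWED_PREFIXES table, the deploy.mylab.local client and the /metrics path are hypothetical, and a header like this is only trustworthy when the proxy is the sole route to the app.

```python
from typing import Dict, Tuple

# Hypothetical policy: which verified client names may call which path prefixes.
ALLOWED_PREFIXES: Dict[str, Tuple[str, ...]] = {
    "metrics.mylab.local": ("/health", "/metrics"),
    "deploy.mylab.local": ("/health", "/admin"),
}


def is_authorised(client_cn: str, path: str) -> bool:
    """Decide whether an already-authenticated client may call this path."""
    prefixes = ALLOWED_PREFIXES.get(client_cn, ())  # unknown callers get nothing
    # Match the prefix itself or anything beneath it, but not "/adminx".
    return any(path == p or path.startswith(p + "/") for p in prefixes)
```

The default is deny: a caller that is missing from the table, or a request with an empty header, matches no prefix at all.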

Prove it actually fails closed

The entire value proposition is that an unauthenticated request fails. So test that it does, because a verification directive that’s silently not working looks identical to one that is until the day it matters.

# No client cert: nginx answers 400 itself and never proxies the request.
# --cacert is needed here too; without it curl stops on the private root
# before client verification is ever exercised.
$ curl -s -o /dev/null -w '%{http_code}\n' \
       --cacert ca.crt https://api.mylab.local/health
400

# With a valid client cert: through to the app
$ curl --cert client.crt --key client.key \
       --cacert ca.crt https://api.mylab.local/health
{"status":"ok"}

If that first request reaches the application, your verification isn’t really on, and you’ve built a wall with a door propped open behind it. The usual culprit is that the application is listening on a port that bypasses the proxy entirely. Bind your service to 127.0.0.1 so nginx is the only way in, or you’ve authenticated the front door while leaving the back one wide open.
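Binding to loopback is a one-line decision in most servers. As a hedged illustration, here is a stand-in backend built on Python's standard http.server; HealthHandler and make_backend are hypothetical names, and port 8080 is simply the one the proxy config forwards to.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Tiny stand-in for the real application."""

    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        body = json.dumps({"status": "ok"}, separators=(",", ":")).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_backend(port: int = 8080) -> ThreadingHTTPServer:
    # Loopback only: nothing off-host can connect, so the proxy is the sole
    # way in. Binding to "0.0.0.0" or "" here would reopen the back door.
    return ThreadingHTTPServer(("127.0.0.1", port), HealthHandler)

# To run it: make_backend().serve_forever()
```

A quick check from another host should then fail to connect on port 8080 at all, which is the result you want.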

Naming identities properly with SANs

A word on the CN shortcut I’ve been leaning on, because it will eventually catch you out. The Common Name field is a legacy of an older era of certificates, and modern TLS stacks — Go’s in particular, since Go 1.15 — deliberately ignore it for hostname verification and look only at the Subject Alternative Name. If you sign certificates with just a CN and no SAN, a browser might grumble and a curl might still work, but a Go service verifying its peer will reject the certificate with x509: certificate relies on legacy Common Name field, and you’ll lose an hour deciding your CA is broken when it’s your extension policy that’s wrong.

The fix is to put the identity in a SAN. With openssl you supply an extensions file:

# san.cnf
[req]
distinguished_name = dn
req_extensions     = v3
[dn]
[v3]
subjectAltName = DNS:api.mylab.local

# then, at signing time:
openssl x509 -req -in api.csr \
  -CA ca.crt -CAkey ca.key -CAcreateserial \
  -extfile san.cnf -extensions v3 \
  -out api.crt -days 365

Once the name lives in a SAN, every verifier — browser, curl, Go, Python — agrees on it, and you can use that name as the authorisation key upstream with confidence. This is also the point where hand-rolled openssl starts feeling like a chore, because managing SAN files per service by hand is exactly the sort of fiddly, error-prone bookkeeping that tooling exists to remove. Keep that thought; it becomes the argument of the next section.
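If a Python service terminates TLS itself, the same SAN-only rule can be applied when reading the peer's identity. This sketch works on the dictionary that ssl.SSLSocket.getpeercert() returns for a verified peer; dns_names and peer_identity are hypothetical helper names, and the allowed set is whatever your own policy says.

```python
from typing import List, Optional, Set


def dns_names(peer_cert: dict) -> List[str]:
    """Return the DNS entries from the certificate's subjectAltName."""
    # getpeercert() reports SANs as a tuple of (type, value) pairs.
    return [value for kind, value in peer_cert.get("subjectAltName", ()) if kind == "DNS"]


def peer_identity(peer_cert: dict, allowed: Set[str]) -> Optional[str]:
    """Pick the first SAN that policy recognises; the CN is deliberately ignored."""
    for name in dns_names(peer_cert):
        if name in allowed:
            return name
    return None
```

A certificate carrying only a Common Name yields no identity here, which mirrors how the stricter verifiers behave.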

The part everyone underestimates: rotation

Issuing certificates is the easy 20%. The hard 80% is what happens when they expire — and they will, all at once, at 3am, because you set them all to 365 days on the same afternoon eleven months ago and forgot. An expired certificate takes a service down as surely as a crashed process, except the error message points at TLS rather than your code and you lose twenty minutes working that out. Worse, a stale revocation list means a certificate you thought you’d killed keeps working, which quietly undoes the entire point of having a CA you control.

This is the honest argument for tooling, and where step-ca earns its place. Run as an online CA, it issues short-lived certificates — hours, not months — with automated renewal built in, so rotation becomes the boring default rather than a 3am emergency. Short lifetimes also make revocation lists mostly unnecessary: a compromised certificate that expires in an hour is a much smaller problem than one valid for another eleven months. If you insist on staying with static OpenSSL certificates, then you must put expiry monitoring on them — Uptime Kuma checks certificate expiry natively and will nag you well before the cliff — and write down the reissue procedure while it’s fresh, not while it’s on fire. “We’ll remember” is precisely how you discover, in production, that you won’t.
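If you do stay with static certificates, the expiry check itself is small enough to script. A minimal sketch, assuming you feed it the notAfter string in the form openssl x509 -noout -enddate prints (for example "Jan  5 09:34:43 2018 GMT"); the function names and the 30-day threshold are illustrative choices, not a recommendation from any tool.

```python
import ssl
import time
from typing import Optional

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before the certificate's notAfter time (negative once expired)."""
    expires = ssl.cert_time_to_seconds(not_after)  # parses "Mon DD HH:MM:SS YYYY GMT"
    current = time.time() if now is None else now
    return (expires - current) / SECONDS_PER_DAY


def needs_renewal(not_after: str, threshold_days: float = 30, now: Optional[float] = None) -> bool:
    """True when the certificate is inside the renewal window or already expired."""
    return days_until_expiry(not_after, now) < threshold_days
```

Run from cron against every certificate you issued, this turns the expiry date into a warning with weeks of notice rather than an outage.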

Common ways it goes wrong

A few failure modes account for most of the lost afternoons:

“SSL certificate problem: unable to get local issuer certificate” on the client. The client isn’t trusting your CA. Pass --cacert ca.crt, or install the CA into the client’s trust store. This is not an mTLS problem, it’s the ordinary server-verification half still needing your private root.
Verification passes when it shouldn’t. Almost always the request reached the app directly, not through the proxy. Check what your service is bound to and confirm the proxy is the only reachable path.
Clock skew. Certificates are time-bound. A host whose clock has drifted will reject perfectly valid certificates as “not yet valid” or “expired”. Run NTP on everything; this bites hardest on VMs that were suspended and resumed.
SAN vs CN mismatches. Newer clients ignore the CN for hostname matching and want a Subject Alternative Name. If verification fails on a cert that looks correct, add the name as a SAN rather than relying on the CN alone.
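The failure modes that come with an error message can be turned into a rough triage table. This is only a sketch: the fragments are the wording OpenSSL and Go commonly use for these errors, the likely_cause helper is hypothetical, and the silent case (verification passing when it shouldn't) produces no message, so it cannot appear here.

```python
# Fragments of common verification errors, paired with the usual cause.
LIKELY_CAUSES = [
    ("unable to get local issuer certificate",
     "the client does not trust the private CA; pass the CA file or install it"),
    ("certificate is not yet valid",
     "clock skew on the verifying host; check NTP"),
    ("certificate has expired",
     "the certificate really expired, or the verifying host's clock runs ahead"),
    ("legacy common name",
     "the certificate has a CN but no SAN; reissue it with subjectAltName"),
]


def likely_cause(error_text: str) -> str:
    """Map a TLS error message to the most probable cause from the list above."""
    lowered = error_text.lower()
    for fragment, cause in LIKELY_CAUSES:
        if fragment in lowered:
            return cause
    return "unrecognised; check that the request actually went through the proxy"
```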

Is it worth it without a mesh?

For a small estate — say, fewer than a dozen services across a handful of hosts — genuinely yes. You get strong, cryptographic service identity for the price of a CA and a couple of proxy directives, and you sidestep the operational weight of a mesh sidecar riding along with every workload. It composes cleanly with the rest of a modest setup: services fronted by a proxy, secrets you actually control, monitoring on the things that expire.

The break-even point is rotation, not enforcement. Turning verification on is a five-minute job; keeping certificates fresh across a growing fleet is the part that eventually justifies real tooling. So my rule is simple. Start with a private CA and nginx. The moment manual renewal becomes a recurring chore — the moment you have enough services that “which certs expire this month” is a question you have to look up — adopt step-ca or an equivalent for automated short-lived certificates. Reach for a full service mesh only when you also want the traffic management, retries, and observability that justify its weight. mTLS on its own does not, and pretending otherwise is how homelabs end up running a control plane to protect three containers.

Written by Smarc

Founder and editor of vo.rs. A lifelong tinkerer who self-hosts far more than is sensible, hardens Linux boxes for fun, and prods the latest AI tools to see what they can really do. The how-to guides here are the notes Smarc wishes had existed the first time round.

Tagged: #mtls #tls #security #networking #self-hosting
