SBOM: Software Bill of Materials and Why You Should Care About Your Dependencies

Knowing what is actually in your software before someone else tells you

By Smarc

21-11-2025 2033 words 10 min read

Every time a serious supply-chain vulnerability lands, the same scramble begins. Someone in a chat channel asks “are we affected?” and the honest answer, for most teams, is “give us a few days and we’ll tell you.” When Log4Shell (CVE-2021-44228) broke in December 2021, that scramble ran for weeks across the entire industry, because a logging library buried five levels deep in build trees turned out to be everywhere and nobody had a map. That few days — or few weeks — is the gap an SBOM is meant to close. A Software Bill of Materials is just an inventory — a machine-readable list of every component that went into a build — but having one ready before the panic is the difference between an afternoon and a fortnight.

I’ll say up front that for a solo homelab this is probably overkill, and I’ll come back to who actually needs it. But if you ship software to anyone else, the case for an SBOM has gone from “nice to have” to “someone in procurement is about to ask for one,” and it’s worth understanding before that email arrives.

What an SBOM actually is

Strip away the acronym and an SBOM is a manifest. For each thing you ship, it lists the packages, libraries, and transitive dependencies that made it in, ideally with versions, licences, and cryptographic hashes. Think of it as the ingredients label on a packet of food, except the ingredients have their own ingredients, several layers down, and any one of them can poison the batch.

Two formats dominate: SPDX (an established Linux Foundation standard, now an ISO/IEC standard, good for licence compliance) and CycloneDX (born in the OWASP security community, strong on vulnerability tooling). Both are interchangeable enough that most tools can convert between them, so don’t agonise over the choice — pick CycloneDX if security is your driver, SPDX if licence and compliance reporting is. The regulatory backdrop is worth knowing too: the US Executive Order 14028 pushed SBOMs into federal procurement, and the EU’s Cyber Resilience Act is dragging the rest of the industry the same way. This is becoming a checkbox on contracts, not just a best practice.

The crucial word is transitive. Your package.json lists maybe forty direct dependencies. The resolved dependency graph can easily be closer to nine hundred packages, and the thing that bites you is almost always several layers deep, pulled in by a library you’ve never heard of because a library you did choose depended on it. An SBOM flattens that graph into something you can grep. This is exactly why “we only use trusted, well-known packages” is no defence: you don’t get to vet the transitive tail, and that tail is where the incidents live. The same principle drives my argument that every side project needs a backup plan: the failures that hurt are the ones you didn’t know were in scope.
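To make the direct-versus-transitive gap concrete, here is a minimal sketch that walks a dependency graph breadth-first. The graph and every package name in it are hypothetical, invented purely for illustration; a real resolver also has to deal with version ranges and duplicates.

```python
from collections import deque


def transitive_dependencies(graph, direct):
    """Return every package reachable from the direct dependencies."""
    seen = set()
    queue = deque(direct)
    while queue:
        pkg = queue.popleft()
        if pkg in seen:
            continue
        seen.add(pkg)
        # Packages missing from the graph are treated as leaves.
        queue.extend(graph.get(pkg, []))
    return seen


# Hypothetical toy graph: package -> packages it depends on.
graph = {
    "web-framework": ["http-parser", "logger"],
    "logger": ["string-format"],
    "string-format": ["tiny-pad"],
    "http-parser": [],
}
direct = ["web-framework"]

all_deps = transitive_dependencies(graph, direct)
transitive_only = all_deps - set(direct)
print(f"{len(direct)} direct, {len(transitive_only)} transitive")
```

One declared dependency turns into four packages you never chose, which is the same effect, at toy scale, that an SBOM is meant to expose.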

Generating one

You don’t write an SBOM by hand. You generate it from a build artefact. Syft (from Anchore) is the tool I reach for because it understands containers, filesystems, and most language ecosystems without configuration:

# Inspect a built image and emit CycloneDX JSON
syft myapp:1.4.0 -o cyclonedx-json > sbom.json

# Or scan a project directory directly
syft dir:. -o spdx-json > sbom.spdx.json

Both invocations run in seconds and need no prior configuration for the common cases. The output is verbose, but the shape underneath it is simple: an array of components, each with a purl (package URL) that identifies it unambiguously. A trimmed excerpt:

{
  "components": [
    {
      "type": "library",
      "name": "log4j-core",
      "version": "2.14.1",
      "purl": "pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1",
      "licenses": [{ "license": { "id": "Apache-2.0" } }]
    }
  ]
}

That purl is the magic. It’s a standardised identifier — pkg:type/namespace/name@version — that vulnerability databases key off, which means an SBOM and a scanner together can answer “are we affected?” in seconds rather than by manually spelunking through lockfiles.
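As a rough illustration of how that identifier breaks down, the sketch below splits a purl into its parts. It is deliberately simplified and is not a full implementation of the purl specification: qualifiers and subpaths are discarded rather than parsed, and the function name `parse_purl` is my own.

```python
from urllib.parse import unquote


def parse_purl(purl):
    """Split a simple purl into type, namespace, name and version.

    Simplified sketch: qualifiers (?...) and subpath (#...) are dropped.
    """
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a purl: {purl!r}")
    body = purl[len("pkg:"):]
    body = body.split("#", 1)[0].split("?", 1)[0]

    version = ""
    if "@" in body:
        # The version follows the last "@"; a scoped name encodes its "@" as %40.
        body, _, version = body.rpartition("@")

    parts = body.strip("/").split("/")
    if len(parts) < 2 or not parts[-1]:
        raise ValueError(f"purl needs at least a type and a name: {purl!r}")

    return {
        "type": parts[0].lower(),
        "namespace": "/".join(unquote(p) for p in parts[1:-1]) or None,
        "name": unquote(parts[-1]),
        "version": unquote(version) or None,
    }


print(parse_purl("pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1"))
```

For anything beyond a quick script, a maintained purl library is the safer choice, since the edge cases (encoding, qualifiers, type-specific rules) are where hand-rolled parsers go wrong.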

A note on when you generate it. There’s a real difference between a source SBOM (built from your manifests and lockfiles) and a build SBOM (built from the finished artefact). The source view knows what you declared; the build view knows what actually shipped, including things baked into the base image that never appeared in any manifest you wrote. For security purposes the build SBOM is the honest one — scan the thing that runs, not the thing you intended to run. Syft’s ability to inspect a built container image directly is precisely why I default to it over parsing lockfiles by hand.

Closing the loop with vulnerability scanning

An SBOM on its own is just a list. The value comes from feeding it to something that knows about CVEs. Grype pairs naturally with Syft and reads the SBOM directly, so you scan the inventory rather than re-scanning the artefact:

# Scan the SBOM we already generated
grype sbom:sbom.json --fail-on high

NAME        INSTALLED  FIXED-IN  TYPE  VULNERABILITY   SEVERITY
log4j-core  2.14.1     2.15.0    java  CVE-2021-44228  Critical

The --fail-on high flag is what makes this useful in CI: the pipeline goes red when a high-or-worse vulnerability appears in something you actually ship. Generate the SBOM at build time, store it as an artefact alongside the image, and you’ve got a permanent record of what each release contained. When a new CVE drops months later, you scan the stored SBOMs — no need to rebuild ancient images.

Wiring it into CI

The pattern that has served me best: generate, store, scan, gate. Here’s the gist as GitHub Actions steps, though the same two commands work anywhere:

- name: Generate SBOM
  run: |
    syft "ghcr.io/acme/api:${GITHUB_SHA}" \
      -o cyclonedx-json > sbom.json
# Store before gating, so the SBOM is kept even when the scan fails the build
- name: Upload SBOM as build artefact
  uses: actions/upload-artifact@v4
  with:
    name: sbom-${{ github.sha }}
    path: sbom.json
- name: Scan SBOM and gate on severity
  run: grype sbom:sbom.json --fail-on high

If you sign images with cosign, attach the SBOM as an attestation so it travels with the artefact and can be verified by whoever pulls it (cosign attest --predicate sbom.json --type cyclonedx ...). That turns the SBOM from a file on a CI runner into a provable claim about your release, cryptographically bound to the exact image digest rather than a mutable tag.

The xz wake-up call

For anyone who thought “we only pull from well-known, trusted repos” was a real defence, March 2024 settled the argument. A backdoor was discovered in xz-utils, a compression library so ordinary it ships in essentially every Linux distribution and is pulled in, transitively, by OpenSSH on many systems through liblzma. It had been introduced deliberately over roughly two years by a contributor who had earned commit access through patient, ordinary-looking open-source work. It was caught days before it would have shipped broadly in stable distributions by Andres Freund, a Microsoft engineer who noticed SSH logins were a few hundred milliseconds slower than they should have been and, rather than shrug it off, investigated. The vulnerability was assigned CVE-2024-3094. Nobody who scanned their own first-party code would have found it; the compromised library sat several layers down in a dependency graph almost nobody had ever mapped. Teams with an up-to-date SBOM for their base images could answer “are we affected?” (that is, precisely which images shipped the compromised liblzma versions 5.6.0 and 5.6.1) within minutes of the CVE landing, by grepping stored inventories rather than rebuilding and re-auditing everything they had ever shipped. That is the entire pitch for this whole exercise, condensed into one very close call.
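The “grep the stored inventories” step might look something like the sketch below, assuming SBOMs are stored as CycloneDX JSON files in one directory. The package-name set is my assumption (names differ between distributions), and the version handling is first-pass triage only; a real scanner such as Grype does proper version matching.

```python
import json
from pathlib import Path

# Upstream versions named in CVE-2024-3094.
AFFECTED_VERSIONS = {"5.6.0", "5.6.1"}
# Assumption: common package names for xz/liblzma across distributions.
PACKAGE_NAMES = {"xz", "xz-utils", "xz-libs", "liblzma", "liblzma5"}


def upstream_version(version):
    """Strip a distro epoch ("1:") and revision ("-1", "-r0") from a version."""
    return version.split(":", 1)[-1].split("-", 1)[0]


def affected_sboms(sbom_dir, names=PACKAGE_NAMES, versions=AFFECTED_VERSIONS):
    """Return (file, package, version) for each stored SBOM component that matches."""
    hits = []
    for path in sorted(Path(sbom_dir).glob("*.json")):
        with path.open() as fh:
            sbom = json.load(fh)
        for comp in sbom.get("components", []):
            name = comp.get("name", "")
            version = comp.get("version", "")
            if name in names and upstream_version(version) in versions:
                hits.append((path.name, name, version))
    return hits
```

Calling `affected_sboms("sboms/")` (a hypothetical directory) returns one tuple per affected component, which is enough to tell you which releases need a closer look first.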

Troubleshooting: where this goes wrong in practice

The commands are simple; making the pipeline stay useful rather than becoming ignored noise is the hard part. The failures I see most often:

The pipeline goes red on day one and stays red. A real image has real vulnerabilities the moment you scan it, most of them in the base OS layer and most of them unfixed upstream. If --fail-on high blocks every build immediately, people disable the gate within a week. Fix it by starting with a slim base image (distroless or Alpine, or Chainguard’s images if you want near-zero CVEs), and by running the scan without --fail-on at first, so you can review and triage the backlog before you turn the gate on.
The same fixed CVE keeps failing. Grype’s database can lag, or the fix is in a version you haven’t rebuilt to. Run grype db update in CI and rebuild against a current base; a stale scan of a stale image tells you nothing.
Findings you genuinely can’t act on. Sometimes the vulnerable component is in the base image and there is no patched version yet. Use a narrowly scoped ignore rule annotated with a comment and a review-by date, not a blanket suppression; grype supports an ignore list in .grype.yaml. Record the expiry alongside the rule and revisit it when the date passes rather than relying on the scanner to enforce it. Suppressions without expiries are how a scanner rots into theatre.
The SBOM misses things. A statically linked binary, a vendored dependency copied in without metadata, or a language ecosystem the generator doesn’t fully parse will all leave blind spots. Diff the SBOM’s component count against what you expect for a new project; a suspiciously short list means the generator didn’t understand something.
Version churn drowns the signal. Every base-image bump changes hundreds of components. Store SBOMs keyed by commit SHA so you can diff releases, and only alarm on newly introduced high-severity findings rather than the full list each time.
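For the version-churn problem in particular, diffing two stored SBOMs can be sketched as below. It assumes CycloneDX-style JSON with a `components` array, as in the excerpt earlier, and uses the purl minus its version as the identity of a package. The function names are hypothetical, and the sketch ignores the case where one SBOM contains several versions of the same package (common in npm trees), where later entries would overwrite earlier ones.

```python
def index_components(sbom):
    """Map each component's identity (purl without version) to its version."""
    index = {}
    for comp in sbom.get("components", []):
        purl = comp.get("purl", "")
        if "@" in purl:
            identity = purl.rpartition("@")[0]
        else:
            # Fall back to the bare purl, or the name when no purl is present.
            identity = purl or comp.get("name", "")
        index[identity] = comp.get("version", "")
    return index


def diff_sboms(old, new):
    """Report components added, removed, or changed in version between two SBOMs."""
    old_idx, new_idx = index_components(old), index_components(new)
    shared = set(old_idx) & set(new_idx)
    return {
        "added": sorted(set(new_idx) - set(old_idx)),
        "removed": sorted(set(old_idx) - set(new_idx)),
        "changed": sorted(
            (key, old_idx[key], new_idx[key])
            for key in shared
            if old_idx[key] != new_idx[key]
        ),
    }
```

Feeding only the `added` and `changed` entries into alerting is one way to keep a base-image bump from re-announcing every finding you already triaged.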

The honest limitations

SBOMs are inventories, not lie detectors. They tell you what’s present, not whether it’s reachable or exploitable in your particular usage — a vulnerable function you never call still shows up as a finding, which generates noise. That’s what VEX (Vulnerability Exploitability eXchange) documents are meant to address, by recording “yes it’s there, no it’s not exploitable, here’s why,” but VEX tooling is still maturing and the discipline of maintaining it is real work. SBOMs also only capture what the generator can see; a binary blob vendored without metadata is invisible, and dynamically loaded plugins slip through.
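The idea behind VEX can be sketched in a few lines. The record shapes here are a flat, hypothetical simplification of my own, not the actual OpenVEX or CycloneDX VEX schema; only the `not_affected` status name is borrowed from the VEX vocabulary.

```python
def apply_vex(findings, vex_statements):
    """Drop findings that a VEX statement marks as not_affected.

    Both inputs are simplified, hypothetical structures:
      finding   = {"vulnerability": "CVE-...", "purl": "pkg:..."}
      statement = {"vulnerability": "CVE-...", "product": "pkg:...",
                   "status": "not_affected", "justification": "..."}
    """
    suppressed = {
        (s["vulnerability"], s["product"])
        for s in vex_statements
        if s["status"] == "not_affected"
    }
    return [
        f for f in findings
        if (f["vulnerability"], f["purl"]) not in suppressed
    ]
```

Every other status leaves the finding in place, which is the safe default: a statement only silences a finding when someone has explicitly asserted it is not exploitable, for that vulnerability on that component.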

The other honest limitation is operational: an SBOM is a snapshot, not a subscription. The moment you generate it, it starts going stale relative to the flow of newly disclosed CVEs. The value comes from re-scanning stored SBOMs on a schedule, not from generating one once and filing it away. A vulnerability disclosed today applies to an image you built six months ago, and only a periodic re-scan of yesterday’s inventory will surface it. If you already run monitoring and alerting for your infrastructure — and you should, for the same reasons I lay out in why your Kubernetes cluster crashes at 2am — treat scheduled SBOM re-scans as one more signal feeding the same on-call awareness. An inventory nobody re-checks is just a document.
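A scheduled re-scan can be as small as the sketch below, run from cron or a nightly CI job. It assumes grype is on the PATH and that stored SBOMs sit as JSON files in a single directory; the function names and the directory layout are hypothetical. The grype invocation is the same one used earlier in this article.

```python
import subprocess
from pathlib import Path


def sbom_paths(sbom_dir):
    """List stored SBOM files in a stable order."""
    return sorted(Path(sbom_dir).glob("*.json"))


def rescan_command(path, fail_on="high"):
    """Build the grype command line for one stored SBOM."""
    return ["grype", f"sbom:{path}", "--fail-on", fail_on]


def rescan_all(sbom_dir, run=subprocess.run):
    """Re-scan every stored SBOM and return the file names whose scan exited non-zero.

    `run` is injectable so the loop can be exercised without grype installed.
    """
    failing = []
    for path in sbom_paths(sbom_dir):
        result = run(rescan_command(path))
        if result.returncode != 0:
            failing.append(path.name)
    return failing
```

A non-zero exit can also mean grype itself failed to run, so treat the returned list as “needs a look” rather than “confirmed vulnerable”, and route it to whatever alerting you already have.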

Is it worth it?

For anyone shipping software to customers, regulated industries, or government — yes, and increasingly it’s non-negotiable as procurement requirements catch up. If a customer contract or a regulator is going to demand one, you want the pipeline in place before the deadline, not scrambled together the week the auditor arrives.

For a hobby project running on your own homelab, generating an SBOM is largely overkill you’ll never look at — the “are we affected?” question answers itself when the entire org chart is you. That’s the same time-versus-value calculus I apply to most homelab tooling: automation earns its keep only when it saves a question you’d otherwise have to answer manually and under pressure. The sweet spot is any team large enough that “are we affected?” can’t be answered from one person’s memory, or any project whose blast radius extends past its author.

Adding two commands to a pipeline you already have is close to free, and the first time a big CVE lands and you answer in five minutes instead of five days, it pays for itself many times over. Start with Syft and Grype, gate on high severity with a sane base image so the gate is actually passable, store your SBOMs keyed by commit, and ignore VEX until the noise genuinely bothers you. It is one of the highest-leverage, lowest-effort things you can bolt onto a build — the rare security control that costs almost nothing and repays itself the first bad week.

Written by Smarc

Founder and editor of vo.rs. A lifelong tinkerer who self-hosts far more than is sensible, hardens Linux boxes for fun, and prods the latest AI tools to see what they can really do. The how-to guides here are the notes Smarc wishes had existed the first time round.

Tagged: #supply chain #security #dependencies
