Label Studio: Self-Hosted Data Annotation for Training Your Own Models
The unglamorous tool that decides whether your model is any good

There’s a comforting lie in machine learning circles that the model is the hard part. It isn’t. The model is the bit with the nice papers and the GitHub stars. The hard part — the part that determines whether your classifier works or quietly humiliates you in production — is the labels. Garbage labels, garbage model, no exceptions. And labelling is tedious, error-prone, and almost always done in some horror of a spreadsheet that loses your work when the browser crashes.
Label Studio is the open-source antidote. It’s a web app for annotating data — text, images, audio, video, time series, the lot — built by people who clearly suffered through bad labelling tools first. The community edition is free, self-hostable, and good enough that I’ve never reached for the paid tier. If you’re training your own models on your own data, this is the workbench you’ve been missing.
1 Standing it up
It’s a single Docker container with a Postgres database behind it. For a serious project you’ll want the database external so your labels survive a container rebuild, but the all-in-one image is fine for a first look.
```yaml
services:
  label-studio:
    image: heartexlabs/label-studio:latest
    ports:
      - "8080:8080"
    environment:
      # Placeholder address; use your own admin email.
      - LABEL_STUDIO_USERNAME=admin@example.com
      - LABEL_STUDIO_PASSWORD=change-this-now
      - DJANGO_DB=default
      - POSTGRE_NAME=labelstudio
      # Credentials must match the db service below.
      - POSTGRE_USER=labelstudio
      - POSTGRE_PASSWORD=change-this-too
      - POSTGRE_HOST=db
      - POSTGRE_PORT=5432
    volumes:
      - ./ls-data:/label-studio/data
    depends_on:
      - db
  db:
    image: postgres:16
    environment:
      - POSTGRES_DB=labelstudio
      - POSTGRES_USER=labelstudio
      - POSTGRES_PASSWORD=change-this-too
    volumes:
      - ./pg-data:/var/lib/postgresql/data
```

Bring it up, log in, and you’re looking at the project list. Each project gets a labelling config, a small XML dialect that defines what annotators see and what they produce. This is the clever bit: the same tool does sentiment tagging, bounding boxes, and named-entity recognition, just with a different config.
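To make the "same tool, different config" point concrete, here’s a sketch of a sentiment-classification config held as a Python string, plus a tiny checker. `Choices` and `Choice` are Label Studio’s classification tags; the label values and the `check_pairing` helper are my own illustration, not part of Label Studio. The helper only verifies that every `toName` points at a tag that declares that `name`, which is an easy thing to get wrong.

```python
import xml.etree.ElementTree as ET

# Illustrative sentiment config: one object tag (Text) and one control (Choices).
SENTIMENT_CONFIG = """
<View>
  <Text name="text" value="$text"/>
  <Choices name="sentiment" toName="text" choice="single">
    <Choice value="Positive"/>
    <Choice value="Negative"/>
    <Choice value="Neutral"/>
  </Choices>
</View>
"""


def check_pairing(config_xml):
    """Return the toName targets that no tag in the config declares as a name."""
    root = ET.fromstring(config_xml.strip())
    declared = {el.get("name") for el in root.iter() if el.get("name")}
    missing = []
    for el in root.iter():
        # Split on commas in case a control names more than one target.
        for target in (el.get("toName") or "").split(","):
            target = target.strip()
            if target and target not in declared:
                missing.append(target)
    return missing


if __name__ == "__main__":
    print(check_pairing(SENTIMENT_CONFIG))
```

An empty list means every control has something to attach to. It says nothing about whether the pairing is one Label Studio actually supports, so treat it as a typo catcher rather than a validator.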
2 The labelling config
Here’s a config for named-entity recognition over text — the sort of thing you’d use to train a model to pull names and organisations out of documents:
```xml
<View>
  <Labels name="label" toName="text">
    <Label value="PERSON" background="#FFA39E"/>
    <Label value="ORG" background="#D4380D"/>
    <Label value="LOCATION" background="#FFC069"/>
  </Labels>
  <Text name="text" value="$text"/>
</View>
```

Annotators drag-select a span, click a label, and Label Studio records the character offsets. Import a JSONL file where each line has a text field, and the tasks populate automatically. The exported annotations come back in a clean JSON shape with the spans, labels, and offsets: exactly what a tokeniser-based training pipeline wants.
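As a rough illustration of that export shape, the sketch below flattens it into plain rows. It assumes the JSON export layout of a list of tasks, each with `data` and `annotations[...].result[...].value` holding `start`, `end`, and `labels`, and it assumes `end` is exclusive; both are worth checking against your own export. `SAMPLE_EXPORT` is hand-written for the example, not real output, and `spans_from_export` is a hypothetical helper name.

```python
# Hand-written sample in the assumed export layout (not real Label Studio output).
SAMPLE_EXPORT = [
    {
        "data": {"text": "Ada Lovelace worked with Charles Babbage in London."},
        "annotations": [
            {
                "result": [
                    {
                        "from_name": "label",
                        "to_name": "text",
                        "type": "labels",
                        "value": {"start": 0, "end": 12, "labels": ["PERSON"]},
                    },
                    {
                        "from_name": "label",
                        "to_name": "text",
                        "type": "labels",
                        "value": {"start": 44, "end": 50, "labels": ["LOCATION"]},
                    },
                ]
            }
        ],
    }
]


def spans_from_export(tasks):
    """Flatten exported tasks into (span_text, start, end, label) rows."""
    rows = []
    for task in tasks:
        text = task["data"]["text"]
        for annotation in task.get("annotations", []):
            for item in annotation.get("result", []):
                if item.get("type") != "labels":
                    continue  # skip anything that is not a span label
                value = item["value"]
                start, end = value["start"], value["end"]
                for label in value["labels"]:
                    rows.append((text[start:end], start, end, label))
    return rows


if __name__ == "__main__":
    for row in spans_from_export(SAMPLE_EXPORT):
        print(row)
```

Slicing the span text back out of the source string, rather than trusting a stored copy, is a cheap way to notice offsets that have drifted.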
3 Pre-labelling, which is the actual point
Manual labelling from scratch is soul-destroying at scale. The feature that makes Label Studio worth the setup is the ML backend: you connect a model that pre-fills predictions, and your annotators correct rather than create. Correcting a label takes a fraction of the time of producing one, and the quality goes up because people are reviewing rather than grinding.
You point the project at an SDK-based backend, and Label Studio sends each task to it and displays the returned predictions as draft annotations:
```python
from label_studio_ml.model import LabelStudioMLBase


def run_my_model(text):
    """Placeholder for your own model; return spans in Label Studio's result format."""
    return []


class NERBackend(LabelStudioMLBase):
    def predict(self, tasks, **kwargs):
        results = []
        for task in tasks:
            spans = run_my_model(task["data"]["text"])
            # 0.9 is a hard-coded stand-in; report your model's real confidence.
            results.append({"result": spans, "score": 0.9})
        return results
```

Even a mediocre first model is useful here: it gets the obvious cases right, your humans fix the rest, and the corrected data trains a better model that pre-labels the next batch. That feedback loop is the whole game in applied ML, and Label Studio makes it concrete.
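The one detail the backend glosses over is what the spans have to look like. Here’s a minimal sketch, assuming your model emits `(start, end, label)` triples and that the project uses the NER config from section 2, so `from_name` is `label` and `to_name` is `text`. `to_ls_result` is a hypothetical helper of mine, not something the SDK provides.

```python
def to_ls_result(text, triples, from_name="label", to_name="text"):
    """Convert (start, end, label) triples into Label Studio span results."""
    result = []
    for start, end, label in triples:
        result.append(
            {
                # These two names must match the control and object tags in the config.
                "from_name": from_name,
                "to_name": to_name,
                "type": "labels",
                "value": {
                    "start": start,
                    "end": end,
                    "text": text[start:end],
                    "labels": [label],
                },
            }
        )
    return result


if __name__ == "__main__":
    sentence = "Ada Lovelace worked with Charles Babbage in London."
    print(to_ls_result(sentence, [(0, 12, "PERSON")]))
```

If the names don’t match the config, the predictions tend to simply not show up, so it pays to keep them in one place.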
4 The friction
It’s not all tidy. The labelling config XML is poorly documented and you’ll spend an afternoon discovering which control pairs with which object tag. Multi-annotator workflows — where you want several people labelling the same item to measure agreement — exist but the consensus tooling is thin in the community edition; a lot of the genuinely nice review and analytics features are paywalled. Performance on very large image projects can crawl if you don’t put the media behind proper cloud storage rather than local files. And the permissions model is basic, so if you’re running it for an external team, mind what you expose.
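Since the community edition leaves agreement largely to you, a rough number can be computed from the export yourself. The sketch below scores two annotators on one item as the Jaccard overlap of their exact `(start, end, label)` spans. That metric is a deliberately crude stand-in of my own choosing, not whatever the paid tier computes, and partial overlaps count as plain disagreement. `span_agreement` is a hypothetical helper name.

```python
def span_agreement(spans_a, spans_b):
    """Jaccard overlap between two annotators' exact (start, end, label) spans."""
    a, b = set(spans_a), set(spans_b)
    if not a and not b:
        return 1.0  # both annotators marked nothing, which counts as agreement
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    annotator_1 = [(0, 12, "PERSON"), (44, 50, "LOCATION")]
    annotator_2 = [(0, 12, "PERSON"), (44, 50, "ORG")]
    print(round(span_agreement(annotator_1, annotator_2), 2))
```

Averaging this over every doubly-labelled item gives a project-level figure that is good enough to spot an annotator, or a label definition, that needs a conversation.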
5 The verdict
If you are doing any serious supervised learning on your own data, you need an annotation tool, and a spreadsheet is not it. Label Studio is the best self-hostable option I’ve found — flexible across data types, free for the core workflow, and built around the pre-label-then-correct loop that actually scales. It’s overkill if you’re labelling fifty examples for a toy project; just use a script. But the moment you’re labelling thousands of items, or doing it with more than one person, the setup pays for itself within a day. It’s the unglamorous infrastructure that quietly decides whether your model is good, and it deserves more love than it gets.



