Stable Diffusion Workflows: Turning ComfyUI into an Image API
Stop clicking the canvas and start POSTing JSON

ComfyUI is usually sold as a node editor — a sprawling graph of boxes and wires you drag around to build a Stable Diffusion pipeline. That’s how most people meet it, and it’s genuinely the most flexible front end for local image generation. But the canvas is the boring part. The interesting part is that everything you build on it is just a JSON document, and ComfyUI happily executes that JSON over an HTTP API. Once you realise that, ComfyUI stops being a toy you click and becomes an image-generation service you can call from anything — a cron job, a build pipeline, a webhook handler.
This post assumes you already know what ComfyUI is and have a workflow that produces images you like. The goal here is to stop touching the mouse and drive the whole thing headlessly.
1 The two JSON formats
This trips everyone up once, so let’s get it out of the way. When you save a workflow from the UI, you get the UI format — it includes node positions, link metadata, and other editor cruft. The API does not want that. It wants the API format, which is a flat map of node IDs to their class and inputs.
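To get it, enable “Dev mode” in the ComfyUI settings, then use Save (API Format). The exact labels depend on your version; newer frontends offer an API export from the Workflow menu instead. You’ll get something like this (trimmed):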

    {
      "3": {
        "class_type": "KSampler",
        "inputs": {
          "seed": 42,
          "steps": 25,
          "cfg": 7.0,
          "sampler_name": "euler",
          "model": ["4", 0],
          "positive": ["6", 0],
          "negative": ["7", 0],
          "latent_image": ["5", 0]
        }
      },
      "6": {
        "class_type": "CLIPTextEncode",
        "inputs": { "text": "a lighthouse at dusk", "clip": ["4", 1] }
      }
    }

The ["4", 0] notation is a wire: “take output slot 0 of node 4.” That’s the entire graph, expressed as data. Anything you can edit in the UI (the prompt, the seed, the steps) is a field you can overwrite from code before you submit.
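Because the graph is plain data, ordinary dictionary code can walk it. A minimal sketch, assuming the API-format shape shown above: the helper name wires is made up for illustration, and treating any two-element [string, integer] list as a link is a heuristic rather than something taken from ComfyUI itself.

```python
def wires(workflow):
    """Yield (dst_node, input_name, src_node, src_slot) for every link."""
    for node_id, node in workflow.items():
        for name, value in node.get("inputs", {}).items():
            # Heuristic (assumption): a link looks like ["<node id>", <output slot>].
            if (isinstance(value, list) and len(value) == 2
                    and isinstance(value[0], str) and isinstance(value[1], int)):
                yield node_id, name, value[0], value[1]

# Cut-down sample in the same shape as the export above.
sample = {
    "3": {"class_type": "KSampler",
          "inputs": {"seed": 42, "model": ["4", 0], "positive": ["6", 0]}},
    "6": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "a lighthouse at dusk", "clip": ["4", 1]}},
}

for dst, name, src, slot in wires(sample):
    print(f"node {src} output {slot} -> node {dst} input '{name}'")
```

Plain values such as the seed or the prompt text are skipped, which leaves exactly the fields that are safe to overwrite from code.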
2 Submitting a job
Generation is asynchronous. You POST the workflow to /prompt and get back a prompt_id; the actual render happens on the GPU queue. Here’s the minimal Python loop — load the template, patch the prompt and seed, queue it, then poll history for the result:

    import json, random, time, urllib.parse, urllib.request

    HOST = "http://127.0.0.1:8188"

    def queue(workflow):
        """POST an API-format workflow to /prompt and return its prompt_id."""
        data = json.dumps({"prompt": workflow}).encode()
        req = urllib.request.Request(
            f"{HOST}/prompt", data=data,
            headers={"Content-Type": "application/json"},
        )
        return json.load(urllib.request.urlopen(req))["prompt_id"]

    with open("workflow_api.json") as f:
        wf = json.load(f)

    # Override the parts we care about
    wf["6"]["inputs"]["text"] = "a foggy harbour, cinematic, golden hour"
    wf["3"]["inputs"]["seed"] = random.randint(0, 2**32)
    pid = queue(wf)

    # Poll until the job shows up in history
    while True:
        hist = json.load(urllib.request.urlopen(f"{HOST}/history/{pid}"))
        if pid in hist:
            outputs = hist[pid]["outputs"]
            break
        time.sleep(1)

    # Fetch the rendered image. "9" must be the ID of your workflow's
    # output node (e.g. SaveImage); check your own export.
    img = outputs["9"]["images"][0]
    # URL-encode the query so filenames with spaces or symbols survive
    query = urllib.parse.urlencode(
        {"filename": img["filename"], "subfolder": img["subfolder"], "type": img["type"]}
    )
    urllib.request.urlretrieve(f"{HOST}/view?{query}", "out.png")
    print("saved out.png")

That’s the whole API surface you need for batch work: /prompt to queue, /history/{id} to collect results, /view to download. No node editor in sight.
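One caveat with the bare polling loop: it spins forever if a job never reaches history. Here is one way to bound it, as a sketch. The function name wait_for_outputs and the injected fetch_history callable are illustrative choices, not part of ComfyUI; the only behaviour assumed is the one used above, where /history/{id} returns a map that contains the prompt ID once the job has been recorded.

```python
import time

def wait_for_outputs(pid, fetch_history, timeout=300.0, interval=1.0):
    """Poll fetch_history(pid) until pid appears, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        hist = fetch_history(pid)
        if pid in hist:
            return hist[pid].get("outputs", {})
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prompt {pid} not in history after {timeout}s")
        time.sleep(interval)

# Stand-in for the real HTTP call so the sketch runs without a server:
# it reports an empty history twice, then a recorded job.
_responses = iter([{}, {}, {"abc": {"outputs": {"9": {"images": []}}}}])
outputs = wait_for_outputs("abc", lambda pid: next(_responses), interval=0.01)
print(outputs)
```

Against a live server, fetch_history would wrap the same GET on /history/{id} that the script uses; passing it in as a callable keeps the waiting logic testable on its own.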
3 Going properly headless
For a server you’ll want ComfyUI running without a display, listening on something other than localhost:

    python main.py \
      --listen 0.0.0.0 \
      --port 8188 \
      --output-directory /srv/comfy/out \
      --disable-auto-launch

Put it behind a reverse proxy with authentication: the API has no auth of its own, so anything that can reach the port can run jobs on your GPU. For real-time progress rather than polling, there’s a WebSocket at /ws that streams execution events and even preview images mid-render, but for batch generation the poll-the-history pattern above is simpler and perfectly adequate.
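If you do take the WebSocket route, the part worth sketching is deciding when a job has finished. The message shape below (an "executing" event whose node field is null once the prompt is done, with binary frames carrying previews) follows ComfyUI's bundled WebSocket example script as best I can tell; treat it as an assumption and check it against your version. The function name is_finished is made up, and opening the socket itself, which needs a WebSocket client library, is left out.

```python
import json

def is_finished(frame, prompt_id):
    """Return True if a /ws frame signals that prompt_id has finished."""
    if isinstance(frame, (bytes, bytearray)):
        return False  # binary frames are assumed to be preview images
    msg = json.loads(frame)
    if msg.get("type") != "executing":
        return False
    data = msg.get("data", {})
    # Assumed convention: "node" is present and null when the prompt is done.
    return ("node" in data and data["node"] is None
            and data.get("prompt_id") == prompt_id)

# Hand-written frames standing in for what the socket would deliver.
frames = [
    json.dumps({"type": "executing", "data": {"node": "3", "prompt_id": "abc"}}),
    b"\x00\x00\x00\x01preview-bytes",
    json.dumps({"type": "executing", "data": {"node": None, "prompt_id": "abc"}}),
]
print([is_finished(f, "abc") for f in frames])
```

Once a frame reports completion you would still collect the images through /history and /view as before; the socket only replaces the waiting.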
The pattern that scales well: keep a library of API-format workflow templates as files, treat each as a function whose “arguments” are the handful of input fields you override, and wrap the queue-and-poll loop in a small service. Now “generate a hero image for this article” is one HTTP call from your publishing pipeline.
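The “template as a function” idea can be sketched as a small helper that copies a template and overwrites named inputs. Everything here is illustrative: the name render, the (node ID, input name) override keys, and the sample template are assumptions, not a ComfyUI API.

```python
import copy

def render(template, overrides):
    """Return a copy of template with {(node_id, input_name): value} applied."""
    wf = copy.deepcopy(template)  # never mutate the shared template
    for (node_id, name), value in overrides.items():
        inputs = wf[node_id]["inputs"]
        if name not in inputs:
            # Fail loudly rather than silently adding an input the node ignores.
            raise KeyError(f"node {node_id} has no input '{name}'")
        inputs[name] = value
    return wf

# Hypothetical two-node template; a real one would be loaded from a file.
template = {
    "3": {"class_type": "KSampler", "inputs": {"seed": 42, "steps": 25}},
    "6": {"class_type": "CLIPTextEncode", "inputs": {"text": "placeholder"}},
}
job = render(template, {("6", "text"): "a foggy harbour", ("3", "seed"): 7})
print(job["6"]["inputs"]["text"], job["3"]["inputs"]["seed"])
```

Copying before patching means one loaded template can serve many concurrent requests without one caller's prompt leaking into another's job.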
4 Is it worth it?
If you generate images one at a time for fun, stay on the canvas; the API buys you nothing. The moment you find yourself running the same workflow repeatedly with different prompts, or wanting images produced by some other system, flipping to the API is transformative. The investment is small: enable dev mode, save the API-format JSON, and learn the three endpoints above. The catch worth flagging is that the API format is tied to your exact node graph, so when you redesign a workflow the node IDs your code references can change. Keep the template and the calling code together, and whenever you re-export the template, update the IDs in the code in the same change. For anyone running ComfyUI on a homelab GPU and wanting it to do useful work unattended, this is the unlock.
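One cheap guard against renumbered nodes is to check, before queueing, that each ID the code touches still has the class it expects. A sketch, with the helper name check_bindings and the expected map both being illustrative:

```python
def check_bindings(workflow, expected):
    """Raise ValueError if any node ID is missing or has the wrong class_type."""
    problems = []
    for node_id, class_type in expected.items():
        actual = workflow.get(node_id, {}).get("class_type")
        if actual != class_type:
            problems.append(f"node {node_id}: expected {class_type}, found {actual}")
    if problems:
        raise ValueError("; ".join(problems))

# Hypothetical template standing in for a re-exported workflow.
workflow = {
    "3": {"class_type": "KSampler", "inputs": {}},
    "6": {"class_type": "CLIPTextEncode", "inputs": {}},
}
check_bindings(workflow, {"3": "KSampler", "6": "CLIPTextEncode"})  # passes silently

try:
    check_bindings(workflow, {"9": "SaveImage"})  # ID no longer in the graph
except ValueError as err:
    print(err)
```

Running this at load time turns a silently wrong image, or a KeyError deep in the job, into an immediate and readable failure.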




