ComfyUI API Tutorial 2026: Automate Image Generation
TL;DR: ComfyUI’s built-in HTTP server accepts workflow JSON on port 8188 — the same JSON the GUI exports when you click “Save (API Format)”. You can queue a prompt, poll for completion, and download the result in under 50 lines of Python. No GUI, no browser tab, no manual clicking.
What you’ll have running after this guide:
- A Python function that submits any ComfyUI workflow and blocks until images are downloaded locally
- A batch script that generates 100 images with automated seed variation and organized output folders
- Working API-format JSON for an SDXL baseline workflow and a Flux.1 Schnell workflow you can adapt immediately
Why drive ComfyUI from the API
The GUI is fine for building and testing workflows. It’s a problem once you need more than a few images, or need to integrate generation into a pipeline.
Common use cases where the API makes sense:
- Batch generation — 50–1000 images with systematic prompt or seed variation
- CI pipelines — trigger image generation on a new product SKU, game asset, or dataset update
- Headless servers — GPU box in a closet or a RunPod cloud instance where running a browser makes no sense
- App backends — a FastAPI endpoint that accepts user prompts and returns generated images
If you’re generating one image at a time and want to tweak nodes visually, stay in the GUI. The API is for automation.
Prerequisites
- ComfyUI v0.23.0 (latest as of June 2026) installed and working in the GUI
- Python 3.10+
- Python packages: pip install requests websocket-client
- At least one working workflow in your ComfyUI GUI
If you haven’t installed ComfyUI yet, the ComfyUI review covers installation from scratch. For GPU hardware requirements, the Stable Diffusion 8GB VRAM guide has a good breakdown of what each model tier needs.
Step 1: Start ComfyUI with API access enabled
By default, ComfyUI only listens on 127.0.0.1 — accessible only from the same machine. That’s fine if you’re scripting locally. For a remote GPU box or Docker container, add --listen:
# Local access only (same machine)
python main.py --port 8188
# All interfaces (for remote scripting or Docker)
python main.py --listen 0.0.0.0 --port 8188
# With VRAM optimization for 8–12 GB cards
python main.py --listen 0.0.0.0 --port 8188 --lowvram
Once running, verify it’s up:
curl http://127.0.0.1:8188/system_stats
# {"system": {"os": "posix", "python_version": "3.10.x", ...}}
The server starts a REST API and a WebSocket server on the same port. There is no authentication by default — if you expose this to a network, use a firewall rule or reverse proxy with auth.
Step 2: Export your workflow in API format
The GUI workflow JSON and the API workflow JSON are different formats. The GUI format includes node positions, colors, and UI state. The API format is stripped down to just inputs and connections — which is all the server needs.
To export:
- Open ComfyUI in your browser
- Go to Settings (gear icon) → enable “Dev Mode Options”
- A new button appears in the menu: “Save (API Format)”
- Click it to download workflow_api.json
The API format uses numeric string keys ("1", "2", "3") as node IDs. Each node has a class_type and an inputs object. Connections between nodes are expressed as ["source_node_id", output_index] arrays rather than named references.
Here is a minimal SDXL baseline workflow in API format:
{
  "4": {
    "class_type": "CheckpointLoaderSimple",
    "inputs": {
      "ckpt_name": "sd_xl_base_1.0.safetensors"
    }
  },
  "5": {
    "class_type": "EmptyLatentImage",
    "inputs": {
      "batch_size": 1,
      "height": 1024,
      "width": 1024
    }
  },
  "6": {
    "class_type": "CLIPTextEncode",
    "inputs": {
      "clip": ["4", 1],
      "text": "a photorealistic red fox in snow, golden hour lighting"
    }
  },
  "7": {
    "class_type": "CLIPTextEncode",
    "inputs": {
      "clip": ["4", 1],
      "text": "blurry, low quality, watermark, text"
    }
  },
  "3": {
    "class_type": "KSampler",
    "inputs": {
      "cfg": 7.0,
      "denoise": 1.0,
      "latent_image": ["5", 0],
      "model": ["4", 0],
      "negative": ["7", 0],
      "positive": ["6", 0],
      "sampler_name": "euler",
      "scheduler": "normal",
      "seed": 42,
      "steps": 20
    }
  },
  "8": {
    "class_type": "VAEDecode",
    "inputs": {
      "samples": ["3", 0],
      "vae": ["4", 2]
    }
  },
  "9": {
    "class_type": "SaveImage",
    "inputs": {
      "filename_prefix": "api_output",
      "images": ["8", 0]
    }
  }
}
For SDXL specifically, node "4" connects to both CLIP encoders and the KSampler via its three outputs: [0] = model, [1] = CLIP, [2] = VAE. The connection syntax ["4", 1] means “output index 1 of node 4.”
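Because connections are plain [node_id, output_index] pairs, a hand-edited workflow can be sanity-checked before it is posted. The following is a minimal sketch, not part of ComfyUI: find_broken_links is a hypothetical helper that only checks that each referenced node ID exists. It cannot verify output indices, since that needs the node definitions held by the server.

```python
def find_broken_links(workflow: dict) -> list:
    """Describe every connection that points at a node ID missing from the workflow."""
    problems = []
    for node_id, node in workflow.items():
        for input_name, value in node.get("inputs", {}).items():
            # A connection is a two-item list: [source_node_id, output_index].
            is_link = (
                isinstance(value, list)
                and len(value) == 2
                and isinstance(value[0], str)
                and isinstance(value[1], int)
            )
            if is_link and value[0] not in workflow:
                problems.append(f"{node_id}.{input_name} -> missing node {value[0]}")
    return problems


# Trimmed example: node "8" references a sampler node "3" that was deleted.
example = {
    "4": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}},
    "6": {"class_type": "CLIPTextEncode",
          "inputs": {"clip": ["4", 1], "text": "a red fox"}},
    "8": {"class_type": "VAEDecode",
          "inputs": {"samples": ["3", 0], "vae": ["4", 2]}},
}
print(find_broken_links(example))  # ['8.samples -> missing node 3']
```

An empty list means every connection resolves to a node that is present; the server still performs its own validation when the prompt is queued.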
Step 3: Queue a prompt from Python
import json
import uuid

import requests

SERVER = "127.0.0.1:8188"
CLIENT_ID = str(uuid.uuid4())


def queue_prompt(workflow: dict) -> str:
    """Submit a workflow. Returns the prompt_id for tracking."""
    payload = {"prompt": workflow, "client_id": CLIENT_ID}
    r = requests.post(f"http://{SERVER}/prompt", json=payload)
    r.raise_for_status()
    return r.json()["prompt_id"]
The client_id is a UUID you generate once per session. It ties your WebSocket connection to your HTTP requests so the server routes status messages back to you. You can skip it for pure HTTP polling, but you need it for WebSocket tracking.
The server responds immediately with {"prompt_id": "<uuid>", "number": <queue_position>}. Generation hasn’t started yet — the prompt is in queue.
Step 4: Poll for completion
Two approaches — choose based on your use case:
| Approach | Latency | Complexity | Best for |
|---|---|---|---|
| HTTP polling /history/{id} | ~1s overhead | Low (no extra library) | Scripts, batch jobs |
| WebSocket /ws?clientId=... | Near-real-time | Medium (event loop) | Apps that show progress |
HTTP polling (simpler)
import time


def wait_for_completion(prompt_id: str, poll_secs: float = 1.0) -> dict:
    """Block until the prompt finishes. Returns the history entry."""
    url = f"http://{SERVER}/history/{prompt_id}"
    while True:
        r = requests.get(url)
        r.raise_for_status()
        history = r.json()
        if prompt_id in history:
            return history[prompt_id]
        time.sleep(poll_secs)
The /history/{prompt_id} endpoint returns an empty dict {} while the prompt is queued or running, and the full result object once done. Polling every second adds at most 1 second of latency to your total generation time — acceptable for batch scripts.
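One caveat worth handling: a prompt that fails during execution can also end up with a history entry, so a polling loop that only waits for the entry to appear returns either way. The sketch below assumes the entry carries a status object with a status_str field of "success" or "error". That shape is an assumption based on recent ComfyUI builds, so check a real response from your server. prompt_succeeded is a hypothetical helper name.

```python
def prompt_succeeded(history_entry: dict) -> bool:
    """Return True when a finished prompt reports no error and has outputs."""
    # Assumed shape: {"status": {"status_str": "success" | "error", ...}, "outputs": {...}}
    status = history_entry.get("status", {})
    if status.get("status_str") == "error":
        return False
    # An entry without any outputs produced nothing to download.
    return bool(history_entry.get("outputs"))


ok = {"status": {"status_str": "success", "completed": True},
      "outputs": {"9": {"images": [{"filename": "api_output_00001_.png",
                                    "subfolder": "", "type": "output"}]}}}
failed = {"status": {"status_str": "error", "completed": False}, "outputs": {}}
print(prompt_succeeded(ok), prompt_succeeded(failed))  # True False
```

Calling this on the value returned by wait_for_completion lets a batch script log and skip failed prompts instead of silently downloading nothing.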
WebSocket approach (for real-time apps)
import websocket


def generate_with_ws(workflow: dict) -> str:
    """Queue and wait for completion via WebSocket. Returns prompt_id."""
    ws = websocket.WebSocket()
    ws.connect(f"ws://{SERVER}/ws?clientId={CLIENT_ID}")
    prompt_id = queue_prompt(workflow)
    while True:
        raw = ws.recv()
        if not isinstance(raw, str):
            continue  # binary preview frames: skip
        msg = json.loads(raw)
        if msg["type"] == "executing":
            data = msg["data"]
            if data["node"] is None and data["prompt_id"] == prompt_id:
                break  # null node = generation finished
    ws.close()
    return prompt_id
The executing message fires for each node as it runs. When node is None and the prompt_id matches yours, the entire graph has finished executing.
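For an app that shows progress, the same receive loop can turn each frame into a short status line. This is a sketch under assumptions: it relies on the sampler emitting "progress" messages whose data holds value and max (current and total steps), and describe_message is a hypothetical helper, not a ComfyUI function. The caller still has to compare prompt_id before treating "finished" as its own.

```python
import json
from typing import Optional


def describe_message(raw) -> Optional[str]:
    """Turn one WebSocket frame into a status line, or None to ignore it."""
    if not isinstance(raw, str):
        return None  # binary preview frame
    msg = json.loads(raw)
    data = msg.get("data", {})
    if msg.get("type") == "progress":
        # Assumed fields: "value" = current step, "max" = total steps.
        return f"step {data['value']}/{data['max']}"
    if msg.get("type") == "executing":
        node = data.get("node")
        return "finished" if node is None else f"running node {node}"
    return None  # other message types are not handled in this sketch


frame = json.dumps({"type": "progress", "data": {"value": 5, "max": 20}})
print(describe_message(frame))  # step 5/20
```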
Step 5: Download the generated images
import os


def download_images(history_entry: dict, output_dir: str = "output") -> list[str]:
    """Download all output images from a completed prompt."""
    os.makedirs(output_dir, exist_ok=True)
    saved = []
    for node_id, node_output in history_entry["outputs"].items():
        if "images" not in node_output:
            continue
        for img in node_output["images"]:
            params = {
                "filename": img["filename"],
                "subfolder": img["subfolder"],
                "type": img["type"],
            }
            r = requests.get(f"http://{SERVER}/view", params=params)
            r.raise_for_status()
            dest = os.path.join(output_dir, img["filename"])
            with open(dest, "wb") as f:
                f.write(r.content)
            saved.append(dest)
    return saved
The /view endpoint serves files from ComfyUI’s output/ directory. filename, subfolder, and type all come directly from the history response — don’t construct them manually.
Step 6: Batch generation with seed variation
Here is the full batch script:
def batch_generate(
    workflow_path: str,
    count: int = 100,
    seed_node_id: str = "3",
    output_root: str = "batch_output",
) -> list[str]:
    """Generate `count` images with sequential seeds."""
    with open(workflow_path) as f:
        workflow = json.load(f)
    all_paths = []
    for i in range(count):
        workflow[seed_node_id]["inputs"]["seed"] = i
        run_dir = os.path.join(output_root, f"{i:04d}")
        prompt_id = queue_prompt(workflow)
        history = wait_for_completion(prompt_id)
        paths = download_images(history, output_dir=run_dir)
        all_paths.extend(paths)
        print(f"[{i + 1:>4}/{count}] seed={i} → {paths}")
    return all_paths


if __name__ == "__main__":
    batch_generate("workflow_api.json", count=100)
Expected output for 20-step SDXL on an RTX 4090 (the numeric suffix is ComfyUI's own file counter, so it continues from whatever is already in its output folder):
[   1/100] seed=0 → ['batch_output/0000/api_output_00001_.png']
[   2/100] seed=1 → ['batch_output/0001/api_output_00002_.png']
...
[ 100/100] seed=99 → ['batch_output/0099/api_output_00100_.png']
A typical 20-step SDXL generation at 1024×1024 takes 3–5 seconds on an RTX 4090, so 100 images runs in roughly 5–8 minutes. On an RTX 3090, expect 7–12 seconds per image, so 100 images is 12–20 minutes. For overnight batch jobs where GPU cost matters, cloud GPU rental via RunPod runs an A100 at well under $1/hour.
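Seed variation is one axis; systematic prompt variation works the same way. The sketch below builds one workflow per prompt and seed combination. It assumes the SDXL example's node IDs ("6" for the positive prompt, "3" for the KSampler); build_variants is a hypothetical helper, and the IDs in your own export may differ.

```python
import copy
import itertools


def build_variants(workflow, prompts, seeds, prompt_node_id="6", seed_node_id="3"):
    """Return one independent workflow dict per (prompt, seed) pair."""
    variants = []
    for prompt, seed in itertools.product(prompts, seeds):
        wf = copy.deepcopy(workflow)  # never mutate the loaded template
        wf[prompt_node_id]["inputs"]["text"] = prompt
        wf[seed_node_id]["inputs"]["seed"] = seed
        variants.append(wf)
    return variants


# Trimmed template with only the two nodes this helper touches.
template = {
    "3": {"class_type": "KSampler", "inputs": {"seed": 42, "steps": 20}},
    "6": {"class_type": "CLIPTextEncode", "inputs": {"text": "placeholder"}},
}
variants = build_variants(template, ["a red fox in snow", "a barn owl at dusk"], [0, 1, 2])
print(len(variants))  # 6
```

Each returned dict can be passed to queue_prompt as-is. Deep-copying keeps the template untouched, which matters once prompts are queued faster than they finish.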
Flux.1 Schnell workflow for the API
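Flux.1 Schnell (Apache 2.0 licensed, 1–4 step generation) uses a different node structure from SDXL: the model, text encoders, and VAE are loaded by three separate nodes instead of one checkpoint loader. The latent, negative-prompt, decode, and save nodes ("4", "7", "8", "9") follow the same pattern as the SDXL example:
{
  "1": {
    "class_type": "UNETLoader",
    "inputs": {
      "unet_name": "flux1-schnell.safetensors",
      "weight_dtype": "fp8_e4m3fn"
    }
  },
  "2": {
    "class_type": "DualCLIPLoader",
    "inputs": {
      "clip_name1": "t5xxl_fp8_e4m3fn.safetensors",
      "clip_name2": "clip_l.safetensors",
      "type": "flux"
    }
  },
  "3": {
    "class_type": "VAELoader",
    "inputs": { "vae_name": "ae.safetensors" }
  },
  "4": {
    "class_type": "EmptyLatentImage",
    "inputs": {
      "batch_size": 1,
      "height": 1024,
      "width": 1024
    }
  },
  "6": {
    "class_type": "CLIPTextEncode",
    "inputs": {
      "clip": ["2", 0],
      "text": "a photorealistic red fox in snow, golden hour lighting"
    }
  },
  "7": {
    "class_type": "CLIPTextEncode",
    "inputs": {
      "clip": ["2", 0],
      "text": ""
    }
  },
  "5": {
    "class_type": "KSampler",
    "inputs": {
      "cfg": 1.0,
      "denoise": 1.0,
      "latent_image": ["4", 0],
      "model": ["1", 0],
      "negative": ["7", 0],
      "positive": ["6", 0],
      "sampler_name": "euler",
      "scheduler": "simple",
      "seed": 42,
      "steps": 4
    }
  },
  "8": {
    "class_type": "VAEDecode",
    "inputs": {
      "samples": ["5", 0],
      "vae": ["3", 0]
    }
  },
  "9": {
    "class_type": "SaveImage",
    "inputs": {
      "filename_prefix": "api_output",
      "images": ["8", 0]
    }
  }
}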
Critical Flux.1 differences vs SDXL:
- cfg must be 1.0; higher values break the output
- steps is 1–4 for Schnell (it’s a distilled model); use 20–28 for Flux.1 Dev
- No negative prompt is used; Flux ignores it, so leave it as an empty CLIPTextEncode
- Flux.1 Dev requires 16GB+ VRAM for full precision; fp8_e4m3fn brings it to ~10GB
- Flux.1 Schnell is Apache 2.0; Flux.1 Dev is non-commercial
See the Flux vs SDXL vs SD 3.5 comparison for a detailed breakdown of which model fits which hardware.
Common errors and fixes
{"error": "prompt invalid"} — Your workflow JSON references a node that doesn’t exist, or a connection points to a non-existent output index. Export a fresh API-format JSON from the GUI rather than hand-editing connections.
Images not appearing in /history — The SaveImage node wrote the file but the history endpoint shows empty outputs. Check that your SaveImage node is actually connected to the VAE decode output — a disconnected node produces no output entry.
requests.exceptions.ConnectionError — ComfyUI isn’t running, or started without --listen on a remote machine. Verify with curl http://<host>:8188/system_stats.
Prompt is accepted but never seems to finish: another prompt is probably ahead of yours in the queue. GET http://127.0.0.1:8188/queue shows what is running and what is pending. GET /prompt (no body) returns the remaining queue depth.
When NOT to use the ComfyUI API
The API is not the right tool when:
- You’re still building the workflow — Visual node connections and live preview are why the GUI exists. Debug in the GUI, then export.
- You need dynamic node graph changes — The API sends a static workflow JSON per request. If you need to conditionally swap entire subgraphs based on runtime input, you’ll do this by building different JSON objects in Python before posting — which is manageable but adds complexity.
- You want a managed API — If you don’t want to run the server yourself, services like RunComfy and RunDiffusion wrap ComfyUI with auth, scaling, and a cleaner REST interface. The tradeoff is cost and control.
- Your workflow changes constantly — If you’re adjusting nodes hourly, re-exporting the API JSON each time gets annoying. Consider ComfyUI custom nodes that expose parameters directly, reducing how often you need to re-export.
FAQ
Does ComfyUI’s API support uploading images for img2img?
Yes — POST /upload/image accepts a multipart file upload and returns {"name": "uploaded_filename.png", "subfolder": "", "type": "input"}. Pass the returned name as the image input on a LoadImage node in your workflow JSON.
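A minimal sketch of that round trip follows. The multipart field name "image" is an assumption based on ComfyUI's server code rather than something stated above, and upload_image and use_uploaded_image are hypothetical helper names. The node ID "10" in the usage comment stands for whichever node in your export has class_type LoadImage.

```python
SERVER = "127.0.0.1:8188"


def upload_image(path: str) -> dict:
    """POST a local file to /upload/image and return the server's JSON reply."""
    import requests  # imported here so the helper below works without it

    with open(path, "rb") as f:
        # Assumption: the server reads the file from a form field named "image".
        r = requests.post(f"http://{SERVER}/upload/image", files={"image": f})
    r.raise_for_status()
    return r.json()


def use_uploaded_image(workflow: dict, load_node_id: str, upload_reply: dict) -> dict:
    """Point a LoadImage node at the file name the server returned."""
    workflow[load_node_id]["inputs"]["image"] = upload_reply["name"]
    return workflow


# Usage, with "10" as a placeholder LoadImage node ID:
#   reply = upload_image("photo.png")
#   workflow = use_uploaded_image(workflow, "10", reply)
```

Use the name from the reply rather than the local file name, since the reply is the server's record of what was actually stored.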
Can I run multiple ComfyUI instances and load-balance across them?
Yes. Run separate instances on different ports (e.g., 8188 and 8189) and round-robin your SERVER variable in Python. There’s no built-in cluster mode, but since each instance is stateless per-prompt, a simple round-robin or queue-length check on /queue gives you basic load balancing.
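The queue-length check could look like the sketch below. It assumes each /queue response contains queue_running and queue_pending lists; the helper names are hypothetical, and the selection logic is deliberately kept separate from the HTTP call.

```python
def queue_depth(queue_state: dict) -> int:
    """Count running plus pending prompts in one /queue response."""
    running = queue_state.get("queue_running", [])
    pending = queue_state.get("queue_pending", [])
    return len(running) + len(pending)


def pick_server(queue_states: dict) -> str:
    """Return the server with the shortest queue (the first one wins a tie)."""
    return min(queue_states, key=lambda server: queue_depth(queue_states[server]))


def fetch_queue_states(servers: list) -> dict:
    """Ask every instance for its queue; maps "host:port" to the parsed JSON."""
    import requests  # imported here so the pure helpers above work without it

    return {s: requests.get(f"http://{s}/queue", timeout=5).json() for s in servers}


# Usage:
#   SERVER = pick_server(fetch_queue_states(["127.0.0.1:8188", "127.0.0.1:8189"]))
```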
What happens if ComfyUI crashes mid-generation?
The /history endpoint will never return a result for that prompt_id. Your wait_for_completion function will poll forever. Add a timeout — wrap the polling loop with a time.time() check and raise after N minutes.
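A sketch of that timeout is shown here. It uses time.monotonic() instead of time.time() so that system clock adjustments cannot stretch or shorten the wait. The fetch argument is a hypothetical callable that returns the parsed /history/{prompt_id} JSON, which keeps the loop testable without a running server.

```python
import time


def poll_until(fetch, prompt_id: str, timeout_secs: float = 600.0,
               poll_secs: float = 1.0) -> dict:
    """Poll fetch(prompt_id) until the prompt appears, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_secs
    while True:
        history = fetch(prompt_id)
        if prompt_id in history:
            return history[prompt_id]
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"prompt {prompt_id} not finished after {timeout_secs}s"
            )
        time.sleep(poll_secs)


# Usage with the requests library:
#   fetch = lambda pid: requests.get(f"http://{SERVER}/history/{pid}").json()
#   entry = poll_until(fetch, prompt_id, timeout_secs=300)
```

Pick the timeout from your slowest expected generation plus queue wait; a batch script can catch TimeoutError, log the seed, and move on.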
How do I change the output image resolution without re-exporting?
Modify the EmptyLatentImage node inputs in the loaded JSON before posting: workflow["5"]["inputs"]["width"] = 1920; workflow["5"]["inputs"]["height"] = 1080. The node IDs come from your exported JSON — check which node has "class_type": "EmptyLatentImage".
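To avoid hard-coding the node ID, the lookup by class_type can be done in code. This is a small sketch; find_nodes and set_resolution are hypothetical helper names, and it changes every EmptyLatentImage node it finds.

```python
def find_nodes(workflow: dict, class_type: str) -> list:
    """Return the IDs of all nodes with the given class_type."""
    return [nid for nid, node in workflow.items()
            if node.get("class_type") == class_type]


def set_resolution(workflow: dict, width: int, height: int) -> list:
    """Set width/height on every EmptyLatentImage node; returns the IDs changed."""
    ids = find_nodes(workflow, "EmptyLatentImage")
    for nid in ids:
        workflow[nid]["inputs"]["width"] = width
        workflow[nid]["inputs"]["height"] = height
    return ids


workflow = {
    "5": {"class_type": "EmptyLatentImage",
          "inputs": {"batch_size": 1, "height": 1024, "width": 1024}},
    "9": {"class_type": "SaveImage",
          "inputs": {"filename_prefix": "api_output", "images": ["8", 0]}},
}
print(set_resolution(workflow, 1920, 1080))  # ['5']
```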
Is there an OpenAPI/Swagger spec for the ComfyUI API?
Not an official one as of v0.23.0. The closest is the official routes documentation at docs.comfy.org and community-maintained references like the Runflow API endpoints guide. The surface is stable enough that community docs are reliable.
Sources
- ComfyUI GitHub — Comfy-Org/ComfyUI v0.23.0
- ComfyUI basic_api_example.py — official script example
- ComfyUI WebSocket API Overview — docs.comfy.org via Mintlify
- 9elements comfyui-api — open-source Python wrapper with workflow examples
- ComfyUI API Endpoints: The Complete 2026 Reference — Runflow
- How to Use ComfyUI API with Python — DEV Community