DevLogs
n8n Infrastructure Automation

The CPU Wall: When One Server Isn't Enough

Posted on

Video and audio workflows hitting 100% CPU on a Contabo server, bringing everything else down with them. Why adding more n8n instances looked like scaling but was actually just copying the problem.

The workflow was a video generation pipeline. On my local machine it ran fine — heavy, but fine. On the Contabo server, it consumed 100% CPU for the duration and made every other workflow that tried to run alongside it crawl or fail outright.

This wasn’t a bug. n8n on a single instance runs everything on the same Node.js process. Workflow triggers, webhook handling, API calls, media processing — all competing for the same resources. When a CPU-intensive workflow fires, it takes what it needs, and whatever else is running pays for it.

The local machine hid this. Better hardware, more cores, more headroom. The Contabo server had lower CPU performance by comparison, and under real workload conditions the difference became visible fast: execution times that took seconds locally took minutes on the server, and under concurrent load they started failing.

The First “Solution”

The instinct was straightforward: if one server is struggling, add another server. Run a second n8n instance, distribute workflows across both. More capacity.

What that actually meant in practice: set up a new server from scratch, install n8n, configure all the credentials again, duplicate every workflow that needed to run on the new instance, and debug whatever broke in the process. Each new server was several hours of setup. Credentials had to be re-entered manually — n8n doesn’t export credentials in a way you can just import elsewhere. Workflows had to be re-created or manually exported and imported and tested.

And when something needed to change — a workflow updated, a credential rotated, a new community node installed — every change had to happen on every server. Separately. Manually. The more servers added, the more places there were for things to be out of sync.

This isn’t scaling. It’s copying the problem. Each server was running a full n8n instance with everything duplicated — and each one was still subject to the same single-process bottleneck. Adding a third server would mean three sets of the same setup, three places to maintain, and still the same problem on each one individually if a heavy workflow fired.

The operational overhead wasn’t the only issue. The architecture was wrong at a level that more servers couldn’t fix. What was needed wasn’t more instances running all workflows — it was separation between the thing that manages workflows and the things that execute them.

That’s what led to looking at queue mode.