
Running n8n in production feels great until webhook requests start queuing and execution times balloon. The fix isn’t always obvious. Should you throw more power at your single server or distribute the load?
This guide walks you through both scaling strategies with real numbers, practical limits, and honest trade-offs for your automation hosting.
Scaling n8n efficiently requires infrastructure that supports both vertical upgrades and distributed setups. The comparison table below highlights VPS hosting providers that offer flexible scaling options without disrupting performance, making it easier to adapt your setup as workflow demands increase.
VPS Hosting Providers That Support Flexible Horizontal and Vertical Scaling
| Provider | User Rating | Recommended For |
|---|---|---|
| Kamatera | 4.8 | Scalability |
| Hostinger | 4.6 | Affordability |
| IONOS | 4.7 | Developers |
Introduction to Scaling Strategies for n8n Automation
Understanding Horizontal and Vertical Scaling
Expanding system capacity is critical when deploying n8n for workflow automation. Get this wrong, and you’ll face frustrated users, failed automations, and midnight emergency calls. Get it right, and your infrastructure grows smoothly alongside your business.
Vertical scaling, often called “scaling up,” involves upgrading existing server resources. You add more CPU cores, RAM, and storage without changing your system footprint. Your workflows keep running exactly as before, just faster and with more headroom.
Horizontal scaling works differently. Instead of powering up one machine, you add more servers to distribute the workload. This approach offers built-in redundancy and massive scalability potential. When one server fails, others keep processing.
Here’s the catch: n8n operates as a single Node.js process by default. This architectural reality directly affects which scaling strategies work for your situation. Understanding both vertical and horizontal scaling helps you make informed decisions about your automation infrastructure.
Vertical Scaling: Powering Up Your Single Server
1. The Architecture of Vertical Scaling
Vertical scaling operates by increasing vCPU, RAM, and storage on a single machine. You don’t modify n8n’s architecture or touch workflow configurations. Everything stays centralized within one Node.js main process.
All workflow executions, webhook handling, and UI serving happen in one place. This simplicity is actually a feature, not a bug. For teams with limited DevOps expertise, a single server setup provides reliable automation hosting without complexity overhead.
When you’re just getting started or processing moderate workloads, vertical scaling makes perfect sense. You focus on building great automations instead of managing infrastructure.
2. Core Advantages of a Single-Server Setup

Simplicity drives the single-server approach. Updates happen in one place. Database backups run from one location. Security patching covers one system. No coordination required.
The cost efficiency is remarkable when paired with reliable VPS hosting. Real-world numbers tell the story clearly.
One practitioner successfully ran 76,000 execution runs weekly on a VPS costing under $30 monthly. That translates to approximately $0.00009868 per execution. Try getting that rate from any managed automation platform.
For teams processing data at moderate volumes, vertical scaling delivers excellent value without sacrificing performance or reliability.
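The per-execution figure quoted above can be reproduced with simple arithmetic. This is a minimal sketch: the function name is hypothetical, the $30 price is the upper bound mentioned in the text, and the four-weeks-per-month conversion is an assumption.

```python
def cost_per_execution(monthly_cost: float, weekly_executions: int,
                       weeks_per_month: float = 4.0) -> float:
    """Average cost of one execution for a flat monthly hosting bill."""
    # Assumption: a month is treated as exactly four weeks.
    monthly_executions = weekly_executions * weeks_per_month
    return monthly_cost / monthly_executions


rate = cost_per_execution(monthly_cost=30.0, weekly_executions=76_000)
print(f"${rate:.8f} per execution")  # prints "$0.00009868 per execution"
```

Because the VPS in the example cost under $30, the real figure would be slightly lower than this upper bound.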
3. The Main Process and Node.js Event Loop Limits
Node.js runs JavaScript on a single main thread. I/O is handled asynchronously, but synchronous code executes one piece at a time. This matters more than most people realize.
Even on a 128-core machine with 1TB of RAM, a single n8n instance utilizes only one core for synchronous code execution. All those extra CPU cores? They sit idle during heavy computation.
Extensive synchronous computation in a Code node will block the event loop. When this happens, webhook requests queue up. Response times degrade. Users notice.
This isn’t a flaw in n8n. It’s how Node.js works. Understanding this limitation helps you plan for growth and avoid performance bottlenecks before they impact your production environments.
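The blocking effect can be illustrated with any single-threaded event loop. The sketch below uses Python's `asyncio` purely as an analogy; n8n itself runs on Node.js, and every name here (`heartbeat`, `blocking_task`, `measure_max_gap`) is hypothetical. A "heartbeat" coroutine stands in for incoming webhooks, and a synchronous `time.sleep` stands in for heavy computation in a Code node.

```python
import asyncio
import time


async def heartbeat(gaps: list, interval: float = 0.01, beats: int = 20) -> None:
    """Record the time between consecutive wake-ups of a light task."""
    last = time.perf_counter()
    for _ in range(beats):
        await asyncio.sleep(interval)
        now = time.perf_counter()
        gaps.append(now - last)
        last = now


async def blocking_task(seconds: float) -> None:
    """Stand-in for heavy synchronous work: it never yields to the loop."""
    time.sleep(seconds)


async def measure_max_gap(block_seconds: float) -> float:
    """Return the longest pause the heartbeat experienced."""
    gaps: list = []
    beat = asyncio.create_task(heartbeat(gaps))
    await asyncio.sleep(0.05)           # let the heartbeat tick a few times
    await blocking_task(block_seconds)  # the loop is frozen during this call
    await beat
    return max(gaps)


max_gap = asyncio.run(measure_max_gap(0.3))
print(f"longest heartbeat gap: {max_gap:.3f}s")
```

The heartbeat is scheduled every 10 ms, yet its longest gap is at least as long as the blocking call. Queued webhooks behave the same way while synchronous code holds the loop.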
4. Hardware Ceilings and Concurrency Bottlenecks
Server hardware has absolute physical limits. There’s only so much CPU and memory you can pack into a single machine, regardless of budget.
Single-instance setups typically hit a performance ceiling between 5,000 and 10,000 daily executions, depending on workflow complexity. Beyond this point, degradation becomes noticeable. Workflow execution times increase. Webhooks respond more slowly.
The bigger risk? A single point of failure. If your one powerful server crashes, all organizational automations halt immediately. No fallback. No redundancy. Just downtime.
Upgrading typically requires restarting the application too. That introduces temporary downtime even during planned maintenance. For mission-critical automations, this limitation weighs heavily.
Horizontal Scaling: Expanding Across Multiple Machines
1. Unlocking Throughput with Queue Mode
Horizontal scaling in n8n is an intentional architectural choice. You enable it through queue mode, which fundamentally restructures how n8n processes work.
Queue mode decouples webhook reception from workflow execution. The main instance validates incoming requests, creates execution records, and passes them to a queue. It stays highly responsive to new webhooks because it’s not doing the heavy lifting.
This separation is powerful. Your main process handles traffic while dedicated worker processes crunch through the actual automation logic. Scaling horizontally becomes possible once this foundation is in place.
2. The Role of Redis and Worker Nodes

Queue mode operates as a multi-component distributed system. Redis serves as the message broker and job queue, coordinating work across multiple nodes.
Multiple worker instances run as independent Node.js worker processes. They pull jobs from Redis at their own pace. If one worker gets overloaded, others pick up the slack.
This architecture allows true multi-core and multi-machine utilization. You overcome the single-threaded limitations that constrain vertical scaling.
Teams can dedicate specific worker pools to high-priority automations. Heavy CPU-bound workflows get their own resources. Lightweight tasks run on separate workers. This specialization maximizes resource utilization across your infrastructure.
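As a rough illustration of how the pieces connect, the sketch below builds the environment a main instance and its workers would share in queue mode. The variable names (`EXECUTIONS_MODE`, `QUEUE_BULL_REDIS_HOST`, `QUEUE_BULL_REDIS_PORT`, `N8N_ENCRYPTION_KEY`) and the `n8n worker --concurrency` command follow n8n's queue mode documentation as understood here; verify them against the docs for your version. The helper functions, host name, and key value are hypothetical.

```python
def queue_mode_env(redis_host: str, redis_port: int,
                   encryption_key: str) -> dict[str, str]:
    """Environment shared by the main instance and every worker."""
    return {
        "EXECUTIONS_MODE": "queue",               # switch from regular to queue mode
        "QUEUE_BULL_REDIS_HOST": redis_host,      # where the job queue lives
        "QUEUE_BULL_REDIS_PORT": str(redis_port),
        # Workers need the same key as the main instance to read credentials.
        "N8N_ENCRYPTION_KEY": encryption_key,
    }


def worker_command(concurrency: int = 10) -> list[str]:
    """Command line for one worker process pulling jobs from Redis."""
    return ["n8n", "worker", f"--concurrency={concurrency}"]


# Hypothetical values for illustration only.
env = queue_mode_env("redis.internal", 6379, "replace-with-your-key")
for name, value in env.items():
    print(f"{name}={value}")
print(" ".join(worker_command(concurrency=5)))
```

Keeping the shared settings in one place makes it harder for a worker to drift from the main instance's configuration.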
3. Achieving Zero-Downtime Updates and Redundancy
Horizontal scaling inherently provides fault tolerance. A complete outage requires multiple simultaneous component failures. That’s far less likely than a single server crash.
Rolling updates become possible too. Workers restart one at a time while the system continues processing. Users never notice. Zero-downtime updates transform maintenance from a scheduled event into a routine background task.
For ultra-low latency requirements, deploy dedicated webhook processing instances. They handle incoming HTTP requests without competing for resources with heavy workflow executions.
4. Managing Operational Complexity in Distributed Systems
Horizontal scaling introduces significant complexity. There’s no sugarcoating this reality.
You need advanced DevOps expertise including Docker, Kubernetes, and Redis administration. Managing multiple servers simultaneously requires coordination skills. Load balancing adds another layer to configure and monitor.
Redis becomes a critical dependency. If your Redis instance fails, the entire queue system stops functioning. You’ve traded one single point of failure for another, though Redis clusters can mitigate this risk.
The database layer gets more complex too. PostgreSQL, multiple worker connections, and connection pooling all need careful configuration to work smoothly together.
Head-to-Head Comparison: Horizontal vs Vertical Scaling for n8n
Operational and Performance Characteristics Table
Understanding exact operational differences helps you select the right scaling strategies for your business. Below is a comprehensive comparison based on real-world capabilities:
| Dimension | Vertical Scaling | Horizontal Scaling (Queue Mode) |
|---|---|---|
| Architecture Complexity | Single process, minimal moving parts | Distributed system with Redis, workers, load balancer |
| Maximum Concurrent Executions | 5-15 without queue mode | 50-1,000+ (limited only by infrastructure) |
| Daily Execution Volume | 5,000-100,000 | 100,000-400,000+ |
| Response Time (Webhook) | Single digit ms until capacity limit | Consistent low latency across capacity range |
| Single Point of Failure | Yes (one server failure = outage) | No (multiple component redundancy) |
| Zero-Downtime Updates | Not easily achievable | Yes (rolling worker updates) |
| Database Load | Moderate, one connection pool | High, multiple connection pools |
Database Performance and Infrastructure Constraints
1. SQLite Limits vs PostgreSQL Capabilities

Default single-instance n8n uses SQLite. While perfect for prototyping, SQLite enforces file-level locking. Only one process can write at a time.
This limitation kills horizontal scaling immediately. Multiple worker processes cannot share a SQLite database effectively. Writes queue up, creating bottlenecks that defeat the purpose of distributing workload.
PostgreSQL resolves these limitations through row-level locking and multiversion concurrency control (MVCC). Database performance improves dramatically under load. Multiple workers read and write simultaneously without blocking each other.
For any serious production deployment, PostgreSQL is the baseline requirement. It handles data consistency challenges that SQLite simply cannot manage at scale.
2. Connection Pooling and Data Consistency Challenges
A 50-worker setup with concurrency of 10 requires potentially 500 simultaneous database connections. That’s severe connection pressure on any database server.
Multiple worker processes accessing the database simultaneously creates lock contention. Without careful planning, you’ll see timeouts and failed executions during demand surges.
Enable connection pooling through tools like PgBouncer. Establish dedicated database servers separate from worker machines. Consider database replication for read-heavy workloads. These best practices prevent database bottlenecks from limiting your scaling headroom.
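The connection estimate above is a simple product, and it is worth comparing against the database's connection limit. A minimal sketch follows; the function names are hypothetical, and the limit of 100 is an assumption based on PostgreSQL's usual default `max_connections`, which your server may override.

```python
def peak_db_connections(workers: int, concurrency: int) -> int:
    """Worst case: every concurrent execution holds its own connection."""
    return workers * concurrency


def needs_pooler(workers: int, concurrency: int,
                 max_connections: int = 100) -> bool:
    """True when the worst case exceeds what the database will accept."""
    return peak_db_connections(workers, concurrency) > max_connections


print(peak_db_connections(workers=50, concurrency=10))  # prints 500
print(needs_pooler(workers=50, concurrency=10))         # prints True
```

When the check returns `True`, a pooler such as PgBouncer or a lower per-worker concurrency keeps the database within its limit.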
3. Managing Execution History and Storage Requirements
A single-instance deployment might need only 512MB to 1GB for its database. Heavy queue mode deployments often require 10GB+ of storage due to massive execution history accumulation.
Processing more data means storing more data. Workflow data from millions of executions adds up quickly. Storage costs and query performance both suffer without intervention.
n8n enables execution pruning by default. Executions older than 14 days or exceeding 10,000 total get removed automatically. This maintains query speed and prevents unbounded storage growth. For large datasets, adjust pruning settings based on your compliance and debugging requirements.
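The pruning defaults described above map onto a few environment variables. This sketch assumes the variable names `EXECUTIONS_DATA_PRUNE`, `EXECUTIONS_DATA_MAX_AGE` (expressed in hours), and `EXECUTIONS_DATA_PRUNE_MAX_COUNT` from n8n's documentation; confirm them for your version. The helper `pruning_env` is hypothetical.

```python
def pruning_env(max_age_days: int = 14, max_count: int = 10_000) -> dict[str, str]:
    """Build pruning settings; the age is converted from days to hours."""
    return {
        "EXECUTIONS_DATA_PRUNE": "true",
        "EXECUTIONS_DATA_MAX_AGE": str(max_age_days * 24),  # hours
        "EXECUTIONS_DATA_PRUNE_MAX_COUNT": str(max_count),
    }


defaults = pruning_env()               # 14 days and 10,000 executions
short_retention = pruning_env(max_age_days=3, max_count=5_000)
print(defaults)
print(short_retention)
```

Shorter retention reduces storage and keeps queries fast, at the cost of less history for debugging and audits.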
Cost Management and Scaling Economics
1. The Cost Curve of Vertical Scaling

Vertical scaling follows a predictable, affordable cost curve for workloads under 100,000 monthly executions.
Typical VPS pricing tiers for single instances break down simply. Small workloads run fine on $6-10 monthly plans. Moderate workloads need $20-40 monthly. Premium single-instance workloads push toward $100-200 monthly before hitting hard capacity limits.
For budget-conscious teams, cheap n8n hosting delivers remarkable value. That practitioner running 76,000 weekly executions for under $30? That’s not exceptional. It’s achievable with proper planning.
2. Infrastructure Costs for Queue Mode Deployments
Queue mode introduces infrastructure layering costs. Expect a 5-10x cost multiplier compared to a single small VPS.
Base queue mode costs include Redis instances at $5-15 monthly. PostgreSQL databases run $20-40 monthly. Your main instance needs $10-15 monthly. Each worker instance adds $10-20 monthly.
Total minimum infrastructure cost ranges from $45-90 monthly for basic setups. Enterprise workloads with 50 workers scale to $500-1,000+ monthly.
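The totals above follow directly from the component prices. The sketch below adds them up for any worker count; the dictionary and function names are hypothetical, and the price ranges are the ones quoted in this section.

```python
# Monthly price ranges in USD (low, high), taken from the figures above.
COMPONENT_COSTS = {
    "redis": (5, 15),
    "postgres": (20, 40),
    "main": (10, 15),
}
WORKER_COST = (10, 20)  # per worker instance


def queue_mode_cost_range(workers: int) -> tuple[int, int]:
    """Return the (low, high) monthly cost for a queue mode deployment."""
    low = sum(lo for lo, _ in COMPONENT_COSTS.values()) + workers * WORKER_COST[0]
    high = sum(hi for _, hi in COMPONENT_COSTS.values()) + workers * WORKER_COST[1]
    return low, high


print(queue_mode_cost_range(1))   # prints (45, 90)
print(queue_mode_cost_range(50))  # prints (535, 1070)
```

Worker count dominates the bill as deployments grow, so per-worker concurrency is worth tuning before adding instances.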
Factor in engineering overhead too. Medium setups require 10+ hours monthly for maintenance. At typical DevOps rates, that can add up to roughly $2,000-10,000 a year in hidden costs. Compare this against managed alternatives before deciding.
3. Self-Hosted vs Cloud Pricing Models
Cloud-based n8n handles all infrastructure management. However, pricing becomes expensive at scale.
Managed cloud pricing starts around €20 monthly for 2,500 executions. Pro plans cost €50 monthly for 25,000 executions. Business plans jump to €667 monthly (approximately $720-800) for 50,000 executions.
Self-hosted queue mode achieves €0.001-0.003 per execution at scale. That compares favorably to €0.0133 per execution on managed Business plans.
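Per-execution prices are easier to compare once every plan is expressed in the same unit. A minimal sketch using only the plan prices quoted above; the label "Entry" for the €20 tier and the function name are hypothetical.

```python
# (monthly price in EUR, included executions), as quoted in this section.
PLANS = {
    "Entry": (20, 2_500),
    "Pro": (50, 25_000),
    "Business": (667, 50_000),
}


def per_execution(price: float, executions: int) -> float:
    """Price of a single execution when the full quota is used."""
    return price / executions


rates = {name: per_execution(*plan) for name, plan in PLANS.items()}
for name, rate in rates.items():
    print(f"{name}: EUR {rate:.4f} per execution")
```

These rates assume the full quota is consumed each month; unused executions push the effective price higher.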
A comparison of n8n with Zapier shows even greater savings potential for self-hosted deployments. Cloud services offer convenience while self-hosting offers economics.

Performance Tuning and Benchmarking
1. Single Instance Execution Limits
Official benchmarks show a single instance in main mode can handle up to 220 workflow executions per second. That’s approximately 19 million daily under ideal conditions.
These benchmarks rely on simple test workflows and well-resourced AWS instances. Real-world performance varies significantly based on workflow design complexity.
A workflow with 50 API calls and complex JavaScript may take 10+ seconds. At that duration, each execution slot completes at most about 0.1 executions per second, so throughput for that workflow depends on how many executions run concurrently. Performance tuning starts with understanding your actual workload characteristics, not theoretical maximums.
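The arithmetic behind these figures is short enough to check by hand. The sketch below converts a per-second rate into a daily total and applies Little's law (throughput equals concurrency divided by duration) to a slow workflow. Function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_capacity(executions_per_second: int) -> int:
    """Executions per day if the per-second rate were sustained all day."""
    return executions_per_second * SECONDS_PER_DAY


def throughput(concurrent_slots: int, seconds_per_execution: float) -> float:
    """Little's law: completed executions per second for a given concurrency."""
    return concurrent_slots / seconds_per_execution


print(daily_capacity(220))   # prints 19008000
print(throughput(1, 10.0))   # prints 0.1
print(throughput(10, 10.0))  # prints 1.0
```

A 10-second workflow needs ten concurrent slots just to complete one execution per second, which is why benchmark numbers for trivial workflows rarely transfer to real workloads.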
2. Multi-Instance Queue Mode Throughput
Testing with seven EC2 c5a.4xlarge instances achieved sustained throughput of hundreds of requests per second. Each instance ran 8GB RAM with dedicated resources.
Response times stayed consistently sub-100ms. This dramatically exceeded single-instance capabilities while maintaining high availability and fault tolerance.
Limiting factors shift in horizontal setups. CPU and memory constraints give way to database connection pool exhaustion and Redis throughput limits. Different bottlenecks require different solutions.
3. Optimizing for Workflow Complexity
Lightweight, data-passing webhook workflows sustain 1,000+ concurrent executions on properly configured horizontal setups. Complex workflows tell a different story.
Heavy CPU-bound workflows processing large datasets may sustain only 10-50 concurrent executions across multiple machines. Batch processing operations strain memory. Complex transformations consume CPU cycles.
Match your scaling strategies to specific workflow characteristics. Implementing caching strategies helps reduce redundant API calls and database queries. Profile your complex workflows to identify bottlenecks before scaling.
Our upcoming guide on performance tuning n8n dives deeper into optimization techniques for heavy workloads.
Best Practices for Monitoring and Reliability
1. Tracking Key Metrics for System Health
Horizontal setups require sophisticated monitoring to ensure distributed components stay healthy. Without visibility, problems compound before you notice them.
Key metrics to track start with Redis queue depth, where unbounded growth indicates workers cannot keep pace. Monitor worker success and failure rates. Track execution duration percentiles. Watch PostgreSQL connection pool saturation and query latency.
These metrics provide valuable insights into system health. They help you identify bottlenecks before users experience degradation.
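Two of these checks are easy to prototype before wiring up a full monitoring stack. The sketch below computes a nearest-rank percentile over execution durations and flags a queue whose depth rises on every sample. Both helpers and the sample data are hypothetical; in practice the inputs would come from your metrics system.

```python
import math


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile, e.g. pct=95 for p95."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def backlog_is_growing(depth_samples: list[int], min_samples: int = 3) -> bool:
    """True when each queue depth sample is larger than the previous one."""
    if len(depth_samples) < min_samples:
        return False
    return all(b > a for a, b in zip(depth_samples, depth_samples[1:]))


durations_ms = [120, 80, 300, 95, 110]       # made-up execution durations
print(percentile(durations_ms, 95))          # prints 300
print(backlog_is_growing([10, 40, 90]))      # prints True
print(backlog_is_growing([10, 40, 30]))      # prints False
```

A strictly increasing check is deliberately strict; a production alert would more likely compare averages over a longer window to avoid noise.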
2. Essential Observability Tools

Maintaining long-term reliability requires a dedicated observability stack. Several monitoring tools work well together for n8n deployments.
Prometheus and Grafana handle comprehensive metrics collection and visualization. They’re industry standards for good reason. Uptime Kuma provides dedicated service health alerting with minimal configuration.
RedisInsight offers queue inspection during active debugging sessions. When something breaks, you need to see inside the queue quickly.
For enterprise deployments, consider an ELK stack for centralized logging across multiple nodes. This helps track workflow data flow and troubleshoot failures across distributed systems.
Monitoring infrastructure requires 2-3 hours weekly for medium setups with 5-10 workers. Large deployments need 5+ hours weekly. Budget this time into your operational planning.
Setting Up Your VPS Infrastructure
Before implementing any scaling strategy, you need solid hosting infrastructure. A well-configured VPS provides the foundation for both vertical and horizontal approaches.
When setting up a VPS for n8n, consider resource usage patterns specific to automation workloads. Memory usage spikes during complex workflows. CPU cores matter for parallel processing in queue mode. Storage requirements grow with execution history.
Choosing the right VPS provider impacts everything from network latency to scaling flexibility. Look for providers offering easy resource upgrades for vertical scaling and multiple instance management for horizontal approaches.
Data integrity and sensitive data protection matter too. Ensure your provider offers appropriate security controls. Workflow configurations often contain API keys and credentials that require careful handling in production environments.
Licensing and Enterprise Considerations
1. Community Edition vs Commercial Licensing
The Community Edition supports both vertical scaling and horizontal scaling with unlimited workflows, executions, and users. It’s genuinely free for self-hosting.
Embedding n8n inside a SaaS product or reselling workflows requires an Embed License under the Sustainable Use License model. Making automation available to your customers changes the licensing equation.
Advanced features like SSO, log streaming, and advanced RBAC are reserved for paid tiers. Evaluate whether these capabilities justify the cost for your organization.
2. Multi-Main Setup for Enterprise High Availability
Multi-main setups represent advanced horizontal scaling patterns. Multiple main instances run concurrently behind a load balancer, providing high availability for both the UI and webhook processing layers.
All main processes must run identical n8n versions. Session persistence with sticky sessions ensures consistent user experience. Load balancing distributes traffic across instances.
This highly resilient configuration is restricted to Enterprise tier offerings. It requires commercial licensing but delivers maximum uptime for organizations where automation downtime means business losses.

Choosing Your Scaling Path
1. When to Choose Vertical Scaling
Vertical scaling fits organizations processing fewer than 50,000-100,000 monthly executions. It’s ideal for teams lacking advanced DevOps expertise who expect single-digit percent monthly growth.
A single well-configured VPS with PostgreSQL offers superior reliability over a poorly implemented horizontal setup. Complexity without expertise creates fragility.
When your workflows grow at predictable rates, vertical scaling lets you focus on automation logic instead of infrastructure management.
2. When to Transition to Horizontal Scaling
Horizontal scaling becomes necessary when workloads exceed 200,000-300,000 monthly executions. At this volume, vertical approaches hit hard limits that more resources cannot solve.
High availability requirements across multiple availability zones demand horizontal architecture. Consistent execution times regardless of concurrent load spikes require distributed processing.
As more users come to depend on your automations daily, the single point of failure risk becomes unacceptable. Scale effectively by distributing risk across multiple systems.
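The thresholds in this section can be condensed into a small decision helper. This is a simplification under stated assumptions: it uses 100,000 monthly executions as the upper end of the vertical range and 200,000 as the point where horizontal scaling becomes necessary, and it treats a high availability requirement as decisive. The function name and return labels are hypothetical.

```python
def recommend_scaling(monthly_executions: int,
                      needs_high_availability: bool = False) -> str:
    """Map workload volume and availability needs to a scaling approach."""
    if needs_high_availability or monthly_executions > 200_000:
        return "horizontal"
    if monthly_executions < 100_000:
        return "vertical"
    # Between the two thresholds: stay vertical but prepare for queue mode.
    return "vertical, plan queue mode"


print(recommend_scaling(40_000))                                # prints vertical
print(recommend_scaling(150_000))                               # prints vertical, plan queue mode
print(recommend_scaling(250_000))                               # prints horizontal
print(recommend_scaling(40_000, needs_high_availability=True))  # prints horizontal
```

Team expertise and workflow complexity matter as much as volume, so treat the output as a starting point for discussion.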
3. The Hybrid Approach for Growing Teams
Start vertically to minimize operational complexity during early phases. This lets you prove value before investing in infrastructure sophistication.
Scale vertically up to approximately 100,000 monthly executions using robust hosting. Many organizations never need more than this.
Introduce queue mode incrementally as volumes grow. Add one worker first. Learn the operational patterns. Build team expertise gradually without a sudden cliff transition.
This hybrid approach balances cost management with growth readiness. You avoid overengineering while preparing for scale.
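To judge how much time the vertical phase buys, compound growth can be projected against a threshold. A minimal sketch, assuming steady month-over-month growth; the function name and the example figures (40,000 executions growing 5% monthly toward a 100,000 threshold) are hypothetical.

```python
import math


def months_until(current: float, threshold: float, monthly_growth: float) -> int:
    """Months of compound growth before `current` reaches `threshold`."""
    if current >= threshold:
        return 0
    if monthly_growth <= 0:
        raise ValueError("monthly_growth must be positive")
    return math.ceil(math.log(threshold / current) / math.log(1 + monthly_growth))


print(months_until(current=40_000, threshold=100_000, monthly_growth=0.05))  # prints 19
```

In this example there are roughly a year and a half to build queue mode expertise before the threshold arrives, which supports introducing workers gradually.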
Conclusion
Choosing between horizontal vs vertical scaling for n8n depends on your current workload, growth trajectory, and operational capabilities. Vertical scaling delivers simplicity and cost efficiency for moderate volumes. Horizontal scaling unlocks massive throughput and high availability for enterprise demands.
Start simple. Measure actual performance. Scale when evidence supports it. The best infrastructure matches your real needs, not theoretical future requirements. Your automations deserve a foundation that grows with your business without creating unnecessary complexity today.
Next Steps: What Now?
- Audit your current n8n execution volumes and workflow complexity to establish baselines.
- Choose appropriate VPS hosting that supports your initial vertical scaling needs.
- Configure PostgreSQL instead of SQLite for any production deployment.
- Set up basic monitoring to track CPU, memory, and execution metrics.
- Plan your horizontal scaling threshold based on projected growth rates.
- Document your scaling strategy for team alignment and future reference.



