
NGINX Unit: new kid on the PSGI block
2023-10-12

🏷️ blog

For those of you not aware, there has been a new entry in the PSGI server software field, this time by NGINX. Let's dig in.

Performance Comparisons

Low Spec
# It may shock you to find I have worked with shared hosts.
Env: 4GB RAM, 2 CPU.

# This is basically saturating this host.
# We can push harder, but then ab itself starts falling over.
ab -n10000 -k -c1000 $APP_URI

Starman:
Requests per second:    198.94 [#/sec] (mean)
Time per request:       5026.727 [ms] (mean)
Time per request:       5.027 [ms] (mean, across all concurrent requests)
Transfer rate:          3835.30 [Kbytes/sec] received

uWSGI (I could only get to ~5k requests with 800 concurrent requesters before it fell over):
Requests per second:    74.44 [#/sec] (mean)
Time per request:       10746.244 [ms] (mean)
Time per request:       13.433 [ms] (mean, across all concurrent requests)
Transfer rate:          1481.30 [Kbytes/sec] received

Unit:
Requests per second:    275.60 [#/sec] (mean)
Time per request:       3628.429 [ms] (mean)
Time per request:       3.628 [ms] (mean, across all concurrent requests)
Transfer rate:          5333.22 [Kbytes/sec] received

This generally maps to my experience thus far with Starman and uWSGI -- while the latter has more features, and performs better under nominal conditions, it handles extreme load quite poorly. Unit was clearly superior regardless of the level of load -- here roughly 40% more throughput than Starman and nearly 4x uWSGI -- and could be pushed a great deal farther than either before falling down. Much of this was due to much more efficient memory usage. So, let's try things out on some (relatively) big iron.

High Spec
# You will be pleased to know I'm writing this off on my taxes
Env: 64GB RAM, 48 vCPU.

# This time we went straight to 100 workers per server, with 10 client threads driving 1k concurrent connections.
# We also switched to wrk, because ab falls over when you push it this hard.

Unit:
 wrk -t10 -c1000 -d 2m http://localhost:5001/
Running 2m test @ http://localhost:5001/
  10 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   239.60ms  188.61ms   2.00s    90.16%
    Req/Sec   335.32    180.29     1.26k    62.37%
  203464 requests in 2.00m, 799.57MB read
  Socket errors: connect 0, read 9680, write 14750, timeout 608
Requests/sec:   1694.14
Transfer/sec:      6.66MB

uWSGI:
wrk -t10 -c1000 -d 2m http://localhost:5000/
Running 2m test @ http://localhost:5000/
  10 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    60.56ms  112.75ms   1.99s    93.42%
    Req/Sec   268.76    188.69     2.66k    61.73%
  309011 requests in 2.00m, 1.17GB read
  Socket errors: connect 0, read 309491, write 0, timeout 597
Requests/sec:   2573.82
Transfer/sec:      9.97MB

Starman:
 wrk -t10 -c1000 -d 2m http://localhost:5000/
Running 2m test @ http://localhost:5000/
  10 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    24.90ms   47.06ms   1.99s    90.56%
    Req/Sec     4.04k   415.85     4.67k    92.86%
  480564 requests in 2.00m, 1.84GB read
  Socket errors: connect 0, read 0, write 0, timeout 58
Requests/sec:   4002.30
Transfer/sec:     15.73MB

These were surprising results. While Unit outperformed uWSGI handily, both were obviously falling down, with quite a few failed requests. Meanwhile, Starman handled the load without breaking a sweat and absolutely trounced both competitors. Japanese Perl still winning, clearly. Let's have a look at the auto-scaling features of uWSGI and Unit.
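To put "quite a few failed requests" in perspective, here's a quick back-of-the-envelope tally of socket errors per completed request, taken straight from the wrk output above (a rough sketch; wrk's read errors can also include connections the server closed deliberately):

```python
# Socket errors (read + write + timeout) per completed request,
# from the high-spec wrk runs above.
runs = {
    "Unit":    {"requests": 203464, "errors": 9680 + 14750 + 608},
    "uWSGI":   {"requests": 309011, "errors": 309491 + 0 + 597},
    "Starman": {"requests": 480564, "errors": 0 + 0 + 58},
}
for name, r in runs.items():
    print(f"{name}: {r['errors'] / r['requests']:.1%} socket errors per request")
```

By this crude measure Starman's error rate is effectively zero, while uWSGI logged more socket errors than completed requests.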

Auto-Scaling!
# Same as above, but with cheaper=1
uWSGI:
wrk -t10 -c1000 -d 2m http://localhost:5000/
Running 2m test @ http://localhost:5000/
  10 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    72.26ms   98.85ms   1.99s    95.18%
    Req/Sec   212.68    157.93   810.00     60.82%
  196466 requests in 2.00m, 760.87MB read
  Socket errors: connect 0, read 196805, write 0, timeout 305
Requests/sec:   1635.89
Transfer/sec:      6.34MB
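For reference, the cheaper setting mentioned above lives in uWSGI's ini config; a minimal fragment might look like this (the psgi path is illustrative, not my exact config):

```ini
[uwsgi]
plugin = psgi
# illustrative path to the PSGI app
psgi = /path/to/app.psgi
# hard cap on workers
processes = 100
# cheaper mode: scale down to a single idle worker
cheaper = 1
cheaper-initial = 1
```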

# Same as above, but processes are now set to 5 minimum and 100 max.
Unit:
wrk -t10 -c1000 -d 2m http://localhost:5001/
Running 2m test @ http://localhost:5001/
  10 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   329.91ms   67.84ms   1.14s    81.80%
    Req/Sec   277.56    142.12   720.00     69.52%
  10000 requests in 2.00m, 39.28MB read
  Socket errors: connect 0, read 6795, write 0, timeout 0
Requests/sec:     83.26
Transfer/sec:    334.92KB

This is just so hilariously bad that I can't help but think I'm holding Unit wrong, but I can't see anything in the documentation to mitigate it. If you need auto-scaling workloads, uWSGI is obviously still the place to be. Even upping the ratio of spare processes to the max to 80% isn't enough to beat uWSGI.
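For context, Unit's scaling knobs live in the application's processes object; assuming "5 minimum" maps to Unit's spare setting, the run above corresponds to a fragment along these lines (the idle_timeout value is arbitrary, as I haven't recorded mine here):

```json
"processes": {
    "max": 100,
    "spare": 5,
    "idle_timeout": 20
}
```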

Feature Comparisons

Here are the major features I use in uWSGI, and their counterpart in unit:

Both are configurable via APIs, which makes deploying new sites via orchestration frameworks like Kubernetes straightforward.
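To illustrate, Unit takes its entire configuration as JSON over a control socket. A minimal PSGI application config might look like this (paths and application name are hypothetical):

```json
{
    "listeners": {
        "*:5001": { "pass": "applications/tcms" }
    },
    "applications": {
        "tcms": {
            "type": "perl",
            "script": "/path/to/app.psgi",
            "processes": 100
        }
    }
}
```

This can be pushed live with something like curl -X PUT --data-binary @config.json --unix-socket /var/run/control.unit.sock http://localhost/config -- no restart required, which is what makes orchestration pleasant.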

Conclusion

Given uWSGI has been in "maintenance only" mode for some time, I would assume it's well on its way to being put out to pasture. NGINX is quite well funded and well liked, for good cause. Unit gets me the vast majority of what I wanted out of uWSGI and performs a heck of a lot better, save for when scaling is a concern. I'm not sure how sold I am on the name, given that's also what systemd calls each service, but I'll take what I can get. I also suspect that, given the support behind it, the performance gap versus something like Starman will close in time. For performance-constrained environments where scaling is unlikely, Unit gets my enthusiastic endorsement.

Postscript: The details

All testing was done on Ubuntu Jammy using the official Unit, Starman, and uWSGI packages. Both hosts were KVM virtual machines. The test was simply loading a minimally configured tCMS homepage -- a pure Perl PSGI app using Text::Xslate. It's the software hosting this website.

uWSGI configs used (with modifications detailed above) are here.
Starman configuration beyond defaults and specifying job count was not done.
Unit configs used (with modifications detailed above) are here.

© 2020-2023 Troglodyne LLC