If you run gunicorn without any configuration, it can only handle one request at a time.
# running without special options
gunicorn example:app
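The command above assumes a module named example that exposes a WSGI callable named app. The original does not show that module, so the file below is only a hypothetical minimal sketch of what it might contain:

```python
# example.py (hypothetical module matching the `example:app` argument)

def app(environ, start_response):
    """Minimal WSGI application that replies with a plain-text greeting."""
    body = b"Hello, world!\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    # start_response is supplied by the WSGI server (gunicorn here).
    start_response("200 OK", headers)
    return [body]
```

Any WSGI application, such as a Flask or Django app object, could take its place.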
I recommend two ways to increase gunicorn's concurrency.
First, increase the number of gunicorn worker processes. By default, only one worker process is used to handle requests.
In the following example, we use -w 8 to specify the number of worker processes. In general, the more CPU cores you have, the more processes you can specify. In my experience, the number of processes can be twice the number of CPU cores.
# on my 4-core computer, I let gunicorn run with 8 worker processes
gunicorn -w 8 example:app
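If you would rather not hard-code the number, the same rule of thumb can be put into a gunicorn configuration file, which is plain Python. This is a sketch: the file name gunicorn.conf.py is my choice, and the factor of 2 is just the rule of thumb from above, not a fixed recommendation.

```python
# gunicorn.conf.py (sketch; the file name is an assumption)
import multiprocessing

# Rule of thumb from the text: twice as many worker processes as CPU cores.
workers = multiprocessing.cpu_count() * 2
```

You would then start the server with gunicorn -c gunicorn.conf.py example:app.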
Now we can handle 8 requests at a time. That's not enough, so we can also use multithreading.
The default number of threads per process is 1. In general, the more IO your application does, the more threads you should use. IO jobs are things like accessing a database and reading or writing files. If you are unsure, start with a number like 50 and tune it later.
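To see why threads help with IO-heavy work, here is a small standalone sketch that is independent of gunicorn. It assumes time.sleep is a fair stand-in for a blocking database query or file read, and the names fake_io_job and run_jobs are made up for this illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_job(_):
    # Stand-in for blocking IO such as a database query (an assumption).
    # Like real blocking IO, sleeping lets other threads run in the meantime.
    time.sleep(0.05)


def run_jobs(num_threads, num_jobs=10):
    """Run num_jobs fake IO jobs on num_threads threads; return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        list(pool.map(fake_io_job, range(num_jobs)))
    return time.perf_counter() - start


one_thread = run_jobs(1)
ten_threads = run_jobs(10)
print(f"1 thread: {one_thread:.2f}s, 10 threads: {ten_threads:.2f}s")
```

With one thread the jobs wait for each other; with ten threads they overlap, so the total time drops. CPU-bound work would not benefit in the same way.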
In the following example, we use --threads 50 to specify the number of threads per process:
# 8 worker processes, 50 threads per process.
gunicorn -w 8 --threads 50 example:app
Finally, we can handle 400 requests at a time: 400 equals 8 * 50.
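The arithmetic behind that figure can be written as a tiny helper. This is only an illustration of the multiplication; theoretical_capacity is a hypothetical name, and the result is an upper bound rather than a measured throughput.

```python
def theoretical_capacity(workers: int, per_worker: int) -> int:
    """Upper bound on in-flight requests: processes times slots per process."""
    return workers * per_worker


# 8 worker processes with 50 threads each
capacity = theoretical_capacity(8, 50)
print(capacity)  # 400
```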
As in the previous chapter, multiprocessing is still used, but instead of threads we use gevent here. Gevent is a coroutine-based Python networking library. In theory, gunicorn's gevent worker can achieve higher concurrency than the multithreaded worker. You can test both of them on your server and choose the one with better performance.
Install gevent if you do not already have it:
pip install gevent
In the following example, we use -k gevent --worker-connections 1000 to select the gevent worker and specify the number of concurrent connections per process:
# 8 worker processes, 1000 gevent concurrency per process.
gunicorn -w 8 -k gevent --worker-connections 1000 example:app
Now we can handle 8000 requests at a time: 8000 equals 8 * 1000.
8000 is a theoretical value; whether you can reach it depends on your usage scenario. As with multithreading, the more IO your application does, the better gevent works. Just test it in your environment.
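The same gevent setup can also be kept in a configuration file instead of command-line flags. This is a sketch with the values from the example above; the file name gunicorn.conf.py is my choice, and gevent still has to be installed for the worker to start.

```python
# gunicorn.conf.py (sketch; the file name is an assumption)

workers = 8                # same as -w 8
worker_class = "gevent"    # same as -k gevent
worker_connections = 1000  # same as --worker-connections 1000

# Theoretical ceiling on simultaneous requests, not a guaranteed throughput.
theoretical_max = workers * worker_connections
```

You would then run gunicorn -c gunicorn.conf.py example:app and compare the results with the threaded setup on your own workload.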