A cheat sheet on how to run a FastAPI application with Uvicorn as an ASGI server under systemd control and integrate it with nginx. Some more comments below (privacy, performance).
Create a virtual environment. Prerequisite (ubuntu):
apt install python3-venv
Installation of Uvicorn and FastAPI:
mkdir <application_dir>
cd <application_dir>/
python3 -m venv .
. ./bin/activate
pip install uvicorn
pip install fastapi
Create an application, e.g. hello.py:
from fastapi import FastAPI

app = FastAPI()

@app.get('/')
async def hello():
    return {"message": "hello"}
Monitoring via systemd:
/etc/systemd/system/uvicorn.service
[Unit]
Description=uvicorn daemon
After=network.target
[Service]
Type=exec
User=...
Group=...
WorkingDirectory=<application_dir>
Environment=...
#ExecStart=<application_dir>/bin/uvicorn --uds <unix_socket_path> --reload --root-path <mount_path> hello:app
ExecStart=<application_dir>/bin/uvicorn --uds <unix_socket_path> --workers <number_of_workers> --root-path <mount_path> hello:app
KillMode=mixed
PrivateTmp=true
[Install]
WantedBy=multi-user.target
where:
Type=exec is required as Uvicorn can't tell systemd its state (ready, failed etc.)
unix_socket_path is a freely chosen UNIX socket path that will be accessed from nginx; it must be writable by the User/Group as configured above; other ways of accepting connections are also possible: via a TCP socket or via a UNIX socket created by systemd
mount_path is a URL prefix (if any) under which the application is visible from outside (as configured in nginx); note that root-path does not affect the @app.get('/') path in the application's code above, you need to take care of that by yourself
number_of_workers collides with reload; for production environments a greater number of workers should be considered, while for development the reload option tracks changes in .py files and reloads the application automatically; choose one of the ExecStart lines as needed
Note: (TODO) access and error logs need to be configured here as well.
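At the ASGI level, --root-path simply ends up in the connection scope; route matching still happens against the path alone, which is why the @app.get('/') decorator is unaffected. A minimal stdlib-only sketch with a hand-written ASGI app and a hand-built scope (no server involved; the /api prefix is only an example mount_path):

```python
import asyncio
import json

# Minimal hand-written ASGI app: a server started with --root-path puts the
# prefix into scope["root_path"], but routing works on scope["path"] alone.
async def app(scope, receive, send):
    assert scope["type"] == "http"
    body = json.dumps({
        "path": scope["path"],                     # what @app.get() matches
        "root_path": scope.get("root_path", ""),   # the external URL prefix
    }).encode()
    await send({"type": "http.response.start", "status": 200,
                "headers": [(b"content-type", b"application/json")]})
    await send({"type": "http.response.body", "body": body})

# Drive the app by hand with a fake scope, the way a server started with
# `--root-path /api` would present a request for the external URL /api/.
async def call(scope):
    messages = []
    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}
    async def send(message):
        messages.append(message)
    await app(scope, receive, send)
    return messages

messages = asyncio.run(call({"type": "http", "path": "/", "root_path": "/api"}))
print(messages[1]["body"])   # the matched path stays "/" despite the prefix
```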
Activate and start the service:
systemctl enable --now uvicorn.service
Test the application (unix socket - Uvicorn - FastAPI):
curl -X GET --unix-socket <unix_socket_path> http://does-not-matter/
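The same smoke test can be done from Python with nothing but the standard library. To stay self-contained, the sketch below also starts a tiny throw-away HTTP-over-UNIX-socket server as a stand-in for Uvicorn; the temporary socket path is a placeholder for your real <unix_socket_path>:

```python
import http.client
import os
import socket
import socketserver
import tempfile
import threading

class Handler(socketserver.StreamRequestHandler):
    """Tiny stand-in for Uvicorn: answers every request with a fixed JSON body."""
    def handle(self):
        self.rfile.readline()                    # request line
        while self.rfile.readline().strip():     # skip headers up to blank line
            pass
        body = b'{"message": "hello"}'
        self.wfile.write(b"HTTP/1.1 200 OK\r\nContent-Length: "
                         + str(len(body)).encode() + b"\r\n\r\n" + body)

class UnixHTTPConnection(http.client.HTTPConnection):
    """http.client connection that talks to a UNIX socket instead of TCP."""
    def __init__(self, path):
        super().__init__("does-not-matter")      # host is irrelevant, as with curl
        self._path = path
    def connect(self):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(self._path)

sock_path = os.path.join(tempfile.mkdtemp(), "uvicorn.sock")  # demo socket path
server = socketserver.ThreadingUnixStreamServer(sock_path, Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = UnixHTTPConnection(sock_path)
conn.request("GET", "/")
response = conn.getresponse()
status, payload = response.status, response.read().decode()
print(status, payload)
server.shutdown()
```

Against the real service, the UnixHTTPConnection part alone, pointed at your actual socket path, is equivalent to the curl call above.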
Integrate it with nginx:
http {
upstream fastapi {
server unix:<unix_socket_path>;
}
}
server {
location /... {
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header Host $http_host;
proxy_redirect off;
proxy_pass http://fastapi/;
}
}
Details on the proxy_pass directive: specifying only the upstream name without a URI part (http://fastapi) makes nginx forward the original URI as-is to the upstream, while a trailing slash (http://fastapi/) replaces the part of the URI matched by the location prefix. It is also possible to reference the UNIX socket directly: http://unix:<unix_socket_path> or http://unix:<unix_socket_path>:/path/.
Consider blocking the automatically generated documentation of your application, e.g. by setting the three endpoint URLs to None when instantiating the FastAPI application: app = FastAPI(openapi_url=None, docs_url=None, redoc_url=None). Another approach would be to restrict access to these resources, like the one discussed on GitHub.
Given the application:
from fastapi import FastAPI
from time import sleep

app = FastAPI()

@app.get("/")
async def root():
    sleep(1)
    return {"message": "hello"}
And the time measuring command:
$ time -p { seq 1 200 | \
xargs -I % -n1 -P10 \
sh -c 'curl -s -o /dev/null -X GET --unix-socket <unix_socket_path> http://does-not-matter/; echo -n %,;'; echo done; }
where:
time -p { ... } measures the execution time of a complex command inside the curly braces
seq 1 200 generates values from 1 to 200
xargs executes the command given with sh -c ...
-I % makes the number generated by seq be inserted into the command in place of a % sign
-n1 takes one number generated by seq per command
-P10 allows only 10 commands to be running in parallel
sh -c is a trick to create another complex command
curl -s makes curl quiet
-o /dev/null ignores the resulting data
-X GET triggers an HTTP GET method
--unix-socket <unix_socket_path> tells curl to connect to the UNIX socket instead of a TCP one
http://does-not-matter/ tells curl what host name and path to request from the server - it just needs to match the @app.get("/") decorator
echo -n %, will print the progress
Note: don't forget the semicolons. They are really required after both the last command of the sh -c block and the last command of the time -p { ... } argument.
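The shape of that measurement (200 jobs fanned out over 10 parallel workers) can also be sketched in Python; fake_request just sleeps instead of calling curl, so the timing only illustrates how the -P10 parallelism divides the work:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_request(i):
    # Stand-in for one curl call; a real version would hit the UNIX socket.
    time.sleep(0.01)
    return i

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=10) as pool:           # like xargs -P10
    results = list(pool.map(fake_request, range(1, 201)))  # like seq 1 200
elapsed = time.perf_counter() - start
print(f"{len(results)} requests in {elapsed:.2f}s")        # roughly 200 * 0.01 / 10 seconds
```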
Results: removing the sleep(1) delay in the code and even increasing the number of the calls to 1000 does not show any significant difference between 1 and 10 workers. On my machine, in both cases the 1000 requests require ca. 2.5 - 3 seconds. Just after a few tests: I don't see huge differences.
Actually, flask did provide a solution for injecting environment-specific data (like database credentials):
app.config.from_object('config')
app.config.from_pyfile('config.py')
It seems there is no solution like this in FastAPI. Instead, you may read such variables out of a file using additional code or libraries like pydantic as described in the official docs.
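As a stand-in for flask's app.config, a small settings object can be filled from environment variables, which the unit file's Environment= lines can provide. This is only a sketch; the Settings class and the variable names DATABASE_URL and DEBUG are hypothetical, not a FastAPI API:

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    database_url: str
    debug: bool

def load_settings(env=None):
    """Build a Settings object from environment variables (hypothetical names)."""
    env = os.environ if env is None else env
    return Settings(
        database_url=env.get("DATABASE_URL", "sqlite:///./dev.db"),
        debug=env.get("DEBUG", "0") == "1",
    )

# In the application you would call load_settings() once at startup;
# here a plain dict stands in for the process environment.
settings = load_settings({"DATABASE_URL": "postgresql://db/app", "DEBUG": "1"})
print(settings)
```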