Deployment

One of the main objectives of this class and its group work was to deliver a service that adds real value to a real-life problem. Our solution therefore had to be as practical and realistic as possible, and it was clear early on that deployment would play a crucial role in achieving this goal. We managed to deliver our service in a form that is deployable on a single Jetson Nano, including both backend and frontend. Hence, we can present a solution that is easily deployable on a single, small, and cost-effective device.

This chapter explains how the deployment process of the QSROA works and how the device can best be set up for use.

Hardware Setup

Since packing stations of QSRs are the main area of application of the QSROA, we suggest a somewhat standardized hardware setup.
The setup essentially revolves around the camera. We suggest mounting the camera, and therefore the Jetson Nano, at a fixed height of one meter above the counter, with the camera aligned parallel to the counter surface.

The Jetson itself needs to be connected to a local network via LAN. The packing area should preferably be well lit to avoid pronounced shadows. There are no particular requirements regarding the surface texture or the color of the counter.

Figure: Hardware setup

Docker

Originally, we planned to carry out the complete development process on other computers, then build the Docker containers and upload them to a container registry. On the Jetson Nano, one would only have to call docker-compose to pull and start the containers.

However, this concept ran into problems, as certain steps of our build process require the DeepStream SDK and CUDA to be present on the build system, the latter of which is only compatible with certain Nvidia GPUs. It might have been possible to set up a suitable virtual machine to work around the issue, but the effort required would not have been justifiable.

We therefore adopted an adjusted concept with three distinct Docker containers: one for the frontend, one for the backend, and one for the inference. As before, both the frontend and the backend containers are built on PCs; the inference container, by contrast, is built on the Jetson Nano itself.
The following paragraphs describe the containers in more detail.

The frontend container is created in two stages. First, a production build of the React application is carried out in a node.js container: the required packages (package.json) are installed and the application's source code is copied into the container. The resulting production build is then copied into an nginx container, which we use to host the React website; nginx is a web server.
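
A multi-stage Dockerfile for this could look roughly as follows; this is a minimal sketch, and the base image tags and the build output directory are assumptions rather than taken from the project:

    # Stage 1: production build of the React application (node version is an assumption)
    FROM node:14 AS build
    WORKDIR /app
    # Install dependencies first so Docker can cache this layer
    COPY package*.json ./
    RUN npm install
    # Copy the source code and create the production build
    COPY . .
    RUN npm run build

    # Stage 2: serve the static production build with nginx
    FROM nginx:alpine
    COPY --from=build /app/build /usr/share/nginx/html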

For the backend, the source files as well as the requirements.txt are copied into a Python container, and the required packages are installed with pip. Finally, the backend is run via gunicorn, an HTTP server for Python.
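
The corresponding Dockerfile could be sketched as follows, assuming the application object lives in app.py; the module path "app:app" is an assumption, while port 5001 is the backend port exposed later by start_all.sh:

    FROM python:3.8
    WORKDIR /app
    # Install the Python dependencies
    COPY requirements.txt .
    RUN pip install -r requirements.txt
    # Copy the backend source files
    COPY . .
    # Run the backend with gunicorn ("app:app" is an assumed module path)
    CMD ["gunicorn", "--bind", "0.0.0.0:5001", "app:app"]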

Nvidia supplies a Docker base image for DeepStream, which we use for the inference container.
All necessary libraries are installed in the container, and gst-python is pulled and built from the GStreamer GitHub repository. Next, the files from the inference folder are copied into the container, the requirements.txt is installed, and run_server.py is executed. run_server.py runs the DeepStream pipeline and handles the Socket.IO connection to the backend.
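
A rough sketch of the inference Dockerfile could look like this; the base image tag, the gst-python version, and the build commands are assumptions and must match the JetPack/DeepStream version on the device:

    # DeepStream base image for Jetson (tag is an assumption)
    FROM nvcr.io/nvidia/deepstream-l4t:5.0-20.07-samples
    WORKDIR /app
    # Pull and build gst-python from the GStreamer GitHub repo (version is an assumption)
    RUN git clone https://github.com/GStreamer/gst-python.git && \
        cd gst-python && git checkout 1.14.5 && \
        ./autogen.sh PYTHON=python3 && make && make install
    # Copy the inference files and install the Python dependencies
    COPY . .
    RUN pip3 install -r requirements.txt
    # run_server.py starts the DeepStream pipeline and the Socket.IO connection to the backend
    CMD ["python3", "run_server.py"]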

GitLab Pipeline

Originally, we planned to use a GitLab Runner for automated Docker builds: merge requests into the master branch would trigger the pipeline to build a new Docker container and upload it to the container registry.
While the pipeline worked well for building the frontend, pushing the images already caused trouble, and the inference container has to be built with scripts on the Jetson Nano in any case. We therefore decided to take another path.
Thus, we ended up working with build scripts throughout. Of the two alternatives, this turned out to be significantly less effort, a choice also supported by the fact that only a few people worked on this project over a short time span.

Build Scripts

As elaborated in the preceding paragraphs, both the frontend and backend Docker containers are built on PCs. Unfortunately, the KIT GitLab does not provide a container registry, so we had to extend the project with a GitHub account named "aissgroup".

In the respective frontend and backend folders, a script named deploy-arm.sh is available. These scripts build the Docker containers for the Jetson and push them to the container registry.
For testing purposes on Windows, additional scripts named deploy.sh are available; they differ only in that the Docker containers are built for Windows instead.
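
A deploy-arm.sh along these lines could use Docker's buildx to cross-build for the Jetson's arm64 architecture; this is a sketch under assumptions, as the registry path, image name, and the use of buildx are not taken from the project:

    #!/bin/bash
    set -e
    # Registry path and image name are assumptions
    IMAGE=ghcr.io/aissgroup/qsroa-frontend:latest
    # Cross-build for the Jetson's arm64 architecture and push to the registry
    docker buildx build --platform linux/arm64 -t "$IMAGE" --push .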

Unlike the frontend and the backend, the inference must be built with scripts directly on the Jetson, so the script build.sh has to be executed on the device. The script creates a temporary build folder into which all relevant files, such as GitHub projects, model weights, and labels, are copied. The first artifact built is the yolov5s.engine; subsequently, the script builds the libnvdsinfer_custom_impl_Yolo.so library. Finally, both files and labels.txt are copied into a folder named output.
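
In outline, build.sh could look like the following sketch; the repository layout, the converter invocation, and all paths are hypothetical stand-ins for the project's actual commands:

    #!/bin/bash
    set -e
    # Temporary build folder for GitHub projects, model weights, and labels
    mkdir -p build output && cd build
    cp ../weights/yolov5s.wts ../labels.txt .   # paths are assumptions
    # 1) Build the TensorRT engine from the YOLOv5 weights (converter is a hypothetical placeholder)
    ./convert_yolov5 yolov5s.wts yolov5s.engine
    # 2) Build the custom DeepStream parser library
    make -C nvdsinfer_custom_impl_Yolo
    # 3) Collect the artifacts in the output folder
    cp yolov5s.engine nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so labels.txt ../output/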

To finally deploy the whole service on the Jetson, two more scripts are needed. They are located at the top level of the project folder and are named build_all.sh and start_all.sh. Both must be executed on the Jetson.

build_all.sh deletes all existing Docker containers and images to allow a clean, fresh build and to avoid accidentally using old images. Afterwards, both the frontend and the backend containers are pulled from the registry, and the inference Docker container is built.
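
Conceptually, build_all.sh can be reduced to something like the following sketch (the registry path and image names are assumptions):

    #!/bin/bash
    set -e
    # Remove all existing containers and images for a clean build
    docker rm -f $(docker ps -aq) 2>/dev/null || true
    docker rmi -f $(docker images -q) 2>/dev/null || true
    # Pull the pre-built frontend and backend images
    docker pull ghcr.io/aissgroup/qsroa-frontend:latest
    docker pull ghcr.io/aissgroup/qsroa-backend:latest
    # Build the inference container locally on the Jetson
    docker build -t qsroa-inference ./inference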

After successfully running the build script, start_all.sh is to be executed. It stops and removes all running containers, then starts the three containers of the QSROA application and exposes the following ports on the network (see the sketch after the list):

  • 5001: backend port
  • 1234: frontend port
  • 8554: RTSP port
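
A start_all.sh along these lines might look as follows; the container and image names as well as the containers' internal ports are assumptions, while the host port mappings follow the list above:

    #!/bin/bash
    set -e
    # Stop and remove any running containers
    docker rm -f $(docker ps -aq) 2>/dev/null || true
    # Start the three QSROA containers and expose their ports
    docker run -d --name backend   -p 5001:5001 ghcr.io/aissgroup/qsroa-backend:latest
    docker run -d --name frontend  -p 1234:80   ghcr.io/aissgroup/qsroa-frontend:latest
    docker run -d --name inference --runtime nvidia -p 8554:8554 qsroa-inference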

We decided to use two separate scripts for building and running, as this allows the application to be restarted repeatedly without rebuilding.