# Performance Testing
The performance tools repository is included in this project as a Git submodule. The performance tools let you test pipeline system performance on various hardware.
## Benchmark Quick Start command

```sh
make update-submodules
```

`update-submodules` ensures all submodules are initialized, updated to their latest remote versions, and ready for use.

```sh
make benchmark-quickstart
```

This command will:

- Run headless (no display needed: `RENDER_MODE=0`)
- Use the full pipeline (`PIPELINE_SCRIPT=obj_detection_age_prediction.sh`)
- Target GPU by default (`DEVICE_ENV=res/all-gpu.env`)
- Generate benchmark metrics
- Run `make consolidate-metrics` automatically

## Understanding Benchmarking Types
Before running benchmark commands, make sure you have already configured Python and its dependencies. Visit the performance tools installation guide HERE.
### Default benchmark command

```sh
make update-submodules
```

`update-submodules` ensures all submodules are initialized, updated to their latest remote versions, and ready for use.

```sh
make benchmark
```

By default, this command runs with:

- `RENDER_MODE=0`
- `PIPELINE_SCRIPT=yolo11n.sh`
- `DEVICE_ENV=res/all-cpu.env`
- `PIPELINE_COUNT=1`
You can override these values through environment variables.

List of environment variables:
| Variable | Description | Values |
|---|---|---|
| `BATCH_SIZE_DETECT` | number of frames batched together for a single inference, used by the gvadetect element's batch-size property | 0-N |
| `BATCH_SIZE_CLASSIFY` | number of frames batched together for a single inference, used by the gvaclassify element's batch-size property | 0-N |
| `RENDER_MODE` | displays the pipeline and overlays CV metadata | 1, 0 |
| `PIPELINE_COUNT` | number of Automated Self Checkout Docker container instances to launch | Ex: 1 |
| `PIPELINE_SCRIPT` | pipeline script to run | yolo11n_effnetb0.sh, obj_detection_age_prediction.sh, etc. |
| `DEVICE_ENV` | device to use for classification and detection | res/all-cpu.env, res/all-gpu.env, res/det-gpu_class-npu.env, etc. |
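As a generic illustration of how default-plus-override semantics typically work (the repository's own scripts may implement defaults differently), each variable falls back to a default unless set in the environment or on the `make` command line:

```sh
# Illustrative sketch only: each variable uses its documented default
# unless an override is supplied (e.g. `make PIPELINE_COUNT=2 benchmark`).
PIPELINE_SCRIPT="${PIPELINE_SCRIPT:-yolo11n.sh}"
PIPELINE_COUNT="${PIPELINE_COUNT:-1}"
DEVICE_ENV="${DEVICE_ENV:-res/all-cpu.env}"
echo "script=${PIPELINE_SCRIPT} count=${PIPELINE_COUNT} device=${DEVICE_ENV}"
```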
> **Note:** The higher the `PIPELINE_COUNT`, the higher the stress on the system. Increasing this value runs more pipelines in parallel, increasing resource usage and the load placed on the system.
> **Note:** The first run of this command may take a few minutes, as it builds all the performance tools containers.
After running the benchmark commands, you will find the results in the `performance-tools/benchmark-scripts/results/` folder.
Benchmark 2 pipelines in parallel:

```sh
make PIPELINE_COUNT=2 benchmark
```
Benchmark command with environment variable overrides:

```sh
make PIPELINE_SCRIPT=yolo11n_effnetb0.sh DEVICE_ENV=res/all-gpu.env PIPELINE_COUNT=1 benchmark
```
Benchmark command for the full pipeline (age prediction + object classification) using GPU:

```sh
make PIPELINE_SCRIPT=obj_detection_age_prediction.sh DEVICE_ENV=res/all-gpu.env PIPELINE_COUNT=1 benchmark
```

> **Note:** `obj_detection_age_prediction.sh` runs TWO video streams in parallel even with `PIPELINE_COUNT=1`:
>
> - Stream 1: Object detection + classification on the retail video
> - Stream 2: Face detection + age/gender prediction on the age prediction video
## Create a consolidated metrics file

After running the benchmark command, run this command to see the benchmarking results:

```sh
make consolidate-metrics
```

`metrics.csv` provides a summary of system and pipeline performance, including FPS, latency, CPU/GPU utilization, memory usage, and power consumption for each benchmark run. It helps evaluate hardware efficiency and resource usage during automated self-checkout pipeline tests.
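For a quick look at a consolidated metrics file from the command line, a simple `awk` one-liner can average a column. The column layout below is an assumption for illustration only; check your generated `metrics.csv` for the actual headers:

```sh
# Create a small sample file with an assumed layout (pipeline, fps, cpu_util);
# the real metrics.csv produced by consolidate-metrics may use different columns.
cat > /tmp/metrics_sample.csv <<'EOF'
pipeline,fps,cpu_util
0,15.2,41.0
1,14.8,43.5
EOF

# Average the FPS column (field 2) across all pipelines, skipping the header.
awk -F, 'NR > 1 { sum += $2; n++ } END { printf "avg fps: %.2f\n", sum / n }' /tmp/metrics_sample.csv
```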
## Benchmark Stream Density

To test the maximum number of Automated Self Checkout containers/pipelines that can run on a given system, use the `TARGET_FPS` environment variable. By default, the tool finds the container threshold at or above 14.95 FPS with the `yolo11n.sh` pipeline. You can override these values through environment variables.

List of environment variables:
| Variable | Description | Values |
|---|---|---|
| `TARGET_FPS` | threshold FPS value for a stream to be considered valid | Ex: 14.95 |
| `OOM_PROTECTION` | flag to enable/disable OOM checks before scaling the pipeline (enabled by default) | 1, 0 |
> **Note:** An OOM crash occurs when a system or application tries to use more memory (RAM) than is available, causing the operating system to forcibly terminate processes to free up memory. If `OOM_PROTECTION` is set to 0, the system may crash or become unresponsive, requiring a hard reboot.
```sh
make benchmark-stream-density
```
You can check the output results for performance metrics in the `results` folder at the root level. The stream density script also prints the results to the console:

```text
Total averaged FPS per stream: 15.210442307692306 for 26 pipeline(s)
```
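The stream-density idea can be sketched as a simple loop: keep adding pipelines while the averaged per-stream FPS stays at or above `TARGET_FPS`. This is a simplified illustration, not the actual algorithm in `benchmark.py`; the FPS readings below are fabricated stand-ins for real measured runs:

```sh
# Simplified sketch of the stream-density search. Each value in the list
# stands in for the averaged per-stream FPS measured after adding a pipeline.
TARGET_FPS=14.95
streams=0
for measured_fps in 16.1 15.6 15.2 14.3; do
    # awk handles the floating-point comparison; prints 1 if fps >= target
    ok=$(awk -v f="$measured_fps" -v t="$TARGET_FPS" 'BEGIN { print (f >= t) }')
    [ "$ok" -eq 1 ] || break
    streams=$((streams + 1))
done
echo "max streams sustaining >= ${TARGET_FPS} FPS: ${streams}"
```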
Change the `TARGET_FPS` value:

```sh
make TARGET_FPS=13.5 benchmark-stream-density
```
Environment variable overrides can also be added to the command:

```sh
make PIPELINE_SCRIPT=yolo11n_effnetb0.sh TARGET_FPS=13.5 benchmark-stream-density
```
Alternatively, you can call `benchmark.py` directly, which gives you access to all of the performance tools' parameters. More details about the performance tools can be found HERE.

```sh
cd performance-tools/benchmark-scripts && python benchmark.py --compose_file ../../src/docker-compose.yml --target_fps 14
```
## Plot Utilization Graphs

After running a benchmark, you can generate a consolidated CPU, NPU, and GPU usage graph based on the collected logs using:

```sh
make plot-metrics
```

This generates a graph image (`plot_metrics.png`) under the `benchmark` directory, showing:

- 🧠 CPU Usage Over Time
- ⚙️ NPU Utilization Over Time
- 🎮 GPU Usage Over Time for each device found