On a Raspberry Pi 5 with the Hailo-8 accelerator, YOLOv9-C leads accuracy at 54.8 mAP@50-95 and 21.2 FPS. The fastest measured model is YOLOX-Tiny at 195.6 FPS and 33.5 mAP. Every row here runs quantized to INT8 on the NPU, so these accuracy numbers are the post-quantization result, not the FP32 figures you would see on a GPU. mAP is shown in percent form.
The Hailo-8 is an INT8 NPU: weights and activations are quantized to 8 bits before they run on the accelerator. Read the accuracy table as the ceiling for this device, then drop to the speed table for models fast enough to stream. The NPU makes even the accuracy leader usable at 21.2 FPS, which a bare Pi 5 CPU cannot do.
| # | Model | mAP@50-95 (%) | FPS | ms/image | Params (M) | License |
|---|---|---|---|---|---|---|
| 1 | YOLOv9-C | 54.77 | 21.2 | 47.08 | 25.5 | MIT |
| 2 | YOLOv9-M | 53.30 | 30.1 | 33.23 | 20.1 | MIT |
| 3 | YOLO-NAS-L | 52.42 | 15.4 | 64.89 | 67.0 | non-permissive |
| 4 | YOLO-NAS-M | 47.35 | 21.7 | 46.05 | 51.1 | non-permissive |
| 5 | YOLO-NAS-S | 44.46 | 39.7 | 25.20 | 19.0 | non-permissive |
| 6 | YOLOX-M | 43.68 | 34.8 | 28.76 | 25.3 | Apache-2.0 |
| 7 | YOLOX-S | 41.09 | 95.8 | 10.44 | 9.0 | Apache-2.0 |
| 8 | YOLOX-Tiny | 33.53 | 195.6 | 5.11 | 5.1 | Apache-2.0 |
| 9 | YOLOv9-S | 32.03 | 36.3 | 27.52 | 7.2 | MIT |
| 10 | YOLOv9-T | 25.75 | 76.5 | 13.08 | 2.0 | MIT |
The accuracy-speed frontier
Five models hold the measured frontier: YOLOX-Tiny, YOLOX-S, YOLO-NAS-S, YOLOv9-M, and YOLOv9-C. The frontier runs from YOLOX-Tiny at 33.5 mAP and 195.6 FPS up to YOLOv9-C at 54.8 mAP and 21.2 FPS. YOLOv9-M sits close behind the leader at 53.3 mAP while running faster at 30.1 FPS. Every model off this list is beaten on both accuracy and speed by at least one model on it; for example, YOLOX-M (43.7 mAP, 34.8 FPS) loses on both counts to YOLO-NAS-S (44.5 mAP, 39.7 FPS).
| # | Model | mAP@50-95 (%) | FPS | ms/image | Params (M) | License |
|---|---|---|---|---|---|---|
| 1 | YOLOX-Tiny | 33.53 | 195.6 | 5.11 | 5.1 | Apache-2.0 |
| 2 | YOLOX-S | 41.09 | 95.8 | 10.44 | 9.0 | Apache-2.0 |
| 3 | YOLOv9-T | 25.75 | 76.5 | 13.08 | 2.0 | MIT |
| 4 | YOLO-NAS-S | 44.46 | 39.7 | 25.20 | 19.0 | non-permissive |
| 5 | YOLOv9-S | 32.03 | 36.3 | 27.52 | 7.2 | MIT |
| 6 | YOLOX-M | 43.68 | 34.8 | 28.76 | 25.3 | Apache-2.0 |
| 7 | YOLOv9-M | 53.30 | 30.1 | 33.23 | 20.1 | MIT |
| 8 | YOLO-NAS-M | 47.35 | 21.7 | 46.05 | 51.1 | non-permissive |
| 9 | YOLOv9-C | 54.77 | 21.2 | 47.08 | 25.5 | MIT |
| 10 | YOLO-NAS-L | 52.42 | 15.4 | 64.89 | 67.0 | non-permissive |
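The five-model frontier named above can be rechecked from the table with a short script. This is a minimal sketch, not the site's own tooling: the `Model` tuple and `pareto_frontier` function are hypothetical names, and mAP is entered in percent so that YOLOv9-C reads 54.77, matching the 54.8 quoted in the text.

```python
from typing import List, NamedTuple


class Model(NamedTuple):
    name: str
    map_pct: float  # mAP@50-95 in percent (assumed scale, e.g. 54.77)
    fps: float      # frames per second on the Hailo-8


# Rows copied from the table above.
MODELS = [
    Model("YOLOv9-C", 54.77, 21.2),
    Model("YOLOv9-M", 53.30, 30.1),
    Model("YOLO-NAS-L", 52.42, 15.4),
    Model("YOLO-NAS-M", 47.35, 21.7),
    Model("YOLO-NAS-S", 44.46, 39.7),
    Model("YOLOX-M", 43.68, 34.8),
    Model("YOLOX-S", 41.09, 95.8),
    Model("YOLOX-Tiny", 33.53, 195.6),
    Model("YOLOv9-S", 32.03, 36.3),
    Model("YOLOv9-T", 25.75, 76.5),
]


def pareto_frontier(models: List[Model]) -> List[Model]:
    """Return the models not beaten on both mAP and FPS, fastest first."""
    frontier: List[Model] = []
    best_map = float("-inf")
    # Walk from fastest to slowest. A model belongs on the frontier only
    # if it is more accurate than every model that is faster than it.
    for m in sorted(models, key=lambda m: (-m.fps, -m.map_pct)):
        if m.map_pct > best_map:
            frontier.append(m)
            best_map = m.map_pct
    return frontier


if __name__ == "__main__":
    for m in pareto_frontier(MODELS):
        print(f"{m.name}: {m.map_pct:.2f} mAP, {m.fps:.1f} FPS")
```

With the rows above, the script lists YOLOX-Tiny, YOLOX-S, YOLO-NAS-S, YOLOv9-M, and YOLOv9-C, in that order.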
Picks by latency budget
Under 50 ms per image, YOLOv9-C is the most accurate option at 54.8 mAP, finishing at 47.08 ms. Because that model already clears the tightest budget here, loosening to 100 or 500 ms buys nothing more accurate: YOLOv9-C stays the pick across all three. If you need headroom for other work on the NPU, YOLOv9-M gives up little at 53.3 mAP and 30.1 FPS, and YOLOX-S trades down to 41.1 mAP for 95.8 FPS. All figures are INT8 on the Hailo-8.
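The budget picks above reduce to one rule: keep the models whose latency is under the budget, then take the most accurate. A minimal sketch of that rule follows; `LATENCY_ROWS` and `pick_for_budget` are hypothetical names, the rows are copied from the tables on this page, and mAP is entered in percent (54.77 for YOLOv9-C, matching the 54.8 quoted in the text).

```python
from typing import List, Optional, Tuple

# (model, mAP@50-95 in percent, ms per image), copied from the tables above.
LATENCY_ROWS: List[Tuple[str, float, float]] = [
    ("YOLOv9-C", 54.77, 47.08),
    ("YOLOv9-M", 53.30, 33.23),
    ("YOLO-NAS-L", 52.42, 64.89),
    ("YOLO-NAS-M", 47.35, 46.05),
    ("YOLO-NAS-S", 44.46, 25.20),
    ("YOLOX-M", 43.68, 28.76),
    ("YOLOX-S", 41.09, 10.44),
    ("YOLOX-Tiny", 33.53, 5.11),
    ("YOLOv9-S", 32.03, 27.52),
    ("YOLOv9-T", 25.75, 13.08),
]


def pick_for_budget(budget_ms: float) -> Optional[Tuple[str, float, float]]:
    """Return the most accurate row under budget_ms, or None if none fits."""
    eligible = [row for row in LATENCY_ROWS if row[2] < budget_ms]
    if not eligible:
        return None
    # Highest mAP wins among the models that fit the budget.
    return max(eligible, key=lambda row: row[1])


if __name__ == "__main__":
    for budget in (50, 100, 500):
        print(budget, pick_for_budget(budget))
```

Budgets tighter than those listed here change the answer: under 12 ms, for instance, only YOLOX-Tiny and YOLOX-S qualify, and the rule selects YOLOX-S.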
Every number on this page comes from the verified dataset: same 500-image COCO val2017 slice, conf 0.001, IoU 0.6, max 300 detections, pycocotools mAP, identical protocol across all hardware and runtimes. The full protocol is on the methodology page. To rerun this comparison with your own filters, open the compare page. Accuracy is measured on LibreYOLO retrained checkpoints; other weight sources can yield different values.
