Vision Analysis

Methodology

Every number on this site comes from a reviewed, reproducible benchmark run. This page documents exactly how those numbers are produced.

From run to published number

Benchmarks are produced by the open-source vision-analysis-benchmark harness. Each run emits a single JSON file that records the result together with the exact LibreYOLO version and commit used, so the number can be reproduced later.

Result files are submitted by pull request to submissions/ and validated against a published schema. After review, the canonical verified dataset is rebuilt and the website renders exclusively from that snapshot. Raw submissions stay in the repository for provenance.
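
The shape of such a result file, and the kind of check a schema enforces, can be sketched roughly as follows. This is a minimal illustration only: the field names and the required-field set are hypothetical, the values are placeholders, and the real schema is the one published with the harness.

```python
import json

# Hypothetical required fields; the real schema is published with the harness.
REQUIRED_FIELDS = {"model", "metrics", "libreyolo_version", "libreyolo_commit"}


def validate_result(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    missing = sorted(REQUIRED_FIELDS - set(record))
    problems = [f"missing field: {name}" for name in missing]
    metrics = record.get("metrics", {})
    if not isinstance(metrics, dict) or "mAP@50-95" not in metrics:
        problems.append("metrics must include mAP@50-95")
    return problems


# Placeholder values for illustration only, not a real benchmark result.
example = {
    "model": "example-model",
    "metrics": {"mAP@50-95": 0.0},
    "libreyolo_version": "0.0.0",
    "libreyolo_commit": "0000000",
}

serialized = json.dumps(example, indent=2)  # roughly what a run would write to disk
problems = validate_result(json.loads(serialized))
```

Pinning the version and commit inside the record, rather than in the file name or the pull request text, keeps the provenance attached to the number wherever the file is copied.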

Accuracy

All accuracy numbers are computed with pycocotools COCOeval in bounding-box mode on coco-val2017-mini500, a frozen, reproducible 500-image subset of COCO val2017 (80 classes). The harness records all 12 standard COCO metrics: mAP@50-95, mAP@50, mAP@75, mAP for small/medium/large objects, and the AR (average recall) variants. The site headlines mAP@50-95, which averages precision over IoU thresholds from 0.50 to 0.95 and is the most widely reported of these.
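
For readers unfamiliar with pycocotools, a bounding-box evaluation looks roughly like the sketch below. The pycocotools calls are the library's standard ones; the function name, the file-path arguments, and the short metric labels are assumptions made for this example, not the harness's own code.

```python
# Order of COCOeval.stats after summarize(); the short labels are our own.
COCO_STAT_NAMES = [
    "mAP@50-95", "mAP@50", "mAP@75",
    "mAP_small", "mAP_medium", "mAP_large",
    "AR@1", "AR@10", "AR@100",
    "AR_small", "AR_medium", "AR_large",
]


def evaluate_bbox(gt_json: str, det_json: str) -> dict:
    """Score COCO-format detections against ground truth in bbox mode."""
    # Imported lazily so the label list above is usable without pycocotools.
    from pycocotools.coco import COCO
    from pycocotools.cocoeval import COCOeval

    gt = COCO(gt_json)                # ground-truth annotations file
    dt = gt.loadRes(det_json)         # detections in COCO results format
    evaluator = COCOeval(gt, dt, iouType="bbox")
    evaluator.evaluate()
    evaluator.accumulate()
    evaluator.summarize()             # fills evaluator.stats with 12 values
    return dict(zip(COCO_STAT_NAMES, (float(v) for v in evaluator.stats)))
```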

Evaluation settings follow the convention used by the original model releases: confidence threshold 0.001, NMS IoU 0.6, maximum 300 detections per image, and the model's native input size (recorded per run, typically 640).
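
To make the three thresholds concrete, here is a minimal pure-Python sketch of how they could act on a list of raw detections. It assumes greedy, class-aware NMS and inclusive threshold comparisons; the actual postprocessing is model-specific and may differ in both respects.

```python
CONF_THRESHOLD = 0.001   # very low, so almost every candidate reaches the evaluator
NMS_IOU = 0.6            # overlap above this suppresses the lower-scoring box
MAX_DETECTIONS = 300     # cap on detections kept per image


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(dets):
    """dets: list of (box, score, class_id). Returns kept detections, best first."""
    candidates = sorted(
        (d for d in dets if d[1] >= CONF_THRESHOLD),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for box, score, cls in candidates:
        # Keep a box unless it overlaps a higher-scoring kept box of the same class.
        if all(k[2] != cls or iou(box, k[0]) <= NMS_IOU for k in kept):
            kept.append((box, score, cls))
            if len(kept) == MAX_DETECTIONS:
                break
    return kept
```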

Speed

Latency is measured end to end at batch size 1 and split into preprocess, inference, and postprocess phases. Because NMS and decoding costs are included, a model with a heavy postprocessing stage will show higher end-to-end latency here than in inference-only marketing numbers.

Before timing, the harness warms the model up (10 iterations on CUDA and Apple MPS, 3 on CPU). Each measurement uses device synchronization (torch.cuda.synchronize or the MPS equivalent) around time.perf_counter, so asynchronous GPU work cannot leak across timestamps. Per-image timings are aggregated as mean, standard deviation, p50, p95, and p99; throughput (FPS) is derived from them.
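
The warm-up, synchronization, and aggregation steps can be sketched as below. This is an illustration under stated assumptions, not the harness's code: the function names are hypothetical, percentiles use the nearest-rank method, the standard deviation is the sample one, and FPS is taken as the reciprocal of mean latency.

```python
import math
import statistics
import time


def percentile(sorted_values, q):
    """Nearest-rank percentile of an ascending list, q in (0, 100]."""
    rank = max(1, math.ceil(len(sorted_values) * q / 100))
    return sorted_values[rank - 1]


def time_model(run_once, images, warmup, sync=lambda: None):
    """Per-image end-to-end latency in milliseconds at batch size 1."""
    for _ in range(warmup):             # e.g. 10 on CUDA/MPS, 3 on CPU
        run_once(images[0])
    timings_ms = []
    for image in images:
        sync()                          # drain pending device work before starting
        start = time.perf_counter()
        run_once(image)                 # preprocess + inference + postprocess
        sync()                          # wait for the device before stopping
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return timings_ms


def summarize(timings_ms):
    """Aggregate per-image timings into the reported statistics."""
    ordered = sorted(timings_ms)
    mean = statistics.fmean(ordered)
    return {
        "mean_ms": mean,
        "std_ms": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
        "p50_ms": percentile(ordered, 50),
        "p95_ms": percentile(ordered, 95),
        "p99_ms": percentile(ordered, 99),
        "fps": 1000.0 / mean,           # assumption: throughput from mean latency
    }
```

On a CUDA device the `sync` argument would be `torch.cuda.synchronize`; on CPU the default no-op is enough. Without the second call, the timer could stop while the GPU is still working, and that cost would be billed to the next image.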

Environment capture

Every submission embeds the full environment it was measured on: GPU model, driver and CUDA version, CPU model and core count, RAM, plus Python, PyTorch, ONNX Runtime, and LibreYOLO versions. Two results are only compared on the site when they share the same hardware and runtime combination.
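
A subset of that capture, together with the comparability rule, might look like the following sketch. The dictionary keys and the choice of fields that define a "hardware and runtime combination" are assumptions for illustration; driver version, RAM, and ONNX Runtime version need additional tooling and are left out here.

```python
import os
import platform


def capture_environment() -> dict:
    """Collect a subset of the environment using the standard library and PyTorch."""
    env = {
        "python": platform.python_version(),
        "cpu_model": platform.processor() or platform.machine(),
        "cpu_cores": os.cpu_count(),
        "torch": None,
        "cuda": None,
        "gpu_model": None,
    }
    try:
        import torch
    except ImportError:
        return env                      # PyTorch absent: hardware fields stay None
    env["torch"] = torch.__version__
    env["cuda"] = torch.version.cuda    # None on CPU-only builds
    if torch.cuda.is_available():
        env["gpu_model"] = torch.cuda.get_device_name(0)
    return env


def comparable(a: dict, b: dict, keys=("gpu_model", "cpu_model", "runtime")) -> bool:
    """True when two results share the (assumed) hardware and runtime fields."""
    return all(a.get(k) == b.get(k) for k in keys)
```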

Runtimes covered by the harness today: PyTorch (CPU and NVIDIA CUDA), ONNX Runtime (CPU and NVIDIA CUDA), and TensorRT (NVIDIA CUDA, FP32 and FP16). OpenVINO is not yet part of the harness.

Weights provenance and parity

Each result declares where its weights came from: original (the authors' release), converted (key-remapped into LibreYOLO format, learned parameters unchanged), or retrained. To verify that conversion does not change behavior, the Port Fidelity page continuously cross-checks measured accuracy against the numbers claimed by the original authors.
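
What "key-remapped, learned parameters unchanged" means can be shown with a small sketch. The prefix-based mapping and the parity tolerance below are hypothetical, chosen only to illustrate the idea; real conversions depend on each model's checkpoint layout, and the site's own parity criteria may differ.

```python
def remap_keys(state_dict: dict, prefix_map: dict) -> dict:
    """Rename parameter keys by prefix; values are passed through untouched."""
    remapped = {}
    for key, value in state_dict.items():
        new_key = key
        for old, new in prefix_map.items():
            if key.startswith(old):
                new_key = new + key[len(old):]
                break
        if new_key in remapped:
            raise ValueError(f"key collision after remap: {new_key}")
        remapped[new_key] = value       # same object: no parameter is modified
    return remapped


def within_parity(measured: float, claimed: float, tolerance: float = 0.5) -> bool:
    """True if measured mAP is within `tolerance` points of the claimed value.

    The 0.5-point default is an arbitrary illustration, not the site's rule.
    """
    return abs(measured - claimed) <= tolerance
```

Because only the keys change, any accuracy gap between a converted model and the authors' claim points at preprocessing, postprocessing, or evaluation settings rather than at the weights.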

Known limitations

  • Timing focuses on batch size 1 (the latency-critical deployment case); large-batch throughput is not currently measured.
  • Peak GPU memory (VRAM) and host RAM are not currently captured in submissions, so memory figures are omitted from the site.
  • YOLO-NAS is excluded from harness runs because its weights are gated by license terms that prevent redistribution.
  • Hardware coverage is community-driven: a model/hardware cell is empty until a reviewed submission for that combination lands.

Run any model with one line

LibreYOLO puts a catalogue of state-of-the-art detectors behind one MIT-licensed Python API.

from libreyolo import LibreYOLO, SAMPLE_IMAGE

# Load a detector by its weights filename.
model = LibreYOLO("LibreRFDETRl.pt")           # RF-DETR-L (transformer flagship)
results = model(SAMPLE_IMAGE, save=True)        # run inference, save the annotated image

# Swap in any other model, same one-line API (weights auto-download):
#   LibreYOLO("LibreYOLO9c.pt")      # YOLO9-C
#   LibreYOLO("LibreYOLOXx.pt")      # YOLOX-X
#   LibreYOLO("LibreDFINEx.pt")      # D-FINE-X
#   LibreYOLO("LibreRTDETRr50.pt")   # RT-DETR-R50