DEIM-L vs RT-DETRv2-R101: same-protocol benchmark

Verdict

On an RTX 5070 Ti, DEIM-L edges RT-DETRv2-R101 on accuracy, 57.8 vs 56.8 mAP@50-95, and does it with a much smaller footprint: 31.24M parameters vs 76.56M and 245 vs 456 MB peak VRAM. RT-DETRv2-R101 wins raw PyTorch speed, 25.7 vs 18.6 FPS, but that lead does not survive conversion or a move to edge hardware: on TensorRT FP32 and on Jetson Orin, DEIM-L is the faster model.

Both models are large, Apache-2.0 detectors evaluated at 640 px on COCO val2017; DEIM-L carries under half the parameters (31.24M vs 76.56M). Both have verified results on desktop GPU and on Jetson Orin, so this comparison also covers where the speed ranking flips.

Metric	DEIM-L	RT-DETRv2-R101
mAP@50-95	57.83	56.77
mAP@50	75.55	73.99
mAP small	45.15	39.20
FPS (mean)	18.6	25.7
Total ms/image	53.76	38.96
Inference ms	46.27	34.32
Peak VRAM (MB)	245	456
Params (M)	31.2	76.6
GFLOPs	91.0	259.0
Input size	640	640
License	Apache-2.0	Apache-2.0

DEIM-L vs RT-DETRv2-R101 on NVIDIA RTX 5070 Ti, PyTorch FP32, batch 1. mAP shown in percent form.

Accuracy vs parameters on COCO val2017. DEIM-L and RT-DETRv2-R101 highlighted against the full field.

Accuracy

mAP is shown in percent form. DEIM-L measures 57.8 mAP@50-95 to RT-DETRv2-R101's 56.8, a 1.0 point edge. The gap is widest on small objects: 45.2 vs 39.2 mAP_small, a 6.0 point difference or 15.18% in relative terms. DEIM-L also reaches that accuracy on far less compute (91 vs 259 GFLOPs), delivering 189.95% more mAP per GFLOP, roughly 2.9x.
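The two derived figures in this section can be reproduced with a few lines of arithmetic. The following is a minimal sketch in plain Python, assuming mAP in percent form and the rounded GFLOPs values from the table; the helper name `relative_gain_pct` is illustrative and not part of LibreYOLO.

```python
def relative_gain_pct(a: float, b: float) -> float:
    """Return how much larger a is than b, as a percentage of b."""
    return (a - b) / b * 100.0


# Table values: mAP in percent form, GFLOPs as rounded in the table.
deim = {"map": 57.83, "map_small": 45.15, "gflops": 91.0}
rtdetr = {"map": 56.77, "map_small": 39.20, "gflops": 259.0}

# Relative gap on small objects.
small_gap_pct = relative_gain_pct(deim["map_small"], rtdetr["map_small"])

# Compute efficiency: mAP points obtained per GFLOP.
deim_eff = deim["map"] / deim["gflops"]
rtdetr_eff = rtdetr["map"] / rtdetr["gflops"]
efficiency_gain_pct = relative_gain_pct(deim_eff, rtdetr_eff)

print(f"small-object relative gap: {small_gap_pct:.2f}%")
print(f"extra mAP per GFLOP:       {efficiency_gain_pct:.2f}%")
```

Because the inputs are rounded table values, the efficiency figure can differ from the quoted 189.95% by a few hundredths of a percent.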

Speed

On the RTX 5070 Ti in PyTorch, RT-DETRv2-R101 runs 25.7 FPS to DEIM-L's 18.6. That is 27.54% less time per image (38.96 vs 53.76 ms), or about 38% higher throughput. The two are roughly level on ONNX Runtime at 50.5 vs 49.3 FPS.
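The size of the PyTorch lead depends on whether it is expressed over time per image or over frames per second. A minimal sketch, assuming the total ms/image values from the table and that mean FPS is simply 1000 divided by that figure (the variable and function names are illustrative):

```python
def fps_from_ms(ms_per_image: float) -> float:
    """Convert a per-image latency in milliseconds to frames per second."""
    return 1000.0 / ms_per_image


# Total ms/image on RTX 5070 Ti, PyTorch FP32, batch 1 (from the table).
deim_ms = 53.76
rtdetr_ms = 38.96

deim_fps = fps_from_ms(deim_ms)
rtdetr_fps = fps_from_ms(rtdetr_ms)

# Same lead, two denominators: time saved per image vs frames gained per second.
latency_reduction_pct = (deim_ms - rtdetr_ms) / deim_ms * 100.0
throughput_gain_pct = (rtdetr_fps - deim_fps) / deim_fps * 100.0

print(f"DEIM-L: {deim_fps:.1f} FPS, RT-DETRv2-R101: {rtdetr_fps:.1f} FPS")
print(f"RT-DETRv2-R101 time per image: {latency_reduction_pct:.1f}% lower")
print(f"RT-DETRv2-R101 throughput:     {throughput_gain_pct:.1f}% higher")
```

The roughly 27.5% figure is the reduction in time per image; the same lead read as throughput is close to 38%.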

The PyTorch speed verdict does not survive conversion or a move to edge hardware. On TensorRT FP32 DEIM-L runs 75.7 FPS to RT-DETRv2-R101's 64.8, and on Jetson Orin it leads 4.8 to 2.8 FPS. Take your speed number from the runtime and device you will actually deploy on.
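One way to keep the ranking tied to the deployment target is to look it up per target rather than quoting a single FPS figure. A small sketch using only the FPS values quoted in this section; the dictionary keys and the `faster_model` helper are labels chosen for this example, and ONNX Runtime is left out because the two models are effectively tied there.

```python
# FPS figures quoted in this section, keyed by deployment target.
FPS_BY_TARGET = {
    "RTX 5070 Ti, PyTorch FP32": {"DEIM-L": 18.6, "RT-DETRv2-R101": 25.7},
    "TensorRT FP32": {"DEIM-L": 75.7, "RT-DETRv2-R101": 64.8},
    "Jetson Orin": {"DEIM-L": 4.8, "RT-DETRv2-R101": 2.8},
}


def faster_model(target: str) -> str:
    """Return the name of the model with the higher FPS on the given target."""
    scores = FPS_BY_TARGET[target]
    return max(scores, key=scores.get)


for target in FPS_BY_TARGET:
    print(f"{target}: {faster_model(target)}")
```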

License and provenance

DEIM-L license: Apache-2.0
RT-DETRv2-R101 license: Apache-2.0
DEIM-L release: 2024-12-05
RT-DETRv2-R101 release: 2024-05-01
Evaluated weights: LibreYOLO retrained checkpoints

When to pick which

Pick DEIM-L in almost every case here: it is more accurate, uses far less memory, is more compute-efficient, and is faster once you convert to TensorRT or move to Jetson Orin. Pick RT-DETRv2-R101 only for raw PyTorch serving on desktop GPU, where its throughput lead holds. Both are Apache-2.0, so licensing does not force the choice.

Try both with LibreYOLO

# pip install libreyolo
from libreyolo import LibreYOLO

deim   = LibreYOLO("LibreDEIMl.pt")          # 31.24M params, 57.8 mAP@50-95
rtdetr = LibreYOLO("LibreRTDETRv2r101.pt")   # 76.56M params, 56.8 mAP@50-95

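# Run both models on the same image to compare their detections.
deim_results   = deim("image.jpg")
rtdetr_results = rtdetr("image.jpg")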

Every number on this page comes from the verified dataset: same 500-image COCO val2017 slice, conf 0.001, IoU 0.6, max 300 detections, pycocotools mAP, identical protocol across all hardware and runtimes. The full protocol is on the methodology page. To rerun this comparison with your own filters, open compare. Accuracy is measured on LibreYOLO retrained checkpoints; other weight sources can yield different values.
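The protocol settings above can be written down as a small configuration object. The sketch below is hypothetical and is not LibreYOLO's evaluation code: `EvalProtocol` and `select_detections` are names invented for this example, detections are assumed to be `(score, box)` tuples for a single image, and the inclusive confidence comparison is an assumption. The IoU value is recorded but not applied here, since the page does not describe how it enters the pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalProtocol:
    """Settings listed on this page (field names are illustrative)."""
    num_images: int = 500          # size of the COCO val2017 slice
    conf_threshold: float = 0.001  # minimum confidence kept for scoring
    iou_threshold: float = 0.6     # recorded only; not applied in this sketch
    max_detections: int = 300      # per-image cap on detections


def select_detections(detections, protocol=EvalProtocol()):
    """Keep detections at or above the confidence threshold, best first, capped per image."""
    kept = [d for d in detections if d[0] >= protocol.conf_threshold]
    kept.sort(key=lambda d: d[0], reverse=True)
    return kept[: protocol.max_detections]


# Synthetic detections for one image: 400 boxes with scores 0.000 to 0.399.
fake_detections = [(i / 1000.0, (0, 0, 10, 10)) for i in range(400)]
selected = select_detections(fake_detections)
print(len(selected), "detections kept; top score", selected[0][0])
```

Holding these values fixed for both models is what makes the numbers comparable; changing any of them, for example raising the confidence threshold, would generally change the measured mAP.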
