feature: support native YOLO .pt models while ensuring compatibility with Torchvision models #495
kashtennyson wants to merge 3 commits into JdeRobot:master
Hi, thanks for your contribution! I'll review the PR thoroughly when I can since this is an important upgrade.
Alright @dpascualhe. Thanks for the update! I am also currently working on a broader refactor to provide global
Hi @dpascualhe, I am still waiting on this PR's evaluation from your side since I plan to extend the support for
Looks good and works for me! Can you get rid of the model_dtype extraction and subsequent casting of the input tensor during inference? I'd rather keep things as they are in that regard, so that an error is raised if they don't match for whatever reason. Upon changing that we can merge. Also, resolve conflicts with current master branch.
Thanks for the review! I have performed the requested changes. However, to keep the solution intact I have added an explicit
Description
This PR adds support for loading native Ultralytics YOLOv8 `.pt` models while ensuring a consistent interface for the rest of the library. This is a fix for #449.

The Problem:
Native YOLO `.pt` models often return a tuple `(inference_tensor, loss_tensor)` rather than a raw tensor, which causes "too many values to unpack" errors in the `inference` and `eval` methods. Additionally, these models frequently use `float16` (Half) precision, leading to `DType` mismatches with input images or NMS kernel errors on certain backends.

The Solution:
Following previous feedback, I have centralized the fix within the `TorchImageDetectionModel` class. I implemented a local Adapter class (`DetectionModelWrapper`) that standardizes the model's behavior at the source:

- […] `dtype` (fixing "Float vs Half" errors).
- […] `float32` to maintain compatibility with `torchvision.ops.nms`.
- […] `.pt` loading logic to provide a clear error message suggesting the installation of `ultralytics` if it is missing.

This PR supersedes #469. It implements a more stable version by ensuring compatibility with `Torchvision` models along with the `Ultralytics` YOLO models.

Architectural Question for Maintainers
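For reference, the adapter approach outlined in the Description (unwrapping tuple outputs, casting predictions to `float32` before NMS, and guarding the `ultralytics` import) might look roughly like the sketch below. This is not the code from this PR: only the class name `DetectionModelWrapper` comes from the description, while `require_ultralytics`, the constructor signature, and the error wording are hypothetical. The sketch is deliberately framework-agnostic and assumes the wrapped model returns tensor-like objects with a `.float()` method (as `torch.Tensor` does), so it would apply only to YOLO-style models and not to Torchvision detectors.

```python
import importlib


def require_ultralytics():
    """Hypothetical helper: import ultralytics lazily with an actionable error."""
    try:
        return importlib.import_module("ultralytics")
    except ImportError as exc:
        raise ImportError(
            "Loading native YOLO .pt models requires the 'ultralytics' package. "
            "Install it with: pip install ultralytics"
        ) from exc


class DetectionModelWrapper:
    """Illustrative adapter: always return a single float32 prediction tensor."""

    def __init__(self, model):
        # Assumption: `model` is any callable, e.g. a torch.nn.Module.
        self.model = model

    def __call__(self, images):
        out = self.model(images)
        # Native YOLO models may return (inference_tensor, loss_tensor);
        # keep only the inference tensor.
        if isinstance(out, tuple):
            out = out[0]
        # On a torch.Tensor, .float() casts to float32, which keeps
        # half-precision outputs compatible with torchvision.ops.nms.
        return out.float()
```

In a PyTorch codebase the wrapper would more likely subclass `torch.nn.Module` and implement `forward` instead of `__call__`. Note that the sketch casts only the model output and leaves the input tensor untouched, in line with the review request to let input dtype mismatches raise an error.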