
feature: support native YOLO .pt models while ensuring compatibility with Torchvision models#495

Open
kashtennyson wants to merge 3 commits into JdeRobot:master from kashtennyson:issue-449

Conversation

@kashtennyson

Description

This PR adds support for loading native Ultralytics YOLOv8 .pt models while keeping a consistent interface for the rest of the library. Fixes #449.

The Problem:
Native YOLO .pt models often return a tuple (inference_tensor, loss_tensor) rather than a raw tensor, which causes "too many values to unpack" errors in the inference and eval methods. Additionally, these models frequently use float16 (Half) precision, leading to dtype mismatches with input images or NMS kernel errors on certain backends.
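For illustration, the tuple issue can be reproduced with a stand-in forward function (`fake_yolo_forward` is hypothetical, not part of the library or of Ultralytics):

```python
import torch

def fake_yolo_forward(x):
    # Stand-in for a native YOLO .pt forward pass: returns
    # (inference_tensor, loss_tensor) instead of a bare tensor.
    return (x * 2, torch.zeros(1))

result = fake_yolo_forward(torch.ones(4))
print(type(result))  # <class 'tuple'>, not torch.Tensor

# Downstream code that treats the result as a single tensor
# must unpack it first:
inference, loss = result
```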

The Solution:
Following previous feedback, I have centralized the fix within the TorchImageDetectionModel class. I implemented a local Adapter class (DetectionModelWrapper) that standardizes the model's behavior at the source:

  • Tuple Unpacking: Automatically extracts the primary detection tensor.
  • Input Alignment: Automatically casts input images to match the model's native dtype (fixing "Float vs Half" errors).
  • Output Alignment: Ensures results are returned as float32 to maintain compatibility with torchvision.ops.nms.
  • Graceful Fallback: Wrapped the .pt loading logic to provide a clear error message suggesting the installation of ultralytics if it is missing.
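A minimal sketch of what such an adapter might look like (names and structure are illustrative; the actual `DetectionModelWrapper` in the PR may differ):

```python
import torch
import torch.nn as nn

class DetectionModelWrapper(nn.Module):
    """Illustrative adapter that normalizes a native YOLO model's behavior."""

    def __init__(self, model: nn.Module):
        super().__init__()
        self.model = model
        # Infer the model's native dtype (e.g. float16 for many .pt exports).
        self.model_dtype = next(model.parameters()).dtype

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Input alignment: cast images to the model's native dtype.
        x = x.to(self.model_dtype)
        out = self.model(x)
        # Tuple unpacking: native YOLO models may return (inference, loss).
        if isinstance(out, tuple):
            out = out[0]
        # Output alignment: torchvision.ops.nms expects float32 boxes/scores.
        return out.float()
```

Callers then interact with the wrapped model exactly as they would with a plain Torchvision model, which is what keeps the rest of the library unchanged.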

This PR supersedes #469. It implements a more stable version by ensuring compatibility with Torchvision models as well as Ultralytics YOLO models.


Architectural Question for Maintainers

I have implemented DetectionModelWrapper as a local class within the __init__ method of TorchImageDetectionModel to keep the fix strictly within the requested section and to ensure that the normalization is context-specific to the model instance.

Do you prefer this local encapsulation, or would you like me to refactor the wrapper into a private, module-level class (e.g., _ModelNormalizationWrapper) at the top of the file to keep the __init__ method more concise?

@dpascualhe dpascualhe self-requested a review March 25, 2026 19:08
@dpascualhe dpascualhe self-assigned this Mar 25, 2026
@dpascualhe
Collaborator

Hi, thanks for your contribution! I'll review the PR thoroughly when I can since this is an important upgrade.

@kashtennyson
Author

Alright @dpascualhe. Thanks for the update!

I am also currently working on a broader refactor to provide global .pt support across all tasks (Detection, Segmentation, and LiDAR) by centralizing the loading and normalization logic into a shared BaseTorchModel utility. Your guidance and feedback are therefore crucial for these architectural decisions. Looking forward to your thoughts!

@kashtennyson
Author

Hi @dpascualhe, I am still waiting on this PR's evaluation from your side since I plan to extend the support for .pt files across all perception tasks by centralizing the logic in a similar way. Review it whenever you can. Thanks!

Collaborator

@dpascualhe dpascualhe left a comment


Looks good and works for me! Can you get rid of the model_dtype extraction and subsequent casting of the input tensor during inference? I'd rather keep things as they are in that regard, so that an error is raised if they don't match for whatever reason. Upon changing that we can merge. Also, resolve conflicts with current master branch.

@kashtennyson
Author

Thanks for the review! I have made the requested changes. However, to keep the solution intact, I added an explicit .float() cast during model initialization to avoid the Float-vs-Half precision error that occurs with .pt models during inference. Let me know if you'd prefer to remove it or handle the casting differently. Thanks!
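As a sketch, the explicit cast could live in the loading path (`load_pt_model` is a hypothetical helper; the real code keeps this logic inside TorchImageDetectionModel):

```python
import torch
import torch.nn as nn

def load_pt_model(path: str) -> nn.Module:
    # Hypothetical loader: many .pt checkpoints store weights in float16,
    # so cast the whole model to float32 once at initialization instead of
    # casting inputs on every inference call.
    try:
        model = torch.load(path, map_location="cpu", weights_only=False)
    except ModuleNotFoundError as err:
        # Graceful fallback: unpickling an Ultralytics checkpoint requires
        # its classes to be importable.
        raise RuntimeError(
            "Loading this .pt file requires the 'ultralytics' package; "
            "install it with 'pip install ultralytics'."
        ) from err
    return model.float().eval()
```

Casting once at load time means no per-call dtype extraction is needed during inference, which matches the reviewer's preference for leaving the inference path untouched.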
