🚀[FEA]: Add TensorRT compilation utility and hybrid Warp example #1565
manmeet3591 wants to merge 2 commits into NVIDIA:main
Conversation
This commit introduces a new inference utility module that provides support for compiling PhysicsNeMo models to TensorRT using Torch-TensorRT. It also adds a minimal example demonstrating a hybrid inference pipeline that combines NVIDIA Warp for geometric processing (neighbor search) with TensorRT for accelerated neural network execution.

Signed-off-by: Manmeet Singh <manmeet20singh11@gmail.com>
```python
# limitations under the License.

import logging
from typing import Any, List, Optional, Union
```
Missing `Set` import causes NameError at import time
`Set` is used in the type annotation on line 34 (`Optional[Set[torch.dtype]]`) but is not imported from `typing`. Because Python evaluates function annotations eagerly at definition time (without `from __future__ import annotations`), importing this module will immediately raise `NameError: name 'Set' is not defined`. `Union` is also imported but never used.
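A minimal reproduction of the failure mode described above (names here are illustrative, not the PR's actual code):

```python
from typing import Optional

# Without `from __future__ import annotations`, CPython evaluates annotations
# when the `def` statement runs (eagerly on Python <= 3.13; lazily on first
# access from 3.14 / PEP 649). Either way, the missing `Set` import surfaces
# as a runtime NameError rather than a type-checker warning.
try:
    def compile_stub(precisions: Optional[Set[int]] = None):  # `Set` not imported
        pass

    dict(compile_stub.__annotations__)  # forces evaluation under lazy annotations
    failed_at_import = False
except NameError as exc:
    failed_at_import = True
    print(exc)  # name 'Set' is not defined
```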
Suggested change:
```diff
- from typing import Any, List, Optional, Union
+ from typing import Any, List, Optional, Set
```
```python
queries = torch.randn(1000, 3, device=device)
radius = 0.1

# 3. Geometric Processing with Warp (Neighbor Search)
```
radius_search_warp does not accept a device keyword argument
The function signature in `physicsnemo/models/figconvnet/warp_neighbor_search.py` is `radius_search_warp(points, queries, radius, grid_dim=...)`; it derives the device directly from the input tensors and has no `device` parameter. Passing `device=device.type` will raise `TypeError: radius_search_warp() got an unexpected keyword argument 'device'` at runtime.
Suggested change:
```python
queries = torch.randn(1000, 3, device=device)
radius = 0.1

# 3. Geometric Processing with Warp (Neighbor Search)
neighbor_index, neighbor_dist, neighbor_offset = radius_search_warp(
    points, queries, radius
)
```
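The design described above, where the device is derived from the input tensors rather than passed as a kwarg, can be sketched with a hypothetical torch-only stand-in (this is not the actual Warp implementation):

```python
import torch

def radius_search_sketch(points: torch.Tensor, queries: torch.Tensor, radius: float):
    """Hypothetical stand-in: the device is read from the inputs, so no
    `device` kwarg exists, mirroring radius_search_warp's signature."""
    device = points.device                         # derived, not passed in
    dists = torch.cdist(queries, points)           # (n_queries, n_points)
    query_idx, point_idx = (dists <= radius).nonzero(as_tuple=True)
    return point_idx, dists[query_idx, point_idx], device

points = torch.randn(200, 3)
queries = torch.randn(50, 3)
neighbor_index, neighbor_dist, dev = radius_search_sketch(points, queries, 0.5)
```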
peterdsharpe left a comment:
Hi @manmeet3591,
Thank you for opening this PR!
While this contribution is interesting, I'm not sure that it belongs in PhysicsNeMo-Core:
a) it adds TensorRT as a new optional dependency, used only in this example, and we'd like to keep PhysicsNeMo's dependency list minimal;
b) the main compilation logic in inference.py is essentially a thin wrapper around torch_tensorrt.compile(), which end-users could instead call directly in their downstream training utilities.
For example, by analogy: we don't include any PhysicsNeMo-specific wrappers for torch.compile support, yet many downstream users torch.compile() their PhysicsNeMo models in their own scripts.
Is there an angle of the value proposition here that I'm not fully understanding?
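To illustrate the analogy: a downstream user can already wrap a model with torch.compile() (or torch_tensorrt.compile()) in their own script, with no framework-level helper. A minimal sketch, using a plain nn.Linear and the lightweight "eager" backend as stand-ins for a real PhysicsNeMo model and backend:

```python
import torch

model = torch.nn.Linear(3, 3)

# Users call torch.compile directly in their own scripts; the "eager"
# backend is chosen here only so this sketch runs without a full
# compiler/TensorRT stack installed.
compiled = torch.compile(model, backend="eager")

out = compiled(torch.randn(4, 3))
print(out.shape)  # torch.Size([4, 3])
```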
Suggested change:
```diff
  adjacency relationships, which is used internally for all computations. (See the
  dedicated
- [`physicsnemo.mesh.neighbors._adjacency.py`](physicsnemo/mesh/neighbors/_adjacency.py)
+ [`physicsnemo.mesh.neighbors._adjacency.py`](./neighbors/_adjacency.py)
```
Please remove these unrelated changes, or bring them in with a separate PR.
Suggested change:
```diff
  wp_points: wp.array(dtype=wp.vec3),
  wp_queries: wp.array(dtype=wp.vec3),
- wp_launch_device: wp.context.Device | None,
+ wp_launch_device: wp.Device | None,
```
This is a good idea, and fixes a deprecation warning, but is unrelated to the main PR here - please bring these changes in a separate PR.
```python
def compile_to_trt(
    model: torch.nn.Module,
    input_signature: List[torch.Tensor],
    enabled_precisions: Optional[Set[torch.dtype]] = None,
```
Suggested change:
```diff
- enabled_precisions: Optional[Set[torch.dtype]] = None,
+ enabled_precisions: Set[torch.dtype] | None = None,
```
Modernizes type-hint syntax
```python
def compile_to_trt(
    model: torch.nn.Module,
    input_signature: List[torch.Tensor],
```
Suggested change:
```diff
- input_signature: List[torch.Tensor],
+ input_signature: list[torch.Tensor],
```
```python
    trt_model = torch_tensorrt.compile(model, **compile_spec)
    logger.info("TensorRT compilation successful.")
    return trt_model
except Exception as e:
```
I'd recommend allowing this to fail rather than re-raising here. If we do choose to re-raise, perhaps we can narrow the scope to something tighter than a bare `Exception`?
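The two options above can be sketched with a stubbed compile call (`fake_compile` and the chosen error types are illustrative, not taken from torch_tensorrt):

```python
import logging

logger = logging.getLogger(__name__)

def fake_compile():
    # Stand-in for torch_tensorrt.compile; this sketch assumes real failures
    # surface as RuntimeError/ValueError rather than arbitrary exceptions.
    raise RuntimeError("unsupported op")

def compile_narrow():
    # Option 2: if we do catch-and-re-raise, narrow the except clause so
    # unrelated bugs (a TypeError from a bad spec, a NameError from a typo)
    # are not routed through the log-and-re-raise path.
    try:
        return fake_compile()
    except (RuntimeError, ValueError):
        logger.exception("TensorRT compilation failed")
        raise

propagated = False
try:
    compile_narrow()  # Option 1 is simply calling fake_compile() with no try at all
except RuntimeError:
    propagated = True
```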
```python
import torch

try:
    import torch_tensorrt
```
violates repo-wide optional import conventions; will trip up importlinter
Hi @manmeet3591, I want to echo @peterdsharpe: it's an interesting direction, and I think it's worth exploring if we can see the benefit. I am certainly grateful you've started this conversation.
Greptile Summary

This PR adds a new inference optimization capability to PhysicsNeMo and a corresponding example for hybrid Warp + TensorRT execution.

New Features

- `physicsnemo/utils/inference.py`, containing a `compile_to_trt` function. This utility wraps `torch_tensorrt` to simplify the process of optimizing PhysicsNeMo models for high-performance inference.
- `examples/minimal/inference/torch_trt_warp_inference.py`. This example demonstrates how to integrate NVIDIA Warp (for geometric tasks like neighbor search) with a TensorRT-optimized neural network in a single, zero-copy pipeline.

Why this is useful

Many Physics-AI models require complex geometric preprocessing (best handled by Warp) and high-speed neural network inference (best handled by TensorRT). This PR provides the necessary utilities and patterns to build such hybrid pipelines efficiently.
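The hybrid shape described above can be sketched end-to-end with stand-ins (plain torch ops replace both the Warp neighbor search and the TensorRT-compiled network; all names here are illustrative):

```python
import torch

def neighbor_search_stub(points, queries, radius):
    # Stand-in for the Warp radius-search step (geometry preprocessing).
    dists = torch.cdist(queries, points)
    return (dists <= radius).nonzero(as_tuple=True)  # (query_idx, point_idx)

# In the real pipeline this network would be TensorRT-compiled; because every
# stage runs on one device, tensors flow between them without host copies.
net = torch.nn.Sequential(
    torch.nn.Linear(3, 16), torch.nn.ReLU(), torch.nn.Linear(16, 1)
)

points = torch.randn(1000, 3)
queries = torch.randn(100, 3)
q_idx, p_idx = neighbor_search_stub(points, queries, 0.1)
preds = net(points[p_idx])  # inference on the gathered neighbor features
```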