Commit e6a1170
Remove triton optimization config, causing error for multi gpu inference (#2079)
When running Triton inference for the SID & phishing detection pipelines with multiple GPUs on `nvcr.io/nvidia/morpheus/morpheus-tritonserver-models:24.11`, inference fails with a segmentation fault. The TRT optimization block in the models' `config.pbtxt` causes `tritonserver:24.11` to fail with the following error. This PR removes that block so the pipelines run when all GPUs are selected for inference.
> 2024-12-09 23:24:38.378753895 [E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running TRTKernel_graph_torch_jit_3139280210422962738_0 node. Name:'TensorrtExecutionProvider_TRTKernel_graph_torch_jit_3139280210422962738_0_0' Status Message: TensorRT EP execution context enqueue failed.
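The scraped diff below does not preserve the removed lines; for context, the TensorRT execution-accelerator section of a Triton `config.pbtxt` for an ONNX model typically looks like the following (parameter values here are illustrative, not copied from the Morpheus repo):

```protobuf
optimization {
  execution_accelerators {
    gpu_execution_accelerator : [
      {
        name : "tensorrt"
        # Illustrative parameters; the actual Morpheus configs may differ
        parameters { key: "precision_mode" value: "FP16" }
        parameters { key: "max_workspace_size_bytes" value: "1073741824" }
      }
    ]
  }
}
```

Deleting this block leaves the model on ONNX Runtime's default CUDA execution provider, avoiding the TensorRT EP enqueue failure seen on multi-GPU setups.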
Closes #2028
## By Submitting this PR I confirm:
- I am familiar with the [Contributing Guidelines](https://github.com/nv-morpheus/Morpheus/blob/main/docs/source/developer_guide/contributing.md).
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.
Authors:
- Tad ZeMicheal (https://github.com/tzemicheal)
- David Gardner (https://github.com/dagardner-nv)
Approvers:
- https://github.com/hsin-c
- David Gardner (https://github.com/dagardner-nv)
URL: #2079
1 parent e3a4bf1 commit e6a1170
File tree
2 files changed: +0 −14 lines
- models/triton-model-repo
  - phishing-bert-onnx
  - sid-minibert-onnx
Diff: lines 31–37 removed from the `config.pbtxt` of each of `phishing-bert-onnx` and `sid-minibert-onnx` (7 lines each; the diff text itself was not captured).