ONNX export of MaskRCNN: inference fails when batch size > 1 and no detections · pytorch/vision#2309

Labels: bug, help wanted, module: onnx, topic: object detection

Description

🐛 Bug

When running inference with a MaskRCNN model exported to ONNX and a batch size greater than 1, an exception is thrown for images with no detections.

onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running SplitToSequence node. Name:'SplitToSequence_4427' Status Message: split_size_sum (75) != split_dim_size (72)

To Reproduce

Steps to reproduce the behavior. Note: input_tensor has shape (4, 6, 1024, 1024), so the batch size is 4.

Load and export a pretrained MaskRCNN model:

import torch
import torchvision

# num_classnames, image_mean, image_std, input_tensor and onnx_model_filepath
# are defined elsewhere in the reporter's script.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(
    pretrained=False,
    min_size=1024, max_size=1024,
    pretrained_backbone=False,
    num_classes=num_classnames + 1,  # + background class
    image_mean=image_mean,
    image_std=image_std,
)
torch.onnx.export(
    model,
    input_tensor.float(),
    onnx_model_filepath,
    export_params=True,
    opset_version=12,
    do_constant_folding=False,
    input_names=["images_tensors"],
    output_names=["boxes", "labels", "scores", "masks"],
    # mark every listed axis of the input and outputs as dynamic
    dynamic_axes={"images_tensors": [0, 1, 2, 3], "boxes": [0, 1], "labels": [0],
                  "scores": [0], "masks": [0, 1, 2, 3]},
)

Infer on an image that has detections (image's values are in range 0.0-1.0):

import onnxruntime

input_array = input_tensor.cpu().numpy()
ort_session = onnxruntime.InferenceSession(onnx_model_filepath)
ort_inputs = {"images_tensors": input_array}
ort_outs = ort_session.run(None, ort_inputs)  # None requests all model outputs

Works correctly.

Infer on an image that will have no detections, e.g. a random one or a black one (making sure that the image's values are still in range 0.0-1.0):

# Note: torch.randn samples a standard normal, so its values are not confined
# to [0.0, 1.0]; torch.rand would be.
random_tensor = torch.randn(input_tensor.shape)
# also tried:
# random_tensor = torch.zeros(input_tensor.shape)
random_array = random_tensor.cpu().numpy()
ort_session = onnxruntime.InferenceSession(onnx_model_filepath)
ort_inputs = {"images_tensors": random_array}
ort_outs = ort_session.run(None, ort_inputs)

This throws the exception:

[E:onnxruntime:, sequential_executor.cc:281 Execute] Non-zero status code returned while running SplitToSequence node. Name:'SplitToSequence_4427' Status Message: split_size_sum (61) != split_dim_size (15)
Traceback (most recent call last):
  File "/.../maskrcnn_deployment.py", line 1087, in <module>
    main()
  File "/.../maskrcnn_deployment.py", line 948, in main
    onnx_prediction = infer_onnx_model_on_single_image(
  File "/.../maskrcnn_deployment.py", line 527, in infer_onnx_model_on_single_image
    ort_outs = ort_session.run(None, ort_inputs)
  File "/.../python3.8/site-packages/onnxruntime/capi/session.py", line 111, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running SplitToSequence node. Name:'SplitToSequence_4427' Status Message: split_size_sum (61) != split_dim_size (15)
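
The status message compares two numbers: the sum of the requested split sizes and the actual length of the dimension being split. A minimal pure-Python sketch of that check follows. It is not onnxruntime's implementation; `split_to_sequence` is a hypothetical helper, and the per-image split sizes are invented because the report gives only their sum.

```python
from typing import List, Sequence


def split_to_sequence(values: Sequence, split_sizes: Sequence[int]) -> List[list]:
    """Split `values` into consecutive chunks, one per entry of `split_sizes`.

    Mimics the length check from the error message: the chunk sizes must add
    up to the length of the dimension being split.
    """
    split_size_sum = sum(split_sizes)
    split_dim_size = len(values)
    if split_size_sum != split_dim_size:
        raise ValueError(
            f"split_size_sum ({split_size_sum}) != split_dim_size ({split_dim_size})"
        )
    chunks, start = [], 0
    for size in split_sizes:
        chunks.append(list(values[start:start + size]))
        start += size
    return chunks


# Consistent case: sizes sum to the dimension length; a zero-length chunk is fine.
consistent = split_to_sequence(list(range(6)), [2, 0, 3, 1])

# Inconsistent case with invented sizes that sum to 61 against a dimension of 15,
# the two totals that appear in the traceback above.
try:
    split_to_sequence(list(range(15)), [20, 20, 20, 1])
    mismatch_message = None
except ValueError as err:
    mismatch_message = str(err)
```

In this sketch an image with zero detections is not a problem in itself; the failure arises only when the split sizes and the tensor being split disagree about the total.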

Expected behavior

I would expect the same behaviour as with batch size = 1: empty prediction arrays are returned and no exception is raised.

Environment

PyTorch version: 1.6.0.dev20200526+cu101
Is debug build: No
CUDA used to build PyTorch: 10.1

OS: Ubuntu 20.04 LTS
GCC version: (Ubuntu 9.3.0-10ubuntu2) 9.3.0
CMake version: version 3.16.3

Python version: 3.8
Is CUDA available: Yes
CUDA runtime version: 10.0.130
GPU models and configuration: 
GPU 0: GeForce RTX 2080 Ti
GPU 1: GeForce RTX 2080 Ti
GPU 2: GeForce RTX 2080 Ti
GPU 3: GeForce RTX 2080 Ti

Nvidia driver version: 440.64
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.4

Versions of relevant libraries:
[pip3] numpy==1.18.4
[pip3] torch==1.6.0.dev20200526+cu101
[pip3] torchvision==0.7.0.dev20200526+cu101
[conda] Could not collect

Also:

onnxruntime and onnxruntime-gpu==1.3.0
onnx==1.7.0

Additional context

With batch size = 1 no such error occurs. The numbers in the SplitToSequence error also vary with the batch size; e.g. with batch size 2 the error is:

onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running SplitToSequence node. Name:'SplitToSequence_3575' Status Message: split_size_sum (22) != split_dim_size (6)

This is also related to my previous issue, #2251.
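
Since the batch size = 1 case is reported to work, one possible stopgap (not a fix for the export itself) is to run the images one at a time and collect the results. A minimal sketch, assuming a model exported for batch size 1; `run_one_image_at_a_time` and `run_single` are hypothetical names, and a stand-in function replaces the real session so the snippet runs on its own.

```python
from typing import Any, Callable, List, Sequence


def run_one_image_at_a_time(
    run_single: Callable[[Any], Any], batch: Sequence[Any]
) -> List[Any]:
    """Call `run_single` once per image and collect the per-image results."""
    return [run_single(image) for image in batch]


# Stand-in for a real inference call, used only to show the control flow.
def fake_inference(image: int) -> int:
    return image * 2


results = run_one_image_at_a_time(fake_inference, [1, 2, 3])

# With onnxruntime and a NumPy batch, `run_single` could look like this
# (assumption: the session accepts a batch of one image):
#   run_single = lambda img: ort_session.run(None, {"images_tensors": img[None]})
```

This trades throughput for robustness, so it is only useful until the batched export handles empty detections.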

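Contributor guide

Research direction: Investigate the SplitToSequence node in the ONNX graph of MaskRCNN when the batch size is greater than 1. Trace how the detection outputs (boxes, labels, scores, masks) are handled when some images have no detections. Check whether the dynamic shape handling or the SplitToSequence operation wrongly assumes a uniform number of detections per batch element.
Tech stack: Python, PyTorch
Domain: computer vision, backend
Issue type: bug
Difficulty: 3
Estimated time: half a day
Activity status: active
Clarity: clear
Prerequisites: Python, PyTorch, ONNX
Beginner friendliness: 45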
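
A first step for the investigation could be to list the SplitToSequence nodes in the exported graph together with their inputs and outputs. A minimal sketch; `find_nodes` is a hypothetical helper, and stand-in node objects are used so the snippet runs without a model file. The node name `Conv_0` is invented.

```python
from types import SimpleNamespace
from typing import Iterable, List


def find_nodes(nodes: Iterable, op_type: str) -> List:
    """Return the nodes whose `op_type` attribute equals `op_type`."""
    return [node for node in nodes if node.op_type == op_type]


# Stand-in nodes that only carry the attributes the helper reads.
demo_nodes = [
    SimpleNamespace(op_type="Conv", name="Conv_0"),
    SimpleNamespace(op_type="SplitToSequence", name="SplitToSequence_4427"),
]
found = [node.name for node in find_nodes(demo_nodes, "SplitToSequence")]

# With the onnx package and the exported file, the same helper could be used as:
#   import onnx
#   model = onnx.load(onnx_model_filepath)
#   for node in find_nodes(model.graph.node, "SplitToSequence"):
#       print(node.name, list(node.input), list(node.output))
```

Following each node's inputs back through the graph should show where the split sizes are computed and whether they are derived from the traced batch rather than from the runtime tensors.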