Using the OCRv5 server model in a GPU environment fails with RUNTIME_EXCEPTION: Failed to allocate memory for requested buffer of size 8797618176 #549
Unanswered
loneWolf1127 asked this question in Q&A
Replies: 3 comments 2 replies
-
|
Using onnxruntime as the inference engine in a GPU environment is not recommended, so I have not tested this case myself. |
1 reply
-
|
paddle or torch

> Which inference engine is recommended in a GPU environment? There is no problem on CPU.
|
0 replies
-
|
I have a similar issue, but I cannot switch to a different inference engine because I am running RapidOCR inside the immich Docker app. I noticed that when the inference is forced to run sequentially, avoiding any form of concurrency, the memory-allocation failure happens less often. |
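The sequential workaround described above can be sketched with a process-wide lock around the engine call. This is a minimal illustration, not immich's actual code; `ocr_engine` stands in for the rapidocr callable from the traceback, and the rationale is that ONNX Runtime's BFC arena grows with each concurrent request, so serializing calls keeps peak GPU memory close to that of a single run:

```python
import threading

# One lock shared by all callers, so only one inference runs at a time.
_infer_lock = threading.Lock()

def ocr_sequential(ocr_engine, img):
    """Run OCR with concurrency disabled (ocr_engine is a placeholder callable)."""
    with _infer_lock:
        return ocr_engine(img)
```

The trade-off is throughput: requests queue behind the lock, but each one sees the full GPU arena instead of competing allocations.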
-
Problem Description
Using the OCRv5 server model in a GPU environment raises RUNTIME_EXCEPTION:
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/app/llmdoc/code/gpu_service/onnx_test.py", line 34, in <module>
    result = ocr_engine(img_url)
             ^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/rapidocr/main.py", line 116, in __call__
    img, det_res = self.get_det_res(img, op_record)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/rapidocr/main.py", line 222, in get_det_res
    det_res = self.text_det(img)
              ^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/rapidocr/ch_ppocr_det/main.py", line 59, in __call__
    preds = self.session(prepro_img)
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/rapidocr/inference_engine/onnxruntime/main.py", line 93, in __call__
    raise ONNXRuntimeError(error_info) from e
rapidocr.inference_engine.onnxruntime.main.ONNXRuntimeError: Traceback (most recent call last):
  File "/opt/conda/lib/python3.11/site-packages/rapidocr/inference_engine/onnxruntime/main.py", line 90, in __call__
    return self.session.run(self.get_output_names(), input_dict)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 270, in run
    return self._sess.run(output_names, input_feed, run_options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv.86' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 8797618176
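For scale, the number in the error is a single buffer of roughly 8.2 GiB requested for one `Conv` activation. The conversion below is plain arithmetic on the value from the traceback (no assumptions beyond that number); one plausible reading is that the detection model is being fed a very large input image, since a single conv output of this size dwarfs the model weights themselves:

```python
# Size of the allocation that failed, taken from the traceback (bytes).
requested = 8_797_618_176

# Convert to GiB to see the scale of the single buffer.
gib = requested / 2**30
print(f"{gib:.2f} GiB")

# Interpreted as float32 elements, it is one enormous activation tensor.
elements = requested // 4
print(f"{elements:,} float32 values")
```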
Runtime Environment
Docker version 26.1.0
CUDA 12.4
Reproduction Code
Possible solutions
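No solution was posted in the thread. For readers who must stay on ONNX Runtime's CUDA backend, one commonly used mitigation is to cap the CUDA BFC arena and change its growth strategy via documented `CUDAExecutionProvider` options (`gpu_mem_limit`, `arena_extend_strategy`). The sketch below shows only the provider configuration; the model path is a placeholder, and the 4 GiB cap is an illustrative value, not a recommendation from the thread:

```python
# Options understood by ONNX Runtime's CUDAExecutionProvider:
# cap the BFC arena and grow it only by what each request actually
# needs, instead of doubling, which reduces fragmentation under load.
cuda_options = {
    "gpu_mem_limit": 4 * 1024 ** 3,           # hard cap on the arena, in bytes
    "arena_extend_strategy": "kSameAsRequested",
}
providers = [("CUDAExecutionProvider", cuda_options), "CPUExecutionProvider"]

def make_session(model_path: str):
    """Create a session with the capped CUDA arena (model_path is a placeholder)."""
    # Imported lazily so the provider config above can be inspected on CPU-only hosts.
    import onnxruntime as ort
    return ort.InferenceSession(model_path, providers=providers)
```

With a cap in place an oversized request still fails, but it fails at a predictable bound instead of exhausting the whole GPU; listing `CPUExecutionProvider` last gives unsupported nodes a fallback.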