Description
🔎 Search before asking
- I have searched the PaddleOCR Docs and found no similar bug report.
- I have searched the PaddleOCR Issues and found no similar bug report.
- I have searched the PaddleOCR Discussions and found no similar bug report.
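🐛 Bug (Problem description)
On an Atlas 800I A2 (64G), I set up the environment by following the official guide at https://www.paddlepaddle.org.cn/documentation/docs/zh/hardware_support/npu/install_cn.html:
1. Pull the Ascend NPU development image
2. Create the container
3. Install paddlepaddle and paddle-custom-npu
4. The basic functionality checks all pass
Next, I ran the benchmark test script from https://www.paddleocr.ai/main/version3.x/pipeline_usage/instructions/benchmark.html?h=benchmark. Two machines of the same model produced very different results: machine A took about 25 s, while machine B took as long as 140 s, as shown in the table below: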
| Level | Operation | Time (ms), machine A | Time (ms), machine B |
|---|---|---|---|
| 1 | _OCRPipeline.predict | 25473.5812769969 | 140310.851979814 |
| 2 | Layer | 25473.5812769969 | 140310.851979814 |
| | Core | 2730.83080103388 | 4887.04342823475 |
| | Other | 22742.7504759631 | 135423.808551579 |
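The difference is concentrated in the Other row. According to the benchmark script, Other is computed as: summary["other"] = summary["end_to_end"] - summary["core"].
So this portion has no clearly defined meaning; it is just the end-to-end time that is not attributed to Core.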
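To make the subtraction concrete, here is a minimal sketch that recomputes Other from the end-to-end and Core values in the table above and shows what share of the total it represents. The dictionary layout and the helper names `other_ms` and `breakdown` are illustrative assumptions, not part of the PaddleOCR benchmark API; only the formula itself comes from the benchmark script.

```python
# Timings in milliseconds, copied from the table in this issue.
timings_ms = {
    "A": {"end_to_end": 25473.5812769969, "core": 2730.83080103388},
    "B": {"end_to_end": 140310.851979814, "core": 4887.04342823475},
}


def other_ms(summary):
    """Apply the benchmark script's formula: other = end_to_end - core."""
    return summary["end_to_end"] - summary["core"]


def breakdown(timings):
    """Return, per machine, the Other time and its share of end-to-end time."""
    rows = {}
    for name, summary in timings.items():
        other = other_ms(summary)
        rows[name] = {
            "other_ms": other,
            "other_share": other / summary["end_to_end"],
        }
    return rows


if __name__ == "__main__":
    for name, row in breakdown(timings_ms).items():
        print(
            f"machine {name}: other = {row['other_ms']:.1f} ms "
            f"({row['other_share']:.1%} of end-to-end)"
        )
```

On these numbers, Other accounts for roughly 89% of the end-to-end time on machine A and roughly 97% on machine B. Core differs by a factor of about 1.8 between the two machines, while Other differs by a factor of about 6.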
How should I investigate this further, or how can it be resolved?
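One possible way to see what falls into the unattributed time, offered only as a sketch and not as a confirmed diagnosis, is to run a single call under Python's standard `cProfile` on both machines and compare the functions with the largest cumulative time. The helper `profile_call` and the stand-in workload `slow_stand_in` are hypothetical names introduced for illustration; in the setup from this issue the profiled callable would presumably be something like `lambda: pipeline.predict(image)`. Note that `cProfile` only sees Python-level calls, so time spent inside native or device code shows up under the Python function that invoked it.

```python
import cProfile
import io
import pstats
import time


def profile_call(fn, *args, top=15, **kwargs):
    """Run fn under cProfile and return (result, report_text).

    The report lists the `top` entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(fn, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def slow_stand_in():
    # Hypothetical placeholder workload so the sketch runs without an NPU;
    # replace it with the real call when profiling the pipeline.
    time.sleep(0.05)
    return "done"


if __name__ == "__main__":
    result, report = profile_call(slow_stand_in)
    print(report)
```

Comparing the two reports side by side may show whether the extra time on machine B sits in Python-side pre/post-processing, in data transfer, or in waiting on the device.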
🏃♂️ Environment (Runtime environment)
OS openEuler 22.03 (LTS-SP4)
docker 18.09.0
image ccr-2vdh3abv-pub.cnc.bj.baidubce.com/device/paddle-npu:cann800-ubuntu20-npu-910b-base-aarch64-gcc84
CANN 8.0.RC2
python 3.10.16
paddle-custom-npu 0.0.0
paddle2onnx 1.3.1
paddleocr 3.3.1
paddlepaddle 3.2.1
paddlex 3.3.9
🌰 Minimal Reproducible Example (Minimal demo that reproduces the problem)
from paddleocr import PaddleOCR, benchmark

image = "01.jpg"
gpu_id = 0  # index of the NPU device to use

if __name__ == '__main__':
    pipeline = PaddleOCR(
        lang="ch",
        doc_orientation_classify_model_dir='./models/PP-LCNet_x1_0_doc_ori_infer/',
        doc_unwarping_model_dir='./models/UVDoc_infer/',
        textline_orientation_model_dir='./models/PP-LCNet_x1_0_textline_ori_infer/',
        text_detection_model_dir='./models/PP-OCRv5_server_det_infer/',
        text_recognition_model_dir='./models/PP-OCRv5_server_rec_infer/',
        use_doc_orientation_classify=True,
        use_doc_unwarping=True,
        use_textline_orientation=True,
        enable_mkldnn=False,
        precision='fp16',
        device='npu:' + str(gpu_id),
        text_det_limit_type='max',
        text_det_limit_side_len=960,
    )

    benchmark.start_warmup()  # warmup starts
    for _ in range(10):
        pipeline.predict(image)
    benchmark.stop_warmup()  # warmup ends

    for _ in range(10):  # start the timed runs
        pipeline.predict(image)

    benchmark.print_pipeline_data()  # print the aggregated benchmark data
    benchmark.save_pipeline_data("./benchmark")  # save the benchmark data under ./benchmark