
Commit 2341347

DanShouzhu and zytx121 authored
[Doc]: update user guide: "finetune.md, inference.md" (#9578)
Signed-off-by: cecil_dan <[email protected]>
Co-authored-by: Yue Zhou <[email protected]>
1 parent 3923329 commit 2341347

File tree

4 files changed: +105 −79 lines changed


docs/en/user_guides/finetune.md

Lines changed: 2 additions & 2 deletions
@@ -12,7 +12,7 @@ Take the finetuning process on Cityscapes Dataset as an example, the users need
 
 ## Inherit base configs
 
-To release the burden and reduce bugs in writing the whole configs, MMDetection V2.0 support inheriting configs from multiple existing configs. To finetune a Mask RCNN model, the new config needs to inherit
+To release the burden and reduce bugs in writing the whole configs, MMDetection V3.0 supports inheriting configs from multiple existing configs. To finetune a Mask RCNN model, the new config needs to inherit
 `_base_/models/mask-rcnn_r50_fpn.py` to build the basic structure of the model. To use the Cityscapes Dataset, the new config can also simply inherit `_base_/datasets/cityscapes_instance.py`. For runtime settings such as logger settings, the new config needs to inherit `_base_/default_runtime.py`. For training schedules, the new config can inherit `_base_/schedules/schedule_1x.py`. These configs are in the `configs` directory and the users can also choose to write the whole contents rather than use inheritance.
 
 ```python
@@ -56,7 +56,7 @@ model = dict(
 
 ## Modify dataset
 
-The users may also need to prepare the dataset and write the configs about the dataset; refer to [Customize Datasets](../advanced_guides/customize_dataset.md) for more detail. MMDetection V3.0 already supports VOC, WIDERFACE, COCO, LVIS, OpenImages, DeepFashion and Cityscapes Dataset.
+The users may also need to prepare the dataset and write the configs about the dataset; refer to [Customize Datasets](../advanced_guides/customize_dataset.md) for more detail. MMDetection V3.0 already supports VOC, WIDERFACE, COCO, LVIS, OpenImages, DeepFashion, Objects365, and Cityscapes Dataset.
 
 ## Modify training schedule
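For reference, the four base files named in the updated paragraph above assemble into a `_base_` list like the sketch below. This mirrors the snippet this commit adds to the Chinese guide; the exact base file names (hyphens vs. underscores) differ between MMDetection branches, so treat them as placeholders.

```python
_base_ = [
    '../_base_/models/mask-rcnn_r50_fpn.py',
    '../_base_/datasets/cityscapes_instance.py',
    '../_base_/default_runtime.py',
    '../_base_/schedules/schedule_1x.py'
]
```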

docs/en/user_guides/inference.md

Lines changed: 5 additions & 5 deletions
@@ -5,7 +5,7 @@ This note will show how to inference, which means using trained models to detect
 
 In MMDetection, a model is defined by a [configuration file](config.md) and existing model parameters are saved in a checkpoint file.
 
-To start with, we recommend [Faster RCNN](../../../configs/faster_rcnn) with this [configuration file](../../../configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py) and this [checkpoint file](https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth). It is recommended to download the checkpoint file to `checkpoints` directory.
+To start with, we recommend [Faster RCNN](https://github.com/open-mmlab/mmdetection/blob/3.x/configs/faster_rcnn) with this [configuration file](https://github.com/open-mmlab/mmdetection/blob/3.x/configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py) and this [checkpoint file](https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth). It is recommended to download the checkpoint file to the `checkpoints` directory.
 
 ## High-level APIs for inference

@@ -32,8 +32,8 @@ visualizer = VISUALIZERS.build(model.cfg.visualizer)
 # The dataset_meta is loaded from the checkpoint and
 # then passed to the model in init_detector
 visualizer.dataset_meta = model.dataset_meta
-# Ttest a single image and show the results
 
+# Test a single image and show the results
 img = 'test.jpg'  # or img = mmcv.imread(img), which will only load it once
 result = inference_detector(model, img)
 
@@ -63,7 +63,7 @@ visualizer = VISUALIZERS.build(model.cfg.visualizer)
 # then passed to the model in init_detector
 visualizer.dataset_meta = model.dataset_meta
 
-# The interval of show (s), 0 is block
+# The interval of show (ms), 0 means blocking
 wait_time = 1
 
 video_reader = mmcv.VideoReader('video.mp4')
@@ -84,14 +84,14 @@ for frame in track_iter_progress(video_reader):
 cv2.destroyAllWindows()
 ```
 
-A notebook demo can be found in [demo/inference_demo.ipynb](../../../demo/inference_demo.ipynb).
+A notebook demo can be found in [demo/inference_demo.ipynb](https://github.com/open-mmlab/mmdetection/blob/3.x/demo/inference_demo.ipynb).
 
 Note: `inference_detector` only supports single-image inference for now.
 
 ## Demos
 
 We also provide three demo scripts, implemented with the high-level APIs and supporting functionality code.
-Source codes are available [here](../../../demo).
+Source codes are available [here](https://github.com/open-mmlab/mmdetection/blob/3.x/demo).
 
 ### Image demo
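Since `inference_detector` only handles one image per call (see the note above), a plain loop is the usual way to process a folder of images. A minimal sketch, assuming a hypothetical `images/` folder; the config and checkpoint paths are the ones recommended in the guide:

```python
from pathlib import Path

import mmcv
from mmdet.apis import init_detector, inference_detector

config_file = 'configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
model = init_detector(config_file, checkpoint_file, device='cuda:0')

# Call the single-image API once per file (images/ is a hypothetical folder)
for img_path in sorted(Path('images').glob('*.jpg')):
    img = mmcv.imread(str(img_path))  # each image is loaded exactly once
    result = inference_detector(model, img)
```

Building the model once outside the loop keeps the checkpoint load and CUDA initialization out of the per-image cost.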

docs/zh_cn/user_guides/finetune.md

Lines changed: 27 additions & 18 deletions
@@ -1,4 +1,4 @@
-# Model finetuning (to be updated)
+# Model finetuning
 
 Detectors pre-trained on the COCO dataset can serve as high-quality pre-trained models for other datasets, e.g., the CityScapes and KITTI datasets.
 This tutorial shows users how to apply the models provided in the [ModelZoo](../model_zoo.md) to other datasets so that the trained models achieve better performance.
@@ -12,12 +12,13 @@
 
 ## Inherit base configs
 
-To reduce the burden of writing whole configs and the number of bugs, MMDetection V2.0 supports inheriting configuration information from multiple existing configs. When finetuning a MaskRCNN model, the new config needs to use the configuration inherited from `_base_/models/mask_rcnn_r50_fpn.py` to build the basic structure of the model. When the Cityscapes dataset is used, the new config can simply inherit from `_base_/datasets/cityscapes_instance.py`. For the runtime settings of the training process, the new config needs to inherit from `_base_/default_runtime.py`. These config files are under the `configs` directory, and users can also choose to rewrite the whole contents instead of using inheritance.
+To reduce the burden of writing whole configs and the number of bugs, MMDetection V3.0 supports inheriting configuration information from multiple existing configs. When finetuning a MaskRCNN model, the new config needs to use the configuration inherited from `_base_/models/mask_rcnn_r50_fpn.py` to build the basic structure of the model. When the Cityscapes dataset is used, the new config can simply inherit from `_base_/datasets/cityscapes_instance.py`. For the runtime settings of the training process, such as the `logger settings`, the config can inherit from `_base_/default_runtime.py`. For the training schedule, the config can inherit from `_base_/schedules/schedule_1x.py`. These config files are stored under the `configs` directory, and users can also choose to rewrite the whole contents instead of using inheritance.
 
 ```python
 _base_ = [
     '../_base_/models/mask_rcnn_r50_fpn.py',
-    '../_base_/datasets/cityscapes_instance.py', '../_base_/default_runtime.py'
+    '../_base_/datasets/cityscapes_instance.py', '../_base_/default_runtime.py',
+    '../_base_/schedules/schedule_1x.py'
 ]
 ```

@@ -27,7 +28,6 @@ _base_ = [
 
 ```python
 model = dict(
-    pretrained=None,
     roi_head=dict(
         bbox_head=dict(
             type='Shared2FCBBoxHead',
@@ -55,7 +55,7 @@ model = dict(
 
 ## Modify dataset
 
-Users may also need to prepare the dataset and write the dataset config. Currently, the config files of MMDetection V2.0 already support the VOC, WIDER FACE, COCO and Cityscapes datasets.
+Users may also need to prepare the dataset and write the dataset config; see [Customize Datasets](../advanced_guides/customize_dataset.md) for more information. Currently, the config files of MMDetection V3.0 already support the VOC, WIDERFACE, COCO, LVIS, OpenImages, DeepFashion, Objects365 and Cityscapes datasets.
 
 ## Modify training schedule

@@ -64,23 +64,32 @@ model = dict(
 ```python
 # optimizer
 # lr config for a batch size of 8
-optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
-optimizer_config = dict(grad_clip=None)
-# learning policy
-lr_config = dict(
-    policy='step',
-    warmup='linear',
-    warmup_iters=500,
-    warmup_ratio=0.001,
-    step=[7])
-# the max_epochs and step in lr_config need to be tuned specifically for the custom dataset
-runner = dict(max_epochs=8)
-log_config = dict(interval=100)
+optim_wrapper = dict(optimizer=dict(lr=0.01))
+
+# learning rate
+param_scheduler = [
+    dict(
+        type='LinearLR', start_factor=0.001, by_epoch=False, begin=0, end=500),
+    dict(
+        type='MultiStepLR',
+        begin=0,
+        end=8,
+        by_epoch=True,
+        milestones=[7],
+        gamma=0.1)
+]
+
+# set max epochs
+train_cfg = dict(max_epochs=8)
+
+# set the log config
+default_hooks = dict(logger=dict(interval=100))
+
 ```
 
 ## Use pre-trained model
 
-If when using a pre-trained model, the new config points to it via `load_from`. Users need to download the needed model weights before training starts, to avoid wasting precious time during training.
+If you want to use a pre-trained model, the new config points to it via `load_from`. Users need to download the needed model weights before training starts, to avoid wasting precious time during training.
 
 ```python
 load_from = 'https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth'  # noqa
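Putting the snippets from this diff together, a finetuning config along the lines sketched below is what the updated guide describes. This is a sketch only: the `roi_head`/`bbox_head` overrides are elided in this diff and are therefore omitted here as well.

```python
_base_ = [
    '../_base_/models/mask_rcnn_r50_fpn.py',
    '../_base_/datasets/cityscapes_instance.py', '../_base_/default_runtime.py',
    '../_base_/schedules/schedule_1x.py'
]

# optimizer: lr config for a batch size of 8
optim_wrapper = dict(optimizer=dict(lr=0.01))

# learning rate: linear warmup for the first 500 iterations, then a step decay at epoch 7
param_scheduler = [
    dict(type='LinearLR', start_factor=0.001, by_epoch=False, begin=0, end=500),
    dict(type='MultiStepLR', begin=0, end=8, by_epoch=True, milestones=[7], gamma=0.1)
]

# train for 8 epochs, logging every 100 iterations
train_cfg = dict(max_epochs=8)
default_hooks = dict(logger=dict(interval=100))

# start from COCO-pretrained Mask R-CNN weights
load_from = 'https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth'  # noqa
```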

docs/zh_cn/user_guides/inference.md

Lines changed: 71 additions & 54 deletions
@@ -1,97 +1,114 @@
-# Inference with existing models on standard datasets (to be updated)
+# Inference with existing models on standard datasets
 
-Inference means using a trained model to detect objects in images. In MMDetection, a model is defined as the combination of a configuration file and the corresponding model parameters stored in a checkpoint file.
+MMDetection provides many pre-trained detection models; see the [Model Zoo](https://mmdetection.readthedocs.io/en/latest/model_zoo.html) for the specific models available.
 
-To start with, we recommend [Faster RCNN](https://github.com/open-mmlab/mmdetection/tree/master/configs/faster_rcnn); its [configuration](https://github.com/open-mmlab/mmdetection/blob/master/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py) file and [checkpoint](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth) file are linked here.
+Inference means using a trained model to detect objects in images; this note will show the concrete steps.
+
+In MMDetection, a model is defined as the combination of a [configuration file](config.md) and the corresponding model parameters stored in a checkpoint file.
+
+To start with, we recommend [Faster RCNN](https://github.com/open-mmlab/mmdetection/blob/3.x/configs/faster_rcnn); its [configuration](https://github.com/open-mmlab/mmdetection/blob/3.x/configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py) file and [checkpoint](https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth) file are linked here.
 We recommend downloading the checkpoint file into the `checkpoints` folder.
 
 ## High-level APIs for inference
 
 MMDetection provides high-level Python APIs for performing inference on images. Below is an example of building the model and performing inference on images or videos.
 
 ```python
-from mmdet.apis import init_detector, inference_detector
+import cv2
 import mmcv
+from mmcv.transforms import Compose
+from mmengine.utils import track_iter_progress
+from mmdet.registry import VISUALIZERS
+from mmdet.apis import init_detector, inference_detector
+
 
 # Specify the paths of the config file and the checkpoint file
-config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+config_file = 'configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py'
 checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
 
 # Build the model from the config file and the checkpoint file
 model = init_detector(config_file, checkpoint_file, device='cuda:0')
 
+# Initialize the visualizer
+visualizer = VISUALIZERS.build(model.cfg.visualizer)
+# The dataset_meta is loaded from the checkpoint and then passed to the model in init_detector
+visualizer.dataset_meta = model.dataset_meta
+
 # Test a single image and show the results
 img = 'test.jpg'  # or img = mmcv.imread(img), which will only load the image once
 result = inference_detector(model, img)
-# Visualize the results in a new window
-model.show_result(img, result)
-# Or save the visualization result as an image file
-model.show_result(img, result, out_file='result.jpg')
-
-# Test a video and show the results
-video = mmcv.VideoReader('video.mp4')
-for frame in video:
-    result = inference_detector(model, frame)
-    model.show_result(frame, result, wait_time=1)
-```
-
-A jupyter notebook demo can be found in [demo/inference_demo.ipynb](https://github.com/open-mmlab/mmdetection/blob/master/demo/inference_demo.ipynb)
 
-## Asynchronous interface - supported for Python 3.7+
+# Show the results
+img = mmcv.imread(img)
+img = mmcv.imconvert(img, 'bgr', 'rgb')
 
-For Python 3.7+, MMDetection also provides an asynchronous interface. By utilizing CUDA streams, GPU-bound inference code does not block the CPU, which allows higher CPU/GPU utilization in single-threaded applications. Inference on different data samples and inference with different models can run concurrently.
 
-You can refer to `tests/async_benchmark.py` to compare the speed of the synchronous and asynchronous interfaces.
+visualizer.add_datasample(
+    'result',
+    img,
+    data_sample=result,
+    draw_gt=False,
+    show=True)
 
-```python
-import asyncio
-import torch
-from mmdet.apis import init_detector, async_inference_detector
-from mmdet.utils.contextmanagers import concurrent
-
-async def main():
-    config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
-    checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
-    device = 'cuda:0'
-    model = init_detector(config_file, checkpoint=checkpoint_file, device=device)
+# Test a video and show the results
+# Build the test pipeline
+model.cfg.test_dataloader.dataset.pipeline[0].type = 'LoadImageFromNDArray'
+test_pipeline = Compose(model.cfg.test_dataloader.dataset.pipeline)
 
-    # This queue is used for concurrent inference on multiple images
-    streamqueue = asyncio.Queue()
-    # The queue size defines the concurrency level
-    streamqueue_size = 3
+# The visualizer has already been initialized above; when running this demo
+# directly in a Jupyter notebook, there is no need to create another one here.
+# Initialize the visualizer
+visualizer = VISUALIZERS.build(model.cfg.visualizer)
+# The dataset_meta is loaded from the checkpoint and then passed to the model in init_detector
+visualizer.dataset_meta = model.dataset_meta
 
-    for _ in range(streamqueue_size):
-        streamqueue.put_nowait(torch.cuda.Stream(device=device))
+# The interval of show (ms), 0 means blocking
+wait_time = 1
 
-    # Test a single image and show the results
-    img = 'test.jpg'  # or img = mmcv.imread(img), which will only load the image once
+video_reader = mmcv.VideoReader('video.mp4')
 
-    async with concurrent(streamqueue):
-        result = await async_inference_detector(model, img)
+cv2.namedWindow('video', 0)
 
-    # Visualize the results in a new window
-    model.show_result(img, result)
-    # Or save the visualization result as an image file
-    model.show_result(img, result, out_file='result.jpg')
+for frame in track_iter_progress(video_reader):
+    result = inference_detector(model, frame, test_pipeline=test_pipeline)
+    visualizer.add_datasample(
+        name='video',
+        image=frame,
+        data_sample=result,
+        draw_gt=False,
+        show=False)
+    frame = visualizer.get_image()
+    mmcv.imshow(frame, 'video', wait_time)
 
+cv2.destroyAllWindows()
+```
 
-asyncio.run(main())
+A Jupyter notebook demo can be found in [demo/inference_demo.ipynb](https://github.com/open-mmlab/mmdetection/blob/3.x/demo/inference_demo.ipynb)
 
-```
+Note: `inference_detector` currently only supports single-image inference.
 
 ## Demos
 
-We also provide three demo scripts, implemented with the high-level APIs. [Source code is available here](https://github.com/open-mmlab/mmdetection/tree/master/demo)
+We also provide three demo scripts, implemented with the high-level APIs. [Source code is available here](https://github.com/open-mmlab/mmdetection/blob/3.x/demo)
 
 ### Image demo
 
-This script performs inference on a single image,
+This script performs inference on a single image.
+
+```shell
+python demo/image_demo.py \
+    ${IMAGE_FILE} \
+    ${CONFIG_FILE} \
+    [--weights ${WEIGHTS}] \
+    [--device ${GPU_ID}] \
+    [--pred-score-thr ${SCORE_THR}]
+```
 
 Examples:
 
 ```shell
 python demo/image_demo.py demo/demo.jpg \
-    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
+    configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py \
     --weights checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
     --device cpu
 ```
@@ -113,7 +130,7 @@ python demo/webcam_demo.py \
 
 ```shell
 python demo/webcam_demo.py \
-    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
+    configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth
 ```
 

@@ -137,7 +154,7 @@ python demo/video_demo.py \
 
 ```shell
 python demo/video_demo.py demo/demo.mp4 \
-    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
+    configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
     --out result.mp4
 ```
@@ -164,7 +181,7 @@ python demo/video_gpuaccel_demo.py \
 
 ```shell
 python demo/video_gpuaccel_demo.py demo/demo.mp4 \
-    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
+    configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
     --nvdecode --out result.mp4
 ```
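One detail worth calling out from the video snippet this commit adds: it swaps the first test-pipeline transform to `LoadImageFromNDArray` because frames decoded by `mmcv.VideoReader` are numpy arrays rather than file paths. A minimal sketch of that idea in isolation, using the same config and checkpoint paths as the guide:

```python
from mmcv.transforms import Compose
from mmdet.apis import init_detector, inference_detector

model = init_detector(
    'configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py',
    'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth',
    device='cuda:0')

# The default first transform expects a file path; decoded video frames are
# ndarrays, so swap only the loader type before composing the pipeline.
model.cfg.test_dataloader.dataset.pipeline[0].type = 'LoadImageFromNDArray'
test_pipeline = Compose(model.cfg.test_dataloader.dataset.pipeline)

# A decoded frame (an ndarray) can now be fed to the detector directly:
# result = inference_detector(model, frame, test_pipeline=test_pipeline)
```

Mutating `pipeline[0].type` rather than rebuilding the pipeline by hand keeps the rest of the transforms (resize, normalization, packing) exactly as the config defines them.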
