
Commit 2341347

DanShouzhu and zytx121 authored
[Doc]: update user guide: "finetune.md, inference.md" (#9578)
Signed-off-by: cecil_dan <[email protected]>
Co-authored-by: Yue Zhou <[email protected]>
1 parent 3923329 commit 2341347

File tree

4 files changed: +105 −79 lines changed


docs/en/user_guides/finetune.md

Lines changed: 2 additions & 2 deletions
@@ -12,7 +12,7 @@ Take the finetuning process on Cityscapes Dataset as an example, the users need
 
 ## Inherit base configs
 
-To release the burden and reduce bugs in writing the whole configs, MMDetection V2.0 support inheriting configs from multiple existing configs. To finetune a Mask RCNN model, the new config needs to inherit
+To release the burden and reduce bugs in writing the whole configs, MMDetection V3.0 supports inheriting configs from multiple existing configs. To finetune a Mask RCNN model, the new config needs to inherit
 `_base_/models/mask-rcnn_r50_fpn.py` to build the basic structure of the model. To use the Cityscapes Dataset, the new config can also simply inherit `_base_/datasets/cityscapes_instance.py`. For runtime settings such as logger settings, the new config needs to inherit `_base_/default_runtime.py`. For training schedules, the new config can inherit `_base_/schedules/schedule_1x.py`. These configs are in the `configs` directory and the users can also choose to write the whole contents rather than use inheritance.
 
 ```python
@@ -56,7 +56,7 @@ model = dict(
 
 ## Modify dataset
 
-The users may also need to prepare the dataset and write the configs about the dataset; refer to [Customize Datasets](../advanced_guides/customize_dataset.md) for more detail. MMDetection V3.0 already supports VOC, WIDERFACE, COCO, LVIS, OpenImages, DeepFashion and Cityscapes Dataset.
+The users may also need to prepare the dataset and write the configs about the dataset; refer to [Customize Datasets](../advanced_guides/customize_dataset.md) for more detail. MMDetection V3.0 already supports VOC, WIDERFACE, COCO, LVIS, OpenImages, DeepFashion, Objects365, and Cityscapes Dataset.
 
 ## Modify training schedule
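For reference, the four base files named in the updated paragraph above assemble into a `_base_` list like the sketch below. This mirrors the snippet this commit adds to the Chinese guide; the exact base file names (hyphens vs. underscores) differ between MMDetection branches, so treat them as placeholders.

```python
_base_ = [
    '../_base_/models/mask-rcnn_r50_fpn.py',
    '../_base_/datasets/cityscapes_instance.py',
    '../_base_/default_runtime.py',
    '../_base_/schedules/schedule_1x.py'
]
```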

docs/en/user_guides/inference.md

Lines changed: 5 additions & 5 deletions
@@ -5,7 +5,7 @@ This note will show how to inference, which means using trained models to detect
 
 In MMDetection, a model is defined by a [configuration file](config.md) and existing model parameters are saved in a checkpoint file.
 
-To start with, we recommend [Faster RCNN](../../../configs/faster_rcnn) with this [configuration file](../../../configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py) and this [checkpoint file](https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth). It is recommended to download the checkpoint file to `checkpoints` directory.
+To start with, we recommend [Faster RCNN](https://github.com/open-mmlab/mmdetection/blob/3.x/configs/faster_rcnn) with this [configuration file](https://github.com/open-mmlab/mmdetection/blob/3.x/configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py) and this [checkpoint file](https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth). It is recommended to download the checkpoint file to the `checkpoints` directory.
 
 ## High-level APIs for inference

@@ -32,8 +32,8 @@ visualizer = VISUALIZERS.build(model.cfg.visualizer)
 # The dataset_meta is loaded from the checkpoint and
 # then passed to the model in init_detector
 visualizer.dataset_meta = model.dataset_meta
-# Ttest a single image and show the results
 
+# Test a single image and show the results
 img = 'test.jpg'  # or img = mmcv.imread(img), which will only load it once
 result = inference_detector(model, img)
 
@@ -63,7 +63,7 @@ visualizer = VISUALIZERS.build(model.cfg.visualizer)
 # then passed to the model in init_detector
 visualizer.dataset_meta = model.dataset_meta
 
-# The interval of show (s), 0 is block
+# The interval of show (ms), 0 means blocking
 wait_time = 1
 
 video_reader = mmcv.VideoReader('video.mp4')
@@ -84,14 +84,14 @@ for frame in track_iter_progress(video_reader):
 cv2.destroyAllWindows()
 ```
 
-A notebook demo can be found in [demo/inference_demo.ipynb](../../../demo/inference_demo.ipynb).
+A notebook demo can be found in [demo/inference_demo.ipynb](https://github.com/open-mmlab/mmdetection/blob/3.x/demo/inference_demo.ipynb).
 
 Note: `inference_detector` only supports single-image inference for now.
 
 ## Demos
 
 We also provide three demo scripts, implemented with the high-level APIs and supporting functionality code.
-Source codes are available [here](../../../demo).
+Source codes are available [here](https://github.com/open-mmlab/mmdetection/blob/3.x/demo).
 
 ### Image demo
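Since `inference_detector` only handles one image per call (see the note above), a plain loop is the usual way to process a folder of images. A minimal sketch, assuming a hypothetical `images/` folder; the config and checkpoint paths are the ones recommended in the guide:

```python
from pathlib import Path

import mmcv
from mmdet.apis import init_detector, inference_detector

config_file = 'configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
model = init_detector(config_file, checkpoint_file, device='cuda:0')

# Call the single-image API once per file (images/ is a hypothetical folder)
for img_path in sorted(Path('images').glob('*.jpg')):
    img = mmcv.imread(str(img_path))  # each image is loaded exactly once
    result = inference_detector(model, img)
```

Building the model once outside the loop keeps the checkpoint load and CUDA initialization out of the per-image cost.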

docs/zh_cn/user_guides/finetune.md

Lines changed: 27 additions & 18 deletions
@@ -1,4 +1,4 @@
-# Model finetuning (to be updated)
+# Model finetuning
 
 Detectors pre-trained on the COCO dataset can serve as high-quality pre-trained models for other datasets, e.g., the CityScapes and KITTI datasets.
 This tutorial shows users how to apply the models provided in the [ModelZoo](../model_zoo.md) to other datasets so that the trained models achieve better performance.
@@ -12,12 +12,13 @@
 
 ## Inherit base configs
 
-To reduce the burden of writing whole configs and the number of bugs, MMDetection V2.0 supports inheriting configuration information from multiple existing configs. When finetuning a MaskRCNN model, the new config needs to use the configuration inherited from `_base_/models/mask_rcnn_r50_fpn.py` to build the basic structure of the model. When the Cityscapes dataset is used, the new config can simply inherit from `_base_/datasets/cityscapes_instance.py`. For the runtime settings of the training process, the new config needs to inherit from `_base_/default_runtime.py`. These config files are under the `configs` directory, and users can also choose to rewrite the whole contents instead of using inheritance.
+To reduce the burden of writing whole configs and the number of bugs, MMDetection V3.0 supports inheriting configuration information from multiple existing configs. When finetuning a MaskRCNN model, the new config needs to use the configuration inherited from `_base_/models/mask_rcnn_r50_fpn.py` to build the basic structure of the model. When the Cityscapes dataset is used, the new config can simply inherit from `_base_/datasets/cityscapes_instance.py`. For the runtime settings of the training process, such as the `logger settings`, the config can inherit from `_base_/default_runtime.py`. For the training schedule, the config can inherit from `_base_/schedules/schedule_1x.py`. These config files are stored under the `configs` directory, and users can also choose to rewrite the whole contents instead of using inheritance.
 
 ```python
 _base_ = [
     '../_base_/models/mask_rcnn_r50_fpn.py',
-    '../_base_/datasets/cityscapes_instance.py', '../_base_/default_runtime.py'
+    '../_base_/datasets/cityscapes_instance.py', '../_base_/default_runtime.py',
+    '../_base_/schedules/schedule_1x.py'
 ]
 ```

@@ -27,7 +28,6 @@ _base_ = [
 
 ```python
 model = dict(
-    pretrained=None,
     roi_head=dict(
         bbox_head=dict(
             type='Shared2FCBBoxHead',
@@ -55,7 +55,7 @@ model = dict(
 
 ## Modify dataset
 
-Users may also need to prepare the dataset and write the dataset config. Currently, the config files of MMDetection V2.0 already support the VOC, WIDER FACE, COCO and Cityscapes datasets.
+Users may also need to prepare the dataset and write the dataset config; see [Customize Datasets](../advanced_guides/customize_dataset.md) for more information. Currently, the config files of MMDetection V3.0 already support the VOC, WIDERFACE, COCO, LVIS, OpenImages, DeepFashion, Objects365 and Cityscapes datasets.
 
 ## Modify training schedule

@@ -64,23 +64,32 @@ model = dict(
 ```python
 # optimizer
 # lr config for a batch size of 8
-optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
-optimizer_config = dict(grad_clip=None)
-# learning policy
-lr_config = dict(
-    policy='step',
-    warmup='linear',
-    warmup_iters=500,
-    warmup_ratio=0.001,
-    step=[7])
-# the max_epochs and step in lr_config need to be tuned specifically for the custom dataset
-runner = dict(max_epochs=8)
-log_config = dict(interval=100)
+optim_wrapper = dict(optimizer=dict(lr=0.01))
+
+# learning rate
+param_scheduler = [
+    dict(
+        type='LinearLR', start_factor=0.001, by_epoch=False, begin=0, end=500),
+    dict(
+        type='MultiStepLR',
+        begin=0,
+        end=8,
+        by_epoch=True,
+        milestones=[7],
+        gamma=0.1)
+]
+
+# set max epochs
+train_cfg = dict(max_epochs=8)
+
+# set the log config
+default_hooks = dict(logger=dict(interval=100))
+
 ```
 
 ## Use pre-trained model
 
-If when using a pre-trained model, the new config points to it via `load_from`. Users need to download the needed model weights before training starts, to avoid wasting precious time during training.
+If you want to use a pre-trained model, the new config points to it via `load_from`. Users need to download the needed model weights before training starts, to avoid wasting precious time during training.
 
 ```python
 load_from = 'https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth'  # noqa
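Putting the snippets from this diff together, a finetuning config along the lines sketched below is what the updated guide describes. This is a sketch only: the `roi_head`/`bbox_head` overrides are elided in this diff and are therefore omitted here as well.

```python
_base_ = [
    '../_base_/models/mask_rcnn_r50_fpn.py',
    '../_base_/datasets/cityscapes_instance.py', '../_base_/default_runtime.py',
    '../_base_/schedules/schedule_1x.py'
]

# optimizer: lr config for a batch size of 8
optim_wrapper = dict(optimizer=dict(lr=0.01))

# learning rate: linear warmup for the first 500 iterations, then a step decay at epoch 7
param_scheduler = [
    dict(type='LinearLR', start_factor=0.001, by_epoch=False, begin=0, end=500),
    dict(type='MultiStepLR', begin=0, end=8, by_epoch=True, milestones=[7], gamma=0.1)
]

# train for 8 epochs, logging every 100 iterations
train_cfg = dict(max_epochs=8)
default_hooks = dict(logger=dict(interval=100))

# start from COCO-pretrained Mask R-CNN weights
load_from = 'https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth'  # noqa
```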

docs/zh_cn/user_guides/inference.md

Lines changed: 71 additions & 54 deletions
@@ -1,97 +1,114 @@
-# Inference with existing models on standard datasets (to be updated)
+# Inference with existing models on standard datasets
 
-Inference means using a trained model to detect objects in images. In MMDetection, a model is defined as the combination of a configuration file and the corresponding model parameters stored in a checkpoint file.
+MMDetection provides many pre-trained detection models; see the [Model Zoo](https://mmdetection.readthedocs.io/en/latest/model_zoo.html) for the specific models available.
 
-To start with, we recommend [Faster RCNN](https://github.com/open-mmlab/mmdetection/tree/master/configs/faster_rcnn); its [configuration](https://github.com/open-mmlab/mmdetection/blob/master/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py) file and [checkpoint](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth) file are linked here.
+Inference means using a trained model to detect objects in images; this note will show the concrete steps.
+
+In MMDetection, a model is defined as the combination of a [configuration file](config.md) and the corresponding model parameters stored in a checkpoint file.
+
+To start with, we recommend [Faster RCNN](https://github.com/open-mmlab/mmdetection/blob/3.x/configs/faster_rcnn); its [configuration](https://github.com/open-mmlab/mmdetection/blob/3.x/configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py) file and [checkpoint](https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth) file are linked here.
 We recommend downloading the checkpoint file into the `checkpoints` folder.
 
 ## High-level APIs for inference
 
 MMDetection provides high-level Python APIs for performing inference on images. Below is an example of building the model and performing inference on images or videos.
 
 ```python
-from mmdet.apis import init_detector, inference_detector
+import cv2
 import mmcv
+from mmcv.transforms import Compose
+from mmengine.utils import track_iter_progress
+from mmdet.registry import VISUALIZERS
+from mmdet.apis import init_detector, inference_detector
+
 
 # Specify the paths of the config file and the checkpoint file
-config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+config_file = 'configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py'
 checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
 
 # Build the model from the config file and the checkpoint file
 model = init_detector(config_file, checkpoint_file, device='cuda:0')
 
+# Initialize the visualizer
+visualizer = VISUALIZERS.build(model.cfg.visualizer)
+# The dataset_meta is loaded from the checkpoint and then passed to the model in init_detector
+visualizer.dataset_meta = model.dataset_meta
+
 # Test a single image and show the results
 img = 'test.jpg'  # or img = mmcv.imread(img), which will only load the image once
 result = inference_detector(model, img)
-# Visualize the results in a new window
-model.show_result(img, result)
-# Or save the visualization result as an image file
-model.show_result(img, result, out_file='result.jpg')
-
-# Test a video and show the results
-video = mmcv.VideoReader('video.mp4')
-for frame in video:
-    result = inference_detector(model, frame)
-    model.show_result(frame, result, wait_time=1)
-```
-
-A jupyter notebook demo can be found in [demo/inference_demo.ipynb](https://github.com/open-mmlab/mmdetection/blob/master/demo/inference_demo.ipynb)
 
-## Asynchronous interface - supported for Python 3.7+
+# Show the results
+img = mmcv.imread(img)
+img = mmcv.imconvert(img, 'bgr', 'rgb')
 
-For Python 3.7+, MMDetection also provides an asynchronous interface. By utilizing CUDA streams, GPU-bound inference code does not block the CPU, which allows higher CPU/GPU utilization in single-threaded applications. Inference on different data samples and inference with different models can run concurrently.
 
-You can refer to `tests/async_benchmark.py` to compare the speed of the synchronous and asynchronous interfaces.
+visualizer.add_datasample(
+    'result',
+    img,
+    data_sample=result,
+    draw_gt=False,
+    show=True)
 
-```python
-import asyncio
-import torch
-from mmdet.apis import init_detector, async_inference_detector
-from mmdet.utils.contextmanagers import concurrent
-
-async def main():
-    config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
-    checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
-    device = 'cuda:0'
-    model = init_detector(config_file, checkpoint=checkpoint_file, device=device)
+# Test a video and show the results
+# Build the test pipeline
+model.cfg.test_dataloader.dataset.pipeline[0].type = 'LoadImageFromNDArray'
+test_pipeline = Compose(model.cfg.test_dataloader.dataset.pipeline)
 
-    # This queue is used for concurrent inference on multiple images
-    streamqueue = asyncio.Queue()
-    # The queue size defines the concurrency level
-    streamqueue_size = 3
+# The visualizer has already been initialized above; when running this demo
+# directly in a Jupyter notebook, there is no need to create another one here.
+# Initialize the visualizer
+visualizer = VISUALIZERS.build(model.cfg.visualizer)
+# The dataset_meta is loaded from the checkpoint and then passed to the model in init_detector
+visualizer.dataset_meta = model.dataset_meta
 
-    for _ in range(streamqueue_size):
-        streamqueue.put_nowait(torch.cuda.Stream(device=device))
+# The interval of show (ms), 0 means blocking
+wait_time = 1
 
-    # Test a single image and show the results
-    img = 'test.jpg'  # or img = mmcv.imread(img), which will only load the image once
+video_reader = mmcv.VideoReader('video.mp4')
 
-    async with concurrent(streamqueue):
-        result = await async_inference_detector(model, img)
+cv2.namedWindow('video', 0)
 
-    # Visualize the results in a new window
-    model.show_result(img, result)
-    # Or save the visualization result as an image file
-    model.show_result(img, result, out_file='result.jpg')
+for frame in track_iter_progress(video_reader):
+    result = inference_detector(model, frame, test_pipeline=test_pipeline)
+    visualizer.add_datasample(
+        name='video',
+        image=frame,
+        data_sample=result,
+        draw_gt=False,
+        show=False)
+    frame = visualizer.get_image()
+    mmcv.imshow(frame, 'video', wait_time)
 
+cv2.destroyAllWindows()
+```
 
-asyncio.run(main())
+A Jupyter notebook demo can be found in [demo/inference_demo.ipynb](https://github.com/open-mmlab/mmdetection/blob/3.x/demo/inference_demo.ipynb)
 
-```
+Note: `inference_detector` currently only supports single-image inference.
 
 ## Demos
 
-We also provide three demo scripts, implemented with the high-level APIs. [Source code is available here](https://github.com/open-mmlab/mmdetection/tree/master/demo)
+We also provide three demo scripts, implemented with the high-level APIs. [Source code is available here](https://github.com/open-mmlab/mmdetection/blob/3.x/demo)
 
 ### Image demo
 
-This script performs inference on a single image,
+This script performs inference on a single image.
+
+```shell
+python demo/image_demo.py \
+    ${IMAGE_FILE} \
+    ${CONFIG_FILE} \
+    [--weights ${WEIGHTS}] \
+    [--device ${GPU_ID}] \
+    [--pred-score-thr ${SCORE_THR}]
+```
 
 Examples:
 
 ```shell
 python demo/image_demo.py demo/demo.jpg \
-    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
+    configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py \
     --weights checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
     --device cpu
 ```
@@ -113,7 +130,7 @@ python demo/webcam_demo.py \
 
 ```shell
 python demo/webcam_demo.py \
-    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
+    configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth
 ```
 

@@ -137,7 +154,7 @@ python demo/video_demo.py \
 
 ```shell
 python demo/video_demo.py demo/demo.mp4 \
-    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
+    configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
     --out result.mp4
 ```
@@ -164,7 +181,7 @@ python demo/video_gpuaccel_demo.py \
 
 ```shell
 python demo/video_gpuaccel_demo.py demo/demo.mp4 \
-    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
+    configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
     --nvdecode --out result.mp4
 ```
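One detail worth calling out from the video snippet this commit adds: it swaps the first test-pipeline transform to `LoadImageFromNDArray` because frames decoded by `mmcv.VideoReader` are numpy arrays rather than file paths. A minimal sketch of that idea in isolation, using the same config and checkpoint paths as the guide:

```python
from mmcv.transforms import Compose
from mmdet.apis import init_detector, inference_detector

model = init_detector(
    'configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py',
    'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth',
    device='cuda:0')

# The default first transform expects a file path; decoded video frames are
# ndarrays, so swap only the loader type before composing the pipeline.
model.cfg.test_dataloader.dataset.pipeline[0].type = 'LoadImageFromNDArray'
test_pipeline = Compose(model.cfg.test_dataloader.dataset.pipeline)

# A decoded frame (an ndarray) can now be fed to the detector directly:
# result = inference_detector(model, frame, test_pipeline=test_pipeline)
```

Mutating `pipeline[0].type` rather than rebuilding the pipeline by hand keeps the rest of the transforms (resize, normalization, packing) exactly as the config defines them.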
