This repository was archived by the owner on Nov 21, 2023. It is now read-only.

Mask FPN training question: best way to reduce memory need #31

@jwnsu

I trained Mask R-CNN with FPN (ResNet-50) on 4 GPUs (11 GB memory each), and it ran out of memory after a few iterations. Training went fine after I reduced the image scale from 800x1333 to 600x1000.

What's the best way to reduce memory use without hurting accuracy? Reducing the image size lowers accuracy by about 1 percentage point. How about BATCH_SIZE_PER_IM? It's currently set to 512; is it OK to set it to 256?
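
For a rough sense of how much the scale change shrinks the input, here is a minimal sketch that compares the pixel counts of the two scale caps. The helper name `pixel_ratio` is made up for illustration, and the idea that activation memory tracks pixel count is only a working assumption, not something measured here.

```python
# Minimal sketch: compare the pixel budget of two image-scale caps.
# Assumption (not measured): activation memory grows roughly with the
# number of input pixels, so this ratio is only a rough indicator.

def pixel_ratio(old_scale, new_scale):
    """Return new_pixels / old_pixels for two (short_side, max_side) caps."""
    old_short, old_max = old_scale
    new_short, new_max = new_scale
    return float(new_short * new_max) / float(old_short * old_max)


# Scales quoted in this post: 800x1333 reduced to 600x1000.
ratio = pixel_ratio((800, 1333), (600, 1000))
print("pixel ratio (new / old): %.3f" % ratio)
```

The ratio only bounds the largest possible image at each setting; actual images may be smaller than the cap, so real savings can differ.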

Thx.

PS: GPU memory usage (with the reduced scale of 600x1000):

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     28227      C   python2                                     8836MiB |
|    1     28227      C   python2                                     8577MiB |
|    2     28227      C   python2                                     8313MiB |
|    3     28227      C   python2                                     8405MiB |
+-----------------------------------------------------------------------------+
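
To put the numbers above next to the 11 GB card size, here is a small sketch that parses the process rows and reports peak and mean usage. The function name `parse_usage` is hypothetical, and treating 11 GB as 11 * 1024 MiB is a nominal assumption; the capacity the driver actually reports may be slightly different.

```python
import re

# Process rows copied from the table above.
SMI_ROWS = """\
|    0     28227      C   python2                                     8836MiB |
|    1     28227      C   python2                                     8577MiB |
|    2     28227      C   python2                                     8313MiB |
|    3     28227      C   python2                                     8405MiB |
"""

# Matches: | <gpu> <pid> <type> <process name> <usage>MiB |
ROW_RE = re.compile(r"^\|\s+(\d+)\s+\d+\s+\w+\s+\S+\s+(\d+)MiB\s+\|$", re.M)


def parse_usage(text):
    """Return {gpu_index: used_MiB} parsed from nvidia-smi process rows."""
    return {int(m.group(1)): int(m.group(2)) for m in ROW_RE.finditer(text)}


usage = parse_usage(SMI_ROWS)
peak = max(usage.values())
mean = sum(usage.values()) / float(len(usage))

# Assumption: an "11GB" card is treated as 11 * 1024 MiB (nominal only).
NOMINAL_CAPACITY_MIB = 11 * 1024
headroom = NOMINAL_CAPACITY_MIB - peak

print("peak: %d MiB, mean: %.2f MiB, nominal headroom: %d MiB"
      % (peak, mean, headroom))
```

Usage figures like these are snapshots; the allocator may hold more or less memory at other points in training.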
