This repository was archived by the owner on Nov 21, 2023. It is now read-only.

tools/infer_simple.py fail #35

@oreilc16

I downloaded the Caffe2 Dockerfile from this link: https://github.com/caffe2/caffe2/blob/master/docker/ubuntu-16.04-cuda8-cudnn6-all-options/Dockerfile

I then ran the following to build the Caffe2 image:

# Use the latest Caffe2 master
sed -i -e 's/ --branch v0.8.1//g' Dockerfile
docker build -t caffe2:cuda8-cudnn6-all-options .

I then cloned https://github.com/facebookresearch/Detectron/ and built a Detectron image:

cd $DETECTRON/docker
docker build -t detectron:c2-cuda8-cudnn6 .

I started a container from this image with the following command, passing my GPU devices through:

nvidia-docker run --rm -it --device=/dev/nvidiactl --device=/dev/nvidia-uvm --device=/dev/nvidia0  detectron:c2-cuda8-cudnn6

Inside the container, I ran this command (from GETTING_STARTED.md) to verify that I could run inference on a directory of image files:

python2 tools/infer_simple.py \
    --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml \
    --output-dir /tmp/detectron-visualizations \
    --image-ext jpg \
    --wts https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl \
    demo

It failed with the following output:

E0125 13:59:11.822402    20 init_intrinsics_check.cc:54] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
E0125 13:59:11.822424    20 init_intrinsics_check.cc:54] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
E0125 13:59:11.822428    20 init_intrinsics_check.cc:54] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
INFO io.py:  67: Downloading remote file https://s3-us-west-2.amazonaws.com/detectron/ImageNetPretrained/MSRA/R-101.pkl to /tmp/detectron-download-cache/ImageNetPretrained/MSRA/R-101.pkl
  [------------------------------------------------------------] 0.0% of 170.2MB
...
...
...
I0125 14:02:33.693230    20 net_dag.cc:61] Number of parallel execution chains 63 Number of operators = 402
I0125 14:02:33.720326    20 net_dag_utils.cc:118] Operator graph pruning prior to chain compute took: 0.000215514 secs
I0125 14:02:33.720594    20 net_dag.cc:61] Number of parallel execution chains 30 Number of operators = 358
I0125 14:02:33.722873    20 net_dag_utils.cc:118] Operator graph pruning prior to chain compute took: 1.6725e-05 secs
I0125 14:02:33.722909    20 net_dag.cc:61] Number of parallel execution chains 5 Number of operators = 18
INFO infer_simple.py: 111: Processing demo/17790319373_bd19b24cfc_k.jpg -> /tmp/detectron-visualizations/17790319373_bd19b24cfc_k.jpg.pdf
terminate called after throwing an instance of 'caffe2::EnforceNotMet'
  what():  [enforce fail at context_gpu.h:170] . Encountered CUDA error: invalid device function Error from operator: 
input: "gpu_0/res2_0_branch2c_bn" input: "gpu_0/res2_0_branch1_bn" output: "gpu_0/res2_0_branch2c_bn" name: "" type: "Sum" device_option { device_type: 1 cuda_gpu_id: 0 }
*** Aborted at 1516888954 (unix time) try "date -d @1516888954" if you are using GNU date ***
PC: @     0x7f8fbb2c0428 gsignal
*** SIGABRT (@0x14) received by PID 20 (TID 0x7f8f656a6700) from PID 20; stack trace: ***
    @     0x7f8fbb2c04b0 (unknown)
    @     0x7f8fbb2c0428 gsignal
    @     0x7f8fbb2c202a abort
    @     0x7f8fb523784d __gnu_cxx::__verbose_terminate_handler()
    @     0x7f8fb52356b6 (unknown)
    @     0x7f8fb5235701 std::terminate()
    @     0x7f8fb5260d38 (unknown)
    @     0x7f8fbb65c6ba start_thread
    @     0x7f8fbb39241d clone
    @                0x0 (unknown)
Aborted (core dumped)
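
For anyone triaging this, here is a minimal sketch that pulls the CUDA error string, the failing operator type, and the GPU id out of the enforce-failure message above. `summarize_enforce_failure` is a hypothetical helper, not part of Detectron or Caffe2, and the regular expressions are assumptions based only on the message format shown in this log.

```python
import re

# Excerpt of the failure message copied from the log above.
LOG = '''terminate called after throwing an instance of 'caffe2::EnforceNotMet'
  what():  [enforce fail at context_gpu.h:170] . Encountered CUDA error: invalid device function Error from operator: 
input: "gpu_0/res2_0_branch2c_bn" input: "gpu_0/res2_0_branch1_bn" output: "gpu_0/res2_0_branch2c_bn" name: "" type: "Sum" device_option { device_type: 1 cuda_gpu_id: 0 }
'''


def summarize_enforce_failure(log_text):
    """Return (cuda_error, operator_type, gpu_id); each is None if not found.

    Hypothetical helper: the patterns assume the message layout seen in
    this one log and may not match other Caffe2 versions.
    """
    err = re.search(r"Encountered CUDA error: (.+?) Error from operator", log_text)
    # \b keeps this from matching inside "device_type".
    op = re.search(r'\btype: "([^"]+)"', log_text)
    gpu = re.search(r"cuda_gpu_id: (\d+)", log_text)
    return (
        err.group(1).strip() if err else None,
        op.group(1) if op else None,
        int(gpu.group(1)) if gpu else None,
    )


summary = summarize_enforce_failure(LOG)
print(summary)
```

On the excerpt above this reports the error `invalid device function`, raised by a `Sum` operator on GPU 0, which is the first operator failure in the log.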

Any help would be much appreciated. Thank you.
