This repository was archived by the owner on Nov 21, 2023. It is now read-only.
Running out of memory on a 4GB card #21
Open
I'm trying to run Faster R-CNN on an Nvidia GTX 1050 Ti, but I'm running out of memory. nvidia-smi reports that about 170 MB are already in use, but does Faster R-CNN really need 3.8 GB of VRAM to process an image?
I also tried Mask R-CNN (the model in the getting started tutorial) and got about 4 images in (5 if I closed my browser) before it crashed.
Is this a bug or does it really just need more than 4GB of memory?
INFO infer_simple.py: 111: Processing demo/18124840932_e42b3e377c_k.jpg -> /home/px046/prog/Detectron/output/18124840932_e42b3e377c_k.jpg.pdf
terminate called after throwing an instance of 'caffe2::EnforceNotMet'
what(): [enforce fail at blob.h:94] IsType<T>(). wrong type for the Blob instance. Blob contains nullptr (uninitialized) while caller expects caffe2::Tensor<caffe2::CUDAContext> .
Offending Blob name: gpu_0/conv_rpn_w.
Error from operator:
input: "gpu_0/res4_5_sum" input: "gpu_0/conv_rpn_w" input: "gpu_0/conv_rpn_b" output: "gpu_0/conv_rpn" name: "" type: "Conv" arg { name: "kernel" i: 3 } arg { name: "exhaustive_search" i: 0 } arg { name: "pad" i: 1 } arg { name: "order" s: "NCHW" } arg { name: "stride" i: 1 } device_option { device_type: 1 cuda_gpu_id: 0 } engine: "CUDNN"
*** Aborted at 1516787658 (unix time) try "date -d @1516787658" if you are using GNU date ***
PC: @ 0x7f08de455428 gsignal
*** SIGABRT (@0x3e800000932) received by PID 2354 (TID 0x7f087cda9700) from PID 2354; stack trace: ***
@ 0x7f08de4554b0 (unknown)
@ 0x7f08de455428 gsignal
@ 0x7f08de45702a abort
@ 0x7f08d187db39 __gnu_cxx::__verbose_terminate_handler()
@ 0x7f08d187c1fb __cxxabiv1::__terminate()
@ 0x7f08d187c234 std::terminate()
@ 0x7f08d1897c8a execute_native_thread_routine_compat
@ 0x7f08def016ba start_thread
@ 0x7f08de52741d clone
@ 0x0 (unknown)
Aborted (core dumped)
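For anyone trying to reproduce this, the "already in use" figure above can be read programmatically before each run. The following is only a minimal sketch, assuming `nvidia-smi` is on the PATH and supports its CSV query mode. The helper names `parse_memory_csv` and `query_gpu_memory` are made up for illustration and are not part of Detectron.

```python
import subprocess


def parse_memory_csv(text):
    """Parse 'used, total' lines (values in MiB) from nvidia-smi CSV output.

    Hypothetical helper for illustration; not part of Detectron.
    """
    gpus = []
    for line in text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        gpus.append({
            "used_mib": used,
            "total_mib": total,
            "free_mib": total - used,
        })
    return gpus


def query_gpu_memory():
    """Ask nvidia-smi for used and total memory, one entry per GPU."""
    out = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        universal_newlines=True,
    )
    return parse_memory_csv(out)


if __name__ == "__main__":
    # Requires an Nvidia driver; prints one line per visible GPU.
    for index, gpu in enumerate(query_gpu_memory()):
        print("GPU {}: {} MiB used, {} MiB free of {} MiB".format(
            index, gpu["used_mib"], gpu["free_mib"], gpu["total_mib"]))
```

Running this right before `infer_simple.py` shows how much memory other processes (the desktop, a browser) are holding, which helps separate "the model needs more than the card has" from "something else is using the card".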