Thanks for the good plugin to handle GPU resources!
There is a small bug:
The "custom - Define which GPU(s) will be visible in container" setting fails to reserve multiple GPUs.
This happens at least when using pipelines.
Pipeline configuration used:
withRemoteDocker(debug: true, main:
    image(configItemList: [runtime(dockerRuntime: 'nvidia'),
                           gpus(nvidiaDevices: 'custom', nvidiaDevicesCustom: "0,1,2,3")],
          forcePull: false, image: 'tensorflow/tensorflow:latest-gpu',
          volumes: []),
    removeContainers: false, sideContainers: [], workspaceOverride: '/ws') {
    sh("""nvidia-smi""")
}
At the Docker level, the generated command is:
docker run -t -d --network bridge --entrypoint /bin/sh --workdir /ws -v /data/ws:/ws -v /tmp:/tmp -v /data/ws@tmp:/data/ws@tmp --gpus device=0,1,2,3 tensorflow/tensorflow:latest-gpu
Error:
docker: Error response from daemon: cannot set both Count and DeviceIDs on device request.
The root cause is probably just the missing extra quoting that is required:
NVIDIA/nvidia-docker#1026 (--gpus '"device=0,1,2,3"')
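
To illustrate the quoting, here is a minimal shell sketch. As far as I understand it, Docker splits the --gpus value on commas, so without the inner double quotes the entries after `device=0` are read as separate fields (apparently as a count), which would explain the error above. `gpus_arg` is a hypothetical helper name used only for this example; it is not part of the plugin.

```shell
#!/bin/sh
# Hypothetical helper (not part of the plugin): wraps a comma-separated
# device list in literal double quotes, so the whole list stays inside
# a single device=... field when Docker parses the --gpus value.
gpus_arg() {
  printf '"device=%s"' "$1"
}

# Print the value that would be passed to --gpus.
gpus_arg "0,1,2,3"
echo
```

With this, the call would look like `docker run --gpus "$(gpus_arg 0,1,2,3)" ...`, which should be equivalent to the `--gpus '"device=0,1,2,3"'` form from the linked issue. The plugin would presumably need to add the same inner quotes when it builds the command for multiple devices.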