-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Replace FindCUDA with FindCUDAToolkit (fix #2833) #3090
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Last time i worked on #2912 i got everything working except for pkg-config. Basically, I tried to use cmake imported targets everywhere (i.e. using things like |
Sorry I should have looked into #2912 more closely, I didn't realise fixing the FindCUDA issue was related. I found some discussion online and a suggestion at the end of this thread, for PcFileGenerator led me to this discussion. From that I gather that this is a transitive dependency issue, I think I've run into something similar previously but couldn't solve it. |
Yep. It's also not dlib's job to fix/enhance cmake's pkg-config capability. In my view dlib should just drop pkg-config support then this all goes away. #2912 could be revived which upgraded to FindCUDAToolkit amongst other things. |
I've had a look at this and the CUDA libraries are added to the dlib_needed_private_libraries rather than dlib_needed_public_libraries and the latter are added to the dlib-1.pc file. Does that mean this change can be made without impacting pkg-config? |
Yeah not sure how many people use pkg-config. I don't, but it seemed popular for a while. Not sure if that's still the case or not. |
Sorry this took so long, I ended up using act and a ported version of one of the Github actions to do some testing. |
tools/python/dlib/__init__.py.in
Outdated
def add_lib_to_dll_path(path): | ||
""" On windows you must call os.add_dll_directory() to allow linking to external DLLs. See | ||
https://docs.python.org/3.8/whatsnew/3.8.html#bpo-36085-whatsnew. This function adds the folder | ||
containing path to the dll search path. | ||
""" | ||
try: | ||
import os | ||
os.add_dll_directory(os.path.join(os.path.dirname(path), '../../bin')) | ||
except (AttributeError,KeyError,FileNotFoundError): | ||
pass | ||
|
||
if '@DLIB_USE_CUDA@' == 'ON': | ||
add_lib_to_dll_path('@cudnn@') | ||
add_lib_to_dll_path('@CUDA_CUDART_LIBRARY@') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to imagine we still need this for windows users. Did you test this stuff on windows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don’t need it for conda-forge, I’m not sure for vanilla builds
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After a few false starts I think I've finally caught up with this discussion enough to provide some useful input. Looking to test this with windows (VS2022/CUDA12.0/CUDNN9.2) I've found I can build and run the tests for C++ (some of the tests fail). I'm not sure of the cause, I've found replicating the test environment difficult and I think some tests may be failing due to organisation restrictions. I installed Visual Studio 2015, but I don't know what version of CUDA this was working with previously.
I later discovered the batch file to compile and test the python/C++ builds, the python version does not compile. This seems to be a mismatch between the python interface (which is statically linked) and the CUDA related libraries which are dynamic.
should have probably started by checking here as to why CUDA support wasn't compiling... #3090 confirmed working on: |
I'm still testing this out. But I ran some stuff on various versions of things in windows and linux and fixed a few details in the PR that were not working. I'll go over it more this weekend. @arrufat you want to try this out too? See if it works for you. |
I did a quick test and it seems to work for me: $ cmake -B build -G Ninja -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DDLIB_USE_CUDA_COMPUTE_CAPABILITIES=75
-- Found CUDAToolkit: /opt/cuda/targets/x86_64-linux/include (found version "12.9.86")
-- The CUDA compiler identification is NVIDIA 12.9.86 with host compiler GNU 14.3.1
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /opt/cuda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Found CUDNN: /usr/lib/libcudnn.so
-- Enabling CUDA support for dlib. DLIB WILL USE CUDA, compute capabilities: But the CUDA capabilities thing doesn't seem to work, I even get a warning nvcc warning : Support for offline compilation for architectures prior to '<compute/sm/lto>_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). |
Looking at the code it looks like Instead |
Also, are we happy using imported targets now like the following? That breaks pkgconfig. is that ok? if so, i can revive my old PR at some point |
Yeah the cuda compute capabilities is disabled in this PR. I found at least one case where it just doesn't and I got an error from nvcc about it not liking it being set to "all". But mostly it worked. I'll play with it more this weekend. @pfeatherstone yeah I give up on pkgconfig. I don't know if anyone even cares about it, but it's been a barrier for too long to cleaning up cmake stuff. If someone really wants it to work we will find out when they complain about it and they can figure something out :) Or maybe the cmake community also doesn't care enough and pkgconfig is just not something people care about anymore. |
yeah i agree. In which case i can probably revive #2912 at some point. |
Urg. I Just can't get any remotely recent version of ubuntu or cuda to work with my GPU. The GPU I have at home is an old 1080ti. I've now reinstalled ubuntu 22.04 and 24.04 several times with various combinations of cuda toolkits, drivers, and cudnn. The docs make it seem like (sorta, it's deprecated and not the recommended thing to use though) a 1080ti should still work with cudnn 9 and cuda 12 variants but I have not found a single combination where that is so. I just tried another combination and again got errors from cudnn. Gave up and bought a RTX 3060 to test this stuff with, since the thing holding me back from getting this PR landed is not being able to test this with newer software. New GPU is in the mail now. |
I'd need to check but I'm pretty sure we have a machine at work with 2x 1080 TIs. I'll check it's still working |
Yeah what configurations are all you using? I.e, what cuda and cudnn versions and gpus? My 1080ti worked fine on older versions of cuda. But we are talking several years old though. So I wiped it all and started testing with new versions to test this PR here but couldn't get any recent (I.e. cuda 12 and newer) working with it. Although I think cuda 12.4 probably works, but that isn't supported on Ubuntu 24.04. And when I tried with Ubuntu 22.04 and cuda 12.4 I ran into other problems. Maybe I could somehow get it working. But, I'm trying to test that these cmake scrips actually work sensibly with recent tooling and the setups that nvidia actually supports. |
Setup:
Doesn't work. CUDA installed with:
CMake output:
My previous post was on Arch with RTX 5000, with cuda installed with |
In my old PR i did the following: Specifically i did:
|
Ah yeah, you need to add I just pushed @pfeatherstone's suggestion about
to this PR though and now setting the |
@pfeatherstone thanks, now I am getting this while building the
|
It compiles for me now. |
Can i suggest you do the following:
It means you don't have to do things like |
@pfeatherstone great, now it builds but it fails to link (it can't find any Here's my diff: diff --git a/dlib/CMakeLists.txt b/dlib/CMakeLists.txt
index 50b40c3c..9b090610 100644
--- a/dlib/CMakeLists.txt
+++ b/dlib/CMakeLists.txt
@@ -29,6 +29,7 @@ endif()
set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${CMAKE_CURRENT_SOURCE_DIR}/cmake_utils)
include(cmake_utils/set_compiler_specific_options.cmake)
+include(CMakeDependentOption)
# Adhere to GNU filesystem layout conventions
@@ -640,23 +641,12 @@ if (NOT TARGET dlib)
if (DLIB_USE_CUDA)
find_package(CUDAToolkit)
+ cmake_dependent_option(DLIB_USE_CUDA "Enable CUDA support" ON "CUDAToolkit_FOUND;CUDAToolkit_NVCC_EXECUTABLE;CUDNN_FOUND" OFF)
if (CUDAToolkit_FOUND AND CUDAToolkit_NVCC_EXECUTABLE)
- set(CMAKE_CUDA_COMPILER ${CUDAToolkit_NVCC_EXECUTABLE})
- enable_language(CUDA)
-
find_package(CUDNN)
if(CUDNN_FOUND)
- set(source_files ${source_files}
- cuda/cuda_dlib.cu
- cuda/cudnn_dlibapi.cpp
- cuda/cublas_dlibapi.cpp
- cuda/cusolver_dlibapi.cu
- cuda/curand_dlibapi.cpp
- cuda/cuda_data_ptr.cpp
- cuda/gpu_data.cpp
- )
list (APPEND dlib_needed_private_libraries CUDA::cublas)
list (APPEND dlib_needed_private_libraries ${CUDNN_LIBRARY_PATH})
list (APPEND dlib_needed_private_libraries CUDA::curand)
@@ -731,6 +721,17 @@ if (NOT TARGET dlib)
add_library(dlib ${source_files})
if(DLIB_USE_CUDA)
+ set(CMAKE_CUDA_COMPILER ${CUDAToolkit_NVCC_EXECUTABLE})
+ enable_language(CUDA)
+ set(source_files ${source_files}
+ cuda/cuda_dlib.cu
+ cuda/cudnn_dlibapi.cpp
+ cuda/cublas_dlibapi.cpp
+ cuda/cusolver_dlibapi.cu
+ cuda/curand_dlibapi.cpp
+ cuda/cuda_data_ptr.cpp
+ cuda/gpu_data.cpp
+ )
set_property(TARGET dlib PROPERTY CUDA_ARCHITECTURES all)
endif() |
No you don't want |
Can i also suggest you check the install targets. So check that if you run |
….txt in test_for_cudnn. Now working on Windows 10 with default install paths.
…, do we still need this?
…, do we still need this?
…der if there's any easy way to run that?
@pfeatherstone @arrufat try this all now. I went through a bunch of workflows and tested things. Near as I can tell it's all ready to go and all works now. |
@davisking This works great with RTX 5000 and RTX 3090. Thank you! |
Attempting to fix #2833 the FindCUDA issue (Policy CMP0146 is not set: The FindCUDA module is removed.)
Working from #3087 and with some help from Tobias. I have this working on Fedora Linux and Windows 10.
I've more recently found #3004 which suggests (referencing #2912) that this may not be possible due to findpkg issues. I've not looked at that but I have been able to update test_for_cuda and test_for_cudnn. I have compiled and run the test as suggested in #3004 and it seems to compile and run.