Skip to content

Conversation

pfeatherstone
Copy link
Contributor

This PR caches MKL plans. This substantially improves performance when using MKL sequential and MKL TBB. Not benchmarked with MKL RT but I expect the same results.

pfeatherstone and others added 10 commits August 10, 2025 21:08
-  Make sure DLIB_USE_MKL_SEQUENTIAL and DLIB_USE_MKL_WITH_TBB are added to the list of preprocessor definitions when appropriate
- Add path to libtbb. You don't want to mix libraries in /opt/intel/oneapi/... and /usr/lib/x86...
@pfeatherstone
Copy link
Contributor Author

pfeatherstone commented Aug 13, 2025

@davisking Ready for an initial review. Linking to libmkl_rt.so was broken. Mainly because by default it loads libmkl_intel_thread.so at runtime using dlopen which in turns requires openmp, specifically the intel version. If that's not in your link libraries it breaks at runtime. I think this behaviour is not desirable. What i found is if you're also linking to ffmpeg or other libraries, you're inadvertently linking to gomp. So you might not see the error. Though libmkl_intel_thread.so requires the intel openmp, not the GNU. So it was actually using the "wrong" symbols. So i changed the cmake BLAS stuff to link to one of

  • MKL sequential
  • MKL tbb
  • MLL thread (openmp)
    With MKL thread you explicitly link to openmp at link-time. It doesn't use dlopen. I think that's better.
    This required me to refactor the find_blas.cmake script which was a mess to be honest. I've also added some jobs to Github Actions to test all three version of MKL. The build_cpp.yml script needs some serious refactoring but that can be for another PR.

For the BLAS stuff i'm now using cmake_dependent_option which simplifies the super messy logic in the CMakeLists.txt script.

@pfeatherstone
Copy link
Contributor Author

@davisking Do you mind having a look at this? Or are you wanting to merge the CUDA stuff first? I would rather get this in before i touch cmake revamping stuff again as I would be duplicating certain things in here already.

@pfeatherstone
Copy link
Contributor Author

pfeatherstone commented Aug 26, 2025

@davisking hold fire on merging. I haven't got the windows jobs working correctly with openblas and MKL. vcpkg takes too long to install those deps and haven't figured out how to use it correctly with cmake.

@pfeatherstone
Copy link
Contributor Author

pfeatherstone commented Sep 1, 2025

Right, i'm gonna need some help. I don't have a windows machine i can use for dev. I'm giving up trying to debug the windows builds simply by looking at github actions logs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants