Add mixed precision LU factorization methods (#746)
* Add mixed precision LU factorization methods
This commit introduces four new mixed-precision LU factorization algorithms
that perform the factorization in Float32 while keeping a Float64 interface.
Working in Float32 halves the memory traffic of the factorization, which can
noticeably speed up memory-bandwidth-limited problems.
New factorization methods:
- CUDAOffload32MixedLUFactorization: GPU-accelerated mixed precision for NVIDIA GPUs
- MetalOffload32MixedLUFactorization: GPU-accelerated mixed precision for Apple Metal
- MKL32MixedLUFactorization: CPU-based mixed precision using Intel MKL
- AppleAccelerate32MixedLUFactorization: CPU-based mixed precision using Apple Accelerate
Key features:
- Transparent Float64 to Float32 conversion for factorization
- Support for both real and complex matrices
- Up to 2x speedup for large, well-conditioned matrices
- Accuracy close to Float32 precision (roughly 7 significant digits for well-conditioned matrices) with reduced memory bandwidth requirements
The implementations handle precision conversion internally, making them
easy to use as drop-in replacements for standard LU factorization when
reduced precision is acceptable.
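
As a rough illustration of the conversion pattern described above, here is a
minimal pure-Python sketch. It is not the code added by this commit (which is
Julia and dispatches to CUDA, Metal, MKL, or Accelerate); the helper names
`to_f32`, `lu_factor_f32`, and `lu_solve_f32` are hypothetical, and the sketch
assumes a dense, square, real matrix given as a list of rows. Inputs arrive as
Float64, every stored intermediate is rounded to Float32, and the result is
handed back as ordinary Float64 values.

```python
import struct


def to_f32(x):
    """Round a Python float (Float64) to the nearest Float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor_f32(A):
    """LU factorization with partial pivoting, storing all values as Float32.

    Returns (LU, piv): LU holds L (unit diagonal, below) and U (on and above
    the diagonal); piv[k] is the original index of the row now in position k.
    """
    n = len(A)
    LU = [[to_f32(v) for v in row] for row in A]  # Float64 -> Float32 copy
    piv = list(range(n))
    for k in range(n):
        # Pick the row with the largest entry in column k as the pivot row.
        p = max(range(k, n), key=lambda i: abs(LU[i][k]))
        if LU[p][k] == 0.0:
            raise ZeroDivisionError("matrix is singular in Float32")
        LU[k], LU[p] = LU[p], LU[k]
        piv[k], piv[p] = piv[p], piv[k]
        for i in range(k + 1, n):
            LU[i][k] = to_f32(LU[i][k] / LU[k][k])
            for j in range(k + 1, n):
                LU[i][j] = to_f32(LU[i][j] - to_f32(LU[i][k] * LU[k][j]))
    return LU, piv


def lu_solve_f32(LU, piv, b):
    """Solve A x = b from the Float32 factors; return x as Float64 floats."""
    n = len(LU)
    y = [to_f32(b[p]) for p in piv]  # permute and round the right-hand side
    for i in range(n):  # forward substitution with unit-diagonal L
        for j in range(i):
            y[i] = to_f32(y[i] - to_f32(LU[i][j] * y[j]))
    for i in reversed(range(n)):  # back substitution with U
        for j in range(i + 1, n):
            y[i] = to_f32(y[i] - to_f32(LU[i][j] * y[j]))
        y[i] = to_f32(y[i] / LU[i][i])
    return [float(v) for v in y]  # caller sees plain Float64 values


if __name__ == "__main__":
    A = [[4.0, 3.0], [6.0, 3.0]]
    b = [10.0, 12.0]
    LU, piv = lu_factor_f32(A)
    print(lu_solve_f32(LU, piv, b))
```

The caller never handles Float32 values directly, which is the sense in which
such a method can act as a drop-in replacement. The price is that the solution
carries Float32-level rounding error, which grows with the condition number of
the matrix.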
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Add tests and documentation for mixed precision methods
- Add mixed precision tests to the Core test group in runtests.jl
- Add documentation for all four mixed precision methods in docs
- Add a section explaining when to use mixed precision methods
- Document performance characteristics and use cases
The tests now run as part of the standard test suite, and the
documentation provides clear guidance on when these methods are
beneficial (large well-conditioned problems with memory bandwidth
bottlenecks).
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Update docs/src/solvers/solvers.md
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Christopher Rackauckas <accounts@chrisrackauckas.com>