Small performance optimizations #969
Pull Request Overview
This PR refactors smoothing kernel normalization factors and related computations to improve GPU performance and code clarity. The changes focus on simplifying arithmetic expressions to reduce instructions and improve readability.
Key Changes:
- Simplified normalization factor expressions by consolidating divisions (e.g., `a / b / c` → `a / (b * c)`)
- Optimized `v_max` computation in particle shifting to compute squared magnitude first, then take square root
- Simplified kernel derivative formulas for Wendland kernels by algebraically reducing expressions
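The division consolidation in the first bullet can be illustrated with a minimal sketch. The value of `h` and the chained "before" form are assumptions for illustration; only the consolidated expression `7 / (pi * h^2 * 4)` is quoted in this PR.

```julia
# Hypothetical smoothing length, chosen only for illustration
h = 0.3

# Chained divisions (assumed "before" form): three divisions
factor_chained = 7 / pi / h^2 / 4

# Consolidated form quoted in the PR: one division, two multiplications
factor_single = 7 / (pi * h^2 * 4)

# Mathematically equal; floating-point rounding may differ in the last bits,
# so compare with a tolerance instead of ==
@assert factor_chained ≈ factor_single
```

Since division is typically more expensive than multiplication, especially on GPUs, fewer divisions can reduce the instruction count, at the cost of results that may not be bitwise identical.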
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/schemes/fluid/shifting_techniques.jl | Optimized v_max calculation to compute maximum of squared velocities before taking square root |
| src/general/smoothing_kernels.jl | Simplified normalization factors and kernel derivatives across multiple kernel types (Schoenberg, Wendland, Poly6) |
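As a sketch of the kind of algebraic reduction described for the Wendland kernels, consider the common Wendland C2 form with $q = r/h$ and support $0 \le q < 2$. This specific form and the symbol $\sigma$ for the normalization constant are assumptions here; the PR text does not show the formulas that were changed.

$$
\begin{aligned}
w(q) &= \sigma \left(1 - \tfrac{q}{2}\right)^4 (2q + 1), \\
\frac{\mathrm{d}w}{\mathrm{d}q}
  &= \sigma \left[ -2 \left(1 - \tfrac{q}{2}\right)^3 (2q + 1) + 2 \left(1 - \tfrac{q}{2}\right)^4 \right] \\
  &= \sigma \left(1 - \tfrac{q}{2}\right)^3 \left[ -4q - 2 + 2 - q \right] \\
  &= -5 \sigma \, q \left(1 - \tfrac{q}{2}\right)^3 .
\end{aligned}
$$

The reduced last line needs one cube and two multiplications, whereas the unreduced product-rule form evaluates two separate powers and a sum.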
/run-gpu-tests
This PR contains two small performance optimizations. The first is an algebraic simplification of the derivatives of the Wendland kernels and the normalization factors:
main:
With `result` simplified algebraically:
With `normalization_factor` simplified to `7 / (pi * h^2 * 4)`:
Interestingly, this difference is not measurable when benchmarking only `kernel_deriv`.
The second is a small optimization of the computation of `v_max` (apparently only relevant on the CPU).
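The `v_max` optimization can be sketched as follows. The velocity array `v` (one column per particle) and both expressions are hypothetical stand-ins, not the code from `shifting_techniques.jl`; the idea is that the square root is monotonic on non-negative numbers, so taking the maximum of the squared magnitudes first needs only a single `sqrt`.

```julia
# Hypothetical 2D velocities, one column per particle
v = [1.0 3.0 0.0;
     2.0 4.0 0.5]

# Baseline: one square root per particle, then the maximum
v_max_naive = maximum(col -> sqrt(sum(abs2, col)), eachcol(v))

# Optimized: maximum of squared magnitudes, then a single square root
v_max_fast = sqrt(maximum(col -> sum(abs2, col), eachcol(v)))

# Both agree because sqrt is monotonically increasing on [0, Inf)
@assert v_max_naive ≈ v_max_fast
```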