SME1 based direct kernel (with alpha and beta) for cblas_sgemm level 3 #5380
Thanks - this looks like an interesting addition to (if not a competitor of) the present sgemm_direct kernel. The CI error suggests that perhaps the entire kernel needs to be guarded with the __ARM_FEATURE_SME define (or the ...2VLx2VL function should have an empty alternative for when that feature macro is undefined)?
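For illustration, the "empty alternative" pattern suggested above could look roughly like the sketch below. The function name `sgemm_direct_sme_sketch`, its signature, and the macro `SGEMM_DIRECT_SME_AVAILABLE` are hypothetical and not taken from the PR; only the `__ARM_FEATURE_SME` feature macro comes from the discussion.

```c
#include <stddef.h>

/* Hypothetical kernel name and signature, for illustration only. */
#if defined(__ARM_FEATURE_SME)

/* Compiled only when the compiler targets SME. */
#define SGEMM_DIRECT_SME_AVAILABLE 1

void sgemm_direct_sme_sketch(size_t m, size_t n, size_t k,
                             const float *a, const float *b, float *c)
{
    /* The real SME implementation would go here; omitted in this sketch. */
    (void)m; (void)n; (void)k; (void)a; (void)b; (void)c;
}

#else

/* Feature macro undefined: provide an empty alternative so the symbol
 * still exists and callers continue to compile and link. */
#define SGEMM_DIRECT_SME_AVAILABLE 0

void sgemm_direct_sme_sketch(size_t m, size_t n, size_t k,
                             const float *a, const float *b, float *c)
{
    /* Intentionally does nothing. */
    (void)m; (void)n; (void)k; (void)a; (void)b; (void)c;
}

#endif
```

With this shape, a caller can test `SGEMM_DIRECT_SME_AVAILABLE` before choosing the direct path, and builds without SME support never see SME-specific code.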
38540ea to 442273d
Hi Martin. Thanks for your quick review; it's very helpful. I have updated the PR to address the two issues: added an empty alternative for when the feature macro is undefined, and modified the copyright statement.
442273d to 366deb1
Seems we now have AppleClang acting up over things in its own arm_sme header file.
366deb1 to 831c4e3
From the error log, I understand that the error comes from clang-15.0.0 not supporting the 'arm_streaming_' attributes. I tried to reproduce it locally but found that SME isn't supported in clang-15. As a mitigation, I am adding an extra guard alongside __ARM_FEATURE_SME, and I have pushed that update. If it doesn't work, I can explicitly check the clang version.
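As a rough sketch of the fallback mentioned above (explicitly checking the clang version), the guard could be combined with the feature macro as follows. The names `SME_SKETCH_MIN_CLANG_MAJOR`, `SME_SKETCH_USE_KERNEL` and `sme_sketch_kernel_enabled` are hypothetical, and the threshold of 16 is an assumption: the thread only establishes that clang-15.0.0 fails. AppleClang uses its own version numbering, so a real check would need to account for that.

```c
/* Assumed threshold: the discussion only shows that clang 15.0.0 fails,
 * so "16 or newer" is a placeholder, not a verified minimum. */
#define SME_SKETCH_MIN_CLANG_MAJOR 16

/* Enable the kernel only if the SME feature macro is defined and, when
 * building with clang, the major version reaches the assumed threshold.
 * Non-clang compilers are gated by the feature macro alone. */
#if defined(__ARM_FEATURE_SME) && \
    (!defined(__clang__) || __clang_major__ >= SME_SKETCH_MIN_CLANG_MAJOR)
#define SME_SKETCH_USE_KERNEL 1
#else
#define SME_SKETCH_USE_KERNEL 0
#endif

/* Small helper so the compile-time decision can be inspected at run time. */
int sme_sketch_kernel_enabled(void)
{
    return SME_SKETCH_USE_KERNEL;
}
```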
831c4e3 to 70ef30c
70ef30c to eae0abf
@martin-frbg
Both are unrelated - the LoongArch job ran out of time and the IBM-Z build on Jenkins failed to access GitHub (and still does today).
Thanks!
This PR adds an sgemm_direct kernel (with support for alpha and beta) based on the SME1 architecture.
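For readers unfamiliar with the alpha and beta parameters, the operation an sgemm kernel computes is C = alpha * A * B + beta * C. The naive reference below only illustrates that arithmetic; it is not the PR's SME kernel, and the function name `sgemm_reference_sketch` and the dense row-major layout are assumptions made for this example.

```c
#include <stddef.h>

/* Naive reference for C = alpha * A * B + beta * C.
 * Assumed layout: A is m x k, B is k x n, C is m x n, all row-major and
 * densely packed (no leading-dimension padding, no transposition). */
void sgemm_reference_sketch(size_t m, size_t n, size_t k, float alpha,
                            const float *a, const float *b,
                            float beta, float *c)
{
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            /* Dot product of row i of A with column j of B. */
            float acc = 0.0f;
            for (size_t p = 0; p < k; p++)
                acc += a[i * k + p] * b[p * n + j];

            /* When beta is zero, C is overwritten without being read. */
            if (beta == 0.0f)
                c[i * n + j] = alpha * acc;
            else
                c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
    }
}
```

A reference like this is mainly useful for checking an optimized kernel's output on small matrices.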