The Torch implementation of Flex Attention fails on BMG with batch size 16:
https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/17038905956/job/48297736887.
This PR skips running batch size 16 on BMG.
To keep tracking Flex Attention performance at a batch size greater than 1, this PR adds a run with batch size 4, which can be removed once batch size 16 is fixed.
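
For illustration only, a minimal sketch of what a device-dependent skip like this could look like. The helper names (`is_bmg`, `BATCH_SIZES`, `batch_sizes_to_run`) and the device-name check are assumptions for this example, not identifiers from the actual diff:

```python
import torch

# Batch sizes the benchmark would normally sweep (illustrative values).
BATCH_SIZES = [1, 4, 16]

def is_bmg() -> bool:
    # Assumption: a BMG (Battlemage) GPU can be detected from the XPU
    # device name; the exact check used in the repository may differ.
    if not torch.xpu.is_available():
        return False
    return "B580" in torch.xpu.get_device_name()

def batch_sizes_to_run() -> list[int]:
    # Skip batch size 16 on BMG, where the Torch Flex Attention path
    # currently fails, but keep batch size 4 so multi-batch performance
    # is still tracked until the batch size 16 failure is fixed.
    if is_bmg():
        return [bs for bs in BATCH_SIZES if bs != 16]
    return BATCH_SIZES
```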
Signed-off-by: Whitney Tsang <whitney.tsang@intel.com>