
v0.4.3

github-actions released this 23 Jul 19:26 · 3 commits to main since this release · c146374

AcceleratedKernels v0.4.3

Diff since v0.4.2

  • Made ScanPrefixes the default algorithm for accumulate / cumsum / cumprod. It is almost always faster than DecoupledLookback on real-world data and does not depend on cross-block communication, even though DecoupledLookback has better asymptotic scalability in theory.
  • Prepared AcceleratedKernels for the upcoming switch of the default KernelAbstractions CPU backend to PoCL; the Threads-based algorithms will remain the defaults until the PoCL-based ones become faster.
  • A lot of housekeeping.
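
For readers unfamiliar with the two scan strategies, here is a rough conceptual sketch of a block-wise prefix scan in plain Python. It assumes ScanPrefixes follows the conventional three-phase pattern (scan each block, scan the block totals, then apply the totals); that is an assumption, not a description of the AcceleratedKernels source. The `blocked_scan` function and its `block_size` parameter are hypothetical names used only for illustration.

```python
from itertools import accumulate
from operator import add, mul


def blocked_scan(values, op=add, block_size=4):
    """Inclusive scan computed block by block in three phases.

    Illustrative only: this is NOT the AcceleratedKernels implementation,
    just the general scan-then-propagate idea on the CPU.
    """
    if block_size < 1:
        raise ValueError("block_size must be at least 1")

    # Split the input into contiguous blocks (the last one may be shorter).
    blocks = [values[i:i + block_size] for i in range(0, len(values), block_size)]

    # Phase 1: scan every block independently; no block needs another's data.
    local = [list(accumulate(block, op)) for block in blocks]

    # Phase 2: scan the per-block totals to get the prefix ending at each block.
    totals = list(accumulate((scanned[-1] for scanned in local), op))

    # Phase 3: combine each block after the first with the prefix of all
    # blocks before it. The prefix goes on the left so that associative but
    # non-commutative operators keep their order.
    out = list(local[0]) if local else []
    for prefix, scanned in zip(totals, local[1:]):
        out.extend(op(prefix, x) for x in scanned)
    return out


data = list(range(1, 11))
cumsum = blocked_scan(data)             # matches list(accumulate(data))
cumprod = blocked_scan(data, op=mul)    # matches list(accumulate(data, mul))
```

Each phase only needs results from the phase before it, so on a GPU the phases could run as separate passes without blocks waiting on one another mid-kernel. A decoupled look-back scan instead does the work in a single pass, with each block reading the published state of the blocks before it, which is the cross-block communication mentioned above.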

Merged pull requests:

Closed issues:

  • Port over GPUArrays neutral_element fixes (#51)