Commit 89425e0

Merge pull request #134 from GianlucaFuwa/master
Fix `threadlocal` type-inferrability and add `reduction` functionality
2 parents 6b1eaeb + ab511e8 commit 89425e0

5 files changed: +494 -110 lines changed


README.md

Lines changed: 19 additions & 2 deletions
@@ -197,8 +197,6 @@ BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 
 ### Local per-thread storage (`threadlocal`)
 
-**Warning: this feature is likely broken!**
-
 You also can define local storage for each thread, providing a vector containing each of the local storages at the end.
 
 ```julia
@@ -234,6 +232,25 @@ julia> let
 Float16[83.0, 90.0, 27.0, 65.0]
 ```
 
+### `reduction`
+The `reduction` keyword enables reduction of an already initialized `isbits` variable with certain supported associative operations (see [docs](https://JuliaSIMD.github.io/Polyester.jl/stable)), such that the transition from serialized code is as simple as adding the `@batch` macro. Contrary to `threadlocal` this does not incur any additional allocations
+
+```julia
+julia> function batch_reduction()
+           y1 = 0
+           y2 = 1
+           @batch reduction=((+, y1), (*, y2)) for i in 1:9
+               y1 += i
+               y2 *= i
+           end
+           y1, y2
+       end
+julia> batch_reduction()
+(45, 362880)
+julia> @allocated batch_reduction()
+0
+```
+
 ## Disabling Polyester threads
 
 When running many repetitions of a Polyester-multithreaded function (e.g. in an embarrassingly parallel problem that repeatedly executes a small already Polyester-multithreaded function), it can be beneficial to disable Polyester (the inner multithreaded loop) and multithread only at the outer level (e.g. with `Base.Threads`). This can be done with the `disable_polyester_threads` context manager. In the expandable section below you can see examples with benchmarks.

src/Polyester.jl

Lines changed: 10 additions & 6 deletions
@@ -6,6 +6,7 @@ end
 using ThreadingUtilities
 import StaticArrayInterface
 const ArrayInterface = StaticArrayInterface
+using Base.Cartesian: @nexprs
 using StaticArrayInterface: static_length, static_step, static_first, static_size
 using StrideArraysCore: object_and_preserve
 using ManualMemory: Reference
@@ -23,6 +24,15 @@ using CPUSummary: num_cores
 
 export batch, @batch, disable_polyester_threads
 
+const SUPPORTED_REDUCE_OPS = (:+, :*, :min, :max, :&, :|)
+initializer(::typeof(+), ::T) where {T} = zero(T)
+initializer(::typeof(+), ::Bool) = zero(Int)
+initializer(::typeof(*), ::T) where {T} = one(T)
+initializer(::typeof(min), ::T) where {T} = typemax(T)
+initializer(::typeof(max), ::T) where {T} = typemin(T)
+initializer(::typeof(&), ::Bool) = true
+initializer(::typeof(|), ::Bool) = false
+
 include("batch.jl")
 include("closure.jl")
 
@@ -38,10 +48,4 @@ function reset_threads!()
   foreach(ThreadingUtilities.checktask, eachindex(ThreadingUtilities.TASKS))
   return nothing
 end
-
-# y = rand(1)
-# x = rand(1)
-# @batch for i ∈ eachindex(y,x)
-# y[i] = sin(x[i])
-# end
 end
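
Note on the `initializer` methods added in `src/Polyester.jl`: each one returns the identity element of its operator (`zero` for `+`, `one` for `*`, `typemax` for `min`, `typemin` for `max`), presumably so that every thread can start from a private accumulator that does not disturb the final result. The sketch below illustrates that idea with `Base.Threads` only. It is not Polyester's implementation; `neutral` and `chunked_reduce` are hypothetical names introduced for illustration.

```julia
# Hypothetical stand-in for the `initializer` methods in the diff:
# returns the identity element of `op` for the type of the given value.
neutral(::typeof(+), ::T) where {T} = zero(T)
neutral(::typeof(*), ::T) where {T} = one(T)
neutral(::typeof(min), ::T) where {T} = typemax(T)
neutral(::typeof(max), ::T) where {T} = typemin(T)

# Reduce `f(i)` over `range` with `op`, starting from the caller's value `init`.
# Assumption: `op` is associative, so the chunk results can be combined afterwards.
function chunked_reduce(op, f, init, range; nchunks = Threads.nthreads())
    chunksize = max(1, cld(length(range), nchunks))
    tasks = map(Iterators.partition(range, chunksize)) do chunk
        Threads.@spawn begin
            acc = neutral(op, init)   # private accumulator starts at the identity
            for i in chunk
                acc = op(acc, f(i))
            end
            acc
        end
    end
    result = init                     # the caller's initial value is folded in once
    for t in tasks
        result = op(result, fetch(t))
    end
    return result
end

chunked_reduce(+, identity, 0, 1:9)   # 45, matching y1 in the README example
chunked_reduce(*, identity, 1, 1:9)   # 362880, matching y2 in the README example
```

Starting each chunk from the identity rather than from `init` avoids counting the caller's initial value once per chunk, which is the property the `initializer` table appears designed to provide.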
