dcmvdbekerom

This PR implements indirect dispatch for the Vulkan backend, which allows the number of batches to be updated dynamically.

It works through four user-settable variables in the configuration struct:

  • uint indirectDispatch: 0 for direct dispatch, 1 for indirect forward dispatch, 2 for indirect inverse dispatch, 3 for indirect dispatch in both directions.
  • VkBuffer indirectBuffer: Vulkan buffer that stores the indirect workgroup sizes.
  • uint indirectBufferOffset: offset in the indirectBuffer where the workgroup sizes are found. This can be useful if multiple VkFFT apps use the same indirectBuffer.
  • uint* indirectHostPointer: pointer to a uint array holding a host copy of the workgroup sizes. This should be a Vulkan-initialized staging buffer. VkFFT fills it with the required sizes during the dispatch phase. The user can then update the workgroup sizes and transfer the staging buffer to the indirectBuffer. The user is responsible for doing all of this whenever the workgroup size needs to be updated.
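The four mode values listed above read naturally as two bit flags (1 for forward, 2 for inverse, 3 for both). The following is a minimal sketch of that reading in plain C. The enum and function names are hypothetical and not part of VkFFT, and whether VkFFT itself tests the value bitwise is an assumption.

```c
#include <stdint.h>

/* Hypothetical names for the four documented values of indirectDispatch. */
enum {
    DISPATCH_DIRECT           = 0u, /* direct dispatch */
    DISPATCH_INDIRECT_FORWARD = 1u, /* indirect dispatch for the forward transform */
    DISPATCH_INDIRECT_INVERSE = 2u, /* indirect dispatch for the inverse transform */
    DISPATCH_INDIRECT_BOTH    = 3u  /* forward | inverse */
};

/* Returns 1 if the given mode selects indirect dispatch for the requested
 * direction (inverse == 0: forward, inverse != 0: inverse), otherwise 0.
 * Assumption: the mode is interpreted as a pair of bit flags. */
static int uses_indirect_dispatch(uint32_t mode, int inverse)
{
    uint32_t bit = inverse ? DISPATCH_INDIRECT_INVERSE : DISPATCH_INDIRECT_FORWARD;
    return (mode & bit) != 0u;
}
```

Under this reading, a mode of 3 is simply the bitwise OR of modes 1 and 2.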

The structure of the indirectBuffer is (x_size, y_size, z_size, id) for every dispatch. The number of dispatches per FFT is typically between 1 and 4. id is a uint that identifies the axis that needs to be updated: 0 for x, 1 for y, 2 for z. It is set by VkFFT during the dispatch phase and can be read by the user to determine which workgroup size needs to be modified.
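To illustrate the layout, here is a minimal host-side sketch in plain C of how a user might update the staging copy. The struct and function names are hypothetical and not part of VkFFT. It assumes the records are tightly packed as four 32-bit unsigned integers and that every dispatch receives the same new workgroup count; the PR does not specify either, nor how a batch count maps to a workgroup count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One record per dispatch: (x_size, y_size, z_size, id). Hypothetical type. */
typedef struct {
    uint32_t size[3]; /* workgroup counts for x, y, z */
    uint32_t id;      /* axis to update: 0 = x, 1 = y, 2 = z (written by VkFFT) */
} IndirectEntry;

/* Overwrite the workgroup count on the axis named by each record's id.
 * Records whose id is not 0, 1, or 2 are left untouched.
 * Returns the number of records that were updated. */
static size_t set_indirect_workgroups(IndirectEntry *entries,
                                      size_t num_dispatches,
                                      uint32_t new_count)
{
    assert(entries != NULL || num_dispatches == 0);
    size_t updated = 0;
    for (size_t i = 0; i < num_dispatches; ++i) {
        if (entries[i].id > 2u) {
            continue; /* unknown axis id: skip rather than write out of bounds */
        }
        entries[i].size[entries[i].id] = new_count;
        ++updated;
    }
    return updated;
}
```

In use, entries would be the memory behind indirectHostPointer cast to this record type. After the update, the staging buffer still has to be copied into the indirectBuffer (for example with vkCmdCopyBuffer) before the next dispatch, as described above.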

A few points worthy of note:

  • I removed the explicit size of the input/output buffers from the GLSL shader code. Vulkan allows this, and it makes it possible to reuse the shaders for larger sizes. Please check whether you are okay with this.
  • I only tested R2C and C2R transforms for easy sizes (products of 2, 3, 5, 7, and 11). It will probably not work for Bluestein and DCT in its current form.
  • I call VkFFT from Python, so I have not included a C++ demo, but I could add one if required.
  • I have not merged the development branch into this PR yet because it caused some issues for me. I will do this once I have figured out the problem.
