Open
Labels
enhancement (New feature or request)
Description
Describe the feature request
Since quantization techniques are orthogonal to sparsity, we should be able to leverage the benefits of both and stack them together.
Describe the solution you'd like
We have similar dtype templates in CUDA, which we need to replicate for the CPU and for instruction sets like AVX:
template <>
__global__ void sparse_mlp_combined_cuda_kernel<float>(...)
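As a rough starting point, a CPU counterpart could follow the same pattern: one function template with an explicit specialization per dtype. The sketch below is only an illustration under assumptions. `dot_row` and `sparse_matvec` are hypothetical names, the per-tensor `scale` for int8 is an assumed quantization scheme, and the real argument list of `sparse_mlp_combined_cuda_kernel` is not reproduced here. The float specialization uses AVX intrinsics when the compiler defines `__AVX__` and falls back to a scalar loop otherwise.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Hypothetical per-dtype primitive: dot product of one weight row with x.
// `scale` is an assumed per-tensor dequantization factor (unused for float).
template <typename T>
float dot_row(const T* w, const float* x, std::size_t n, float scale);

// float specialization, vectorized with AVX when available.
template <>
float dot_row<float>(const float* w, const float* x, std::size_t n, float /*scale*/) {
    std::size_t i = 0;
    float acc = 0.0f;
#if defined(__AVX__)
    __m256 vacc = _mm256_setzero_ps();
    for (; i + 8 <= n; i += 8) {
        __m256 vw = _mm256_loadu_ps(w + i);
        __m256 vx = _mm256_loadu_ps(x + i);
        vacc = _mm256_add_ps(vacc, _mm256_mul_ps(vw, vx));
    }
    float lanes[8];
    _mm256_storeu_ps(lanes, vacc);
    for (float v : lanes) acc += v;  // horizontal sum of the 8 lanes
#endif
    for (; i < n; ++i) acc += w[i] * x[i];  // scalar tail (or full scalar path)
    return acc;
}

// int8 specialization: accumulate in float, then apply the scale once.
template <>
float dot_row<std::int8_t>(const std::int8_t* w, const float* x, std::size_t n, float scale) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i) acc += static_cast<float>(w[i]) * x[i];
    return acc * scale;
}

// Sparsity and quantization stacked: only the rows listed in `active_rows`
// are computed, and the dtype-specific work is delegated to dot_row<T>.
// `weights` is row-major with `cols` entries per row.
template <typename T>
std::vector<float> sparse_matvec(const std::vector<T>& weights, std::size_t cols,
                                 const std::vector<std::size_t>& active_rows,
                                 const std::vector<float>& x, float scale = 1.0f) {
    std::vector<float> out(active_rows.size());
    for (std::size_t k = 0; k < active_rows.size(); ++k) {
        out[k] = dot_row<T>(weights.data() + active_rows[k] * cols, x.data(), cols, scale);
    }
    return out;
}
```

Keeping the sparsity logic in `sparse_matvec` and the dtype logic in `dot_row` means a new dtype or instruction set would only need another specialization, which is one way to reflect that the two techniques are orthogonal.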
Status
Planning