Making the PowerInfer kernels compatible with the sparse weight cache would let all the models in sparse transformers support lazy weight loading, and would give skipMLP faster inference kernels.