Pull requests: pytorch/FBGEMM
Pull requests list
Add NEON implementation of FloatOrHalfToFusedNBitRowwiseQuantizedSBHalf
cla signed
fb-exported
meta-exported
#5115
opened Nov 11, 2025 by
Nicoshev
Add support of 64 headDim
cla signed
fb-exported
meta-exported
#5114
opened Nov 11, 2025 by
Aya-ZIbra
Add CUDAGuard to ensure correct device
cla signed
fb-exported
meta-exported
#5113
opened Nov 11, 2025 by
cthi
Support fp16 for cutlass grouped GEMM
cla signed
fb-exported
meta-exported
#5111
opened Nov 11, 2025 by
jianyuh
accelerate permute_1D_data_kernel
cla signed
fb-exported
meta-exported
#5110
opened Nov 10, 2025 by
royren622
Fix uncoalesced global memory access in decode attention bf16 kernel
cla signed
fb-exported
meta-exported
#5109
opened Nov 10, 2025 by
Alkaid-Benetnash
Fix cutlass_blackwell_fmha_custom_op and add comprehensive FMHA tests
cla signed
fb-exported
meta-exported
#5108
opened Nov 10, 2025 by
jsisometa
Add get_unique_indices on CPU
cla signed
fb-exported
meta-exported
#5096
opened Nov 6, 2025 by
gchalump
Fix NAN for the prediction (#2096)
cla signed
fb-exported
meta-exported
#5088
opened Nov 4, 2025 by
quhang
Several kernel optimization from aiter team
cla signed
module: rocm
#5074
opened Oct 31, 2025 by
Bernard-Liu
Draft
log all table names in TBE
cla signed
fb-exported
meta-exported
#5071
opened Oct 30, 2025 by
ashuaibi7
Efficient Causal/Local scheduling
cla signed
fb-exported
meta-exported
#5066
opened Oct 28, 2025 by
Aya-ZIbra
Adding Kineto support to bench:sparse_ops
cla signed
fb-exported
meta-exported
#5060
opened Oct 27, 2025 by
gchalump
Abstract out sharable interface of Dram KV wrapper
cla signed
fb-exported
meta-exported
#5057
opened Oct 27, 2025 by
tomlintbl
Changing Backend Tensor initialization
cla signed
fb-exported
meta-exported
#5056
opened Oct 25, 2025 by
Raahul46
Changing Backend Tensor initialization
cla signed
fb-exported
meta-exported
#5055
opened Oct 25, 2025 by
Raahul46
Update embedding_forward_quantized_cpu_template.cpp to use initialized output memory instead of uninitialized
cla signed
fb-exported
meta-exported
#5054
opened Oct 25, 2025 by
quhang
use reorder_batched_ad_indices_vec kernel for rebatching
cla signed
fb-exported
meta-exported
#5049
opened Oct 24, 2025 by
royren622
Remove AVX compilation on aarch64 builds
cla signed
fb-exported
meta-exported
#5045
opened Oct 23, 2025 by
Nicoshev
fix potential overflow will multiple attempts (#2054)
cla signed
fb-exported
meta-exported
#5044
opened Oct 23, 2025 by
emlin
PyTorch version compatibility check in _load_library
cla signed
fb-exported
meta-exported
#5041
opened Oct 21, 2025 by
gchalump
Fix build CI bug for release
cla signed
fb-exported
meta-exported
#5035
opened Oct 21, 2025 by
spcyppt
report scuba events for detailed sparse static memory info
cla signed
fb-exported
meta-exported
#5029
opened Oct 20, 2025 by
ashuaibi7