Fix NCCL broadcast error on CPU tensors in distributed inference #257
This PR fixes a runtime error in distributed inference with the NCCL backend:
`RuntimeError: No backend type associated with device type cpu`
Root Cause:
When using the NCCL backend, collective operations require CUDA tensors. The code called `dist.broadcast(length_tensor, src=0)` while `length_tensor` was still on the CPU, which raised the runtime error above on non-zero ranks.
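For context, a minimal sketch of the failing pattern (the process group is assumed to be initialized with `backend="nccl"`; names are illustrative and not taken from the repository):

```python
import torch
import torch.distributed as dist

# Sketch of the failing pattern: the metadata tensor lives on the CPU,
# but NCCL collectives can only operate on CUDA tensors.
length_tensor = torch.zeros(1, dtype=torch.long)  # allocated on the CPU
# Raises "RuntimeError: No backend type associated with device type cpu"
# on non-zero ranks when the default process group uses NCCL.
dist.broadcast(length_tensor, src=0)
```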
Fix:
Before broadcasting, the small metadata tensor is moved to the local CUDA device when `dist.get_backend() == "nccl"`. After the broadcast, it is moved back to the CPU to extract the Python integer.
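A minimal sketch of the approach (the helper name and signature are illustrative, not the exact code in this PR):

```python
import torch
import torch.distributed as dist

def broadcast_length(length_tensor: torch.Tensor, src: int = 0) -> int:
    """Broadcast a small CPU metadata tensor, routing it through the GPU for NCCL."""
    if dist.get_backend() == "nccl":
        # NCCL collectives require CUDA tensors, so stage the metadata
        # tensor on this rank's local GPU before the broadcast.
        length_tensor = length_tensor.to(torch.device("cuda", torch.cuda.current_device()))
    dist.broadcast(length_tensor, src=src)
    # Move back to the CPU to extract the Python integer.
    return int(length_tensor.cpu().item())
```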
Testing:
I do not have access to a Linux multi-GPU setup, so I could not reproduce the original crash. Since issue #252 provides reproduction steps, I'd appreciate it if maintainers or contributors could verify the fix in that environment.
Notes:
The device move applies only when the backend is NCCL and only to the small metadata tensor, so NCCL performance is preserved and other backends keep the existing CPU path.
Fixes #252.