Device Mismatch Error with cpu and cuda

Hi, thank you for this repository! I've been trying to use BenchMARL with Melting Pot, but I hit a PyTorch device mismatch error whenever I run my experiment. I'm using the configuration from #78, where the algorithm is IPPO, the train_device is "cuda", and the sampling_device is "cpu". The stack trace is below:

```
Traceback (most recent call last):
  File "/home/gridsan/rfan/Melting-Pot-MARL/melting_pot_run.py", line 27, in hydra_experiment
    experiment.run()
  File "/home/gridsan/rfan/.local/lib/python3.10/site-packages/benchmarl/experiment/experiment.py", line 649, in run
    raise err
  File "/home/gridsan/rfan/.local/lib/python3.10/site-packages/benchmarl/experiment/experiment.py", line 641, in run
    self._collection_loop()
  File "/home/gridsan/rfan/.local/lib/python3.10/site-packages/benchmarl/experiment/experiment.py", line 718, in _collection_loop
    group_batch = self.algorithm.process_batch(group, group_batch)
  File "/home/gridsan/rfan/.local/lib/python3.10/site-packages/benchmarl/algorithms/ippo.py", line 246, in process_batch
    loss.value_estimator(
  File "/state/partition1/llgrid/pkg/anaconda/python-ML-2025a/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/state/partition1/llgrid/pkg/anaconda/python-ML-2025a/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/gridsan/rfan/.local/lib/python3.10/site-packages/torchrl/objectives/value/advantages.py", line 79, in new_func
    return fun(self, *args, **kwargs)
  File "/home/gridsan/rfan/.local/lib/python3.10/site-packages/torchrl/objectives/value/advantages.py", line 68, in new_fun
    return fun(self, *args, **kwargs)
  File "/home/gridsan/rfan/.local/lib/python3.10/site-packages/tensordict/nn/common.py", line 328, in wrapper
    return func(_self, tensordict, *args, **kwargs)
  File "/home/gridsan/rfan/.local/lib/python3.10/site-packages/torchrl/objectives/value/advantages.py", line 1468, in forward
    value, next_value = self._call_value_nets(
  File "/home/gridsan/rfan/.local/lib/python3.10/site-packages/torchrl/objectives/value/advantages.py", line 527, in _call_value_nets
    data_out = _vmap_func(
  File "/state/partition1/llgrid/pkg/anaconda/python-ML-2025a/lib/python3.10/site-packages/torch/_functorch/apis.py", line 203, in wrapped
    return vmap_impl(
  File "/state/partition1/llgrid/pkg/anaconda/python-ML-2025a/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 331, in vmap_impl
    return _flat_vmap(
  File "/state/partition1/llgrid/pkg/anaconda/python-ML-2025a/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 479, in _flat_vmap
    batched_outputs = func(*batched_inputs, **kwargs)
  File "/home/gridsan/rfan/.local/lib/python3.10/site-packages/torchrl/objectives/utils.py", line 539, in decorated_module
    return module(*module_args)
  File "/state/partition1/llgrid/pkg/anaconda/python-ML-2025a/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/state/partition1/llgrid/pkg/anaconda/python-ML-2025a/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/gridsan/rfan/.local/lib/python3.10/site-packages/benchmarl/models/common.py", line 161, in forward
    tensordict = self._forward(tensordict)
  File "/home/gridsan/rfan/.local/lib/python3.10/site-packages/benchmarl/models/cnn.py", line 281, in _forward
    cnn_out = self.cnn.forward(input)
  File "/home/gridsan/rfan/.local/lib/python3.10/site-packages/torchrl/modules/models/multiagent.py", line 153, in forward
    output = self._empty_net(inputs)
  File "/state/partition1/llgrid/pkg/anaconda/python-ML-2025a/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/state/partition1/llgrid/pkg/anaconda/python-ML-2025a/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/gridsan/rfan/.local/lib/python3.10/site-packages/torchrl/modules/models/models.py", line 542, in forward
    out = super().forward(inputs)
  File "/state/partition1/llgrid/pkg/anaconda/python-ML-2025a/lib/python3.10/site-packages/torch/nn/modules/container.py", line 250, in forward
    input = module(input)
  File "/state/partition1/llgrid/pkg/anaconda/python-ML-2025a/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/state/partition1/llgrid/pkg/anaconda/python-ML-2025a/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/state/partition1/llgrid/pkg/anaconda/python-ML-2025a/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 554, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/state/partition1/llgrid/pkg/anaconda/python-ML-2025a/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 549, in _conv_forward
    return F.conv2d(
RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor
```

I'm running Python 3.10.14, PyTorch 2.5.1, and BenchMARL 1.5.0.
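
For reference, the last line of the trace says the convolution received a CPU input (`torch.FloatTensor`) while its weights are on the GPU (`torch.cuda.FloatTensor`). The sketch below reproduces that class of error in plain PyTorch, outside BenchMARL, and shows one way to avoid it by moving the input to the module's device first. The helper names `module_device` and `forward_on_module_device`, the layer sizes, and the batch shape are hypothetical and chosen only for illustration; this is not BenchMARL's or TorchRL's code, and it may not be where the actual fix belongs.

```python
import torch
from torch import nn


def module_device(module: nn.Module) -> torch.device:
    """Return the device holding the module's first parameter (hypothetical helper)."""
    return next(module.parameters()).device


def forward_on_module_device(module: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Move x to the module's device, then run the forward pass (hypothetical helper)."""
    return module(x.to(module_device(module)))


# Mirror the setup from the issue: train on CUDA when present, sample on CPU.
# Falls back to CPU so the sketch still runs on machines without a GPU.
train_device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
conv = nn.Conv2d(3, 8, kernel_size=3).to(train_device)

# A batch that was collected on the sampling device (CPU).
batch = torch.zeros(4, 3, 16, 16)

if train_device.type == "cuda":
    try:
        conv(batch)  # CPU input, CUDA weights: raises the RuntimeError from the trace
    except RuntimeError as err:
        print("device mismatch:", err)

# Moving the input to the module's device avoids the mismatch.
out = forward_on_module_device(conv, batch)
print(out.shape, out.device)
```

On a machine without a GPU both devices are CPU, so the mismatch branch is skipped and only the final forward pass runs.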

Device Mismatch Error with cpu and cuda #213

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Device Mismatch Error with cpu and cuda #213

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions