Data Ordering Consistency in MASAC Algorithm - Critical for Custom Critic Implementation #222

@Safari-1999

I'm using BenchMARL for my project with a custom environment and custom actor/critic models. My setup uses the MASAC algorithm with a continuous, centralized environment containing 3 agents in a single group, with share_param_critic=False (each agent has its own critic).
In my critic design, each critic needs to see all agents' states and actions. Each critic has 3 input heads:

Head 1: Current agent's state + action
Head 2: Other agent's state + action
Head 3: Other agent's state + action
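The head layout above can be sketched as an index permutation along the agent dimension. This is only an illustration in plain PyTorch, assuming the global tensor has shape [*batch, n_agents, features] and arrives in the order [agent0, agent1, agent2]; the helper name reorder_for_agent is hypothetical and is not part of BenchMARL or TorchRL.

```python
import torch


def reorder_for_agent(x: torch.Tensor, agent_idx: int) -> torch.Tensor:
    """Move agent `agent_idx` to slot 0 along the agent dimension (dim -2).

    Assumes x has shape [*batch, n_agents, features] (an assumption, not a
    confirmed BenchMARL layout). The other agents keep their relative order.
    """
    n_agents = x.shape[-2]
    order = [agent_idx] + [j for j in range(n_agents) if j != agent_idx]
    return x[..., order, :]


# Toy data: batch of 2, 3 agents, 1 feature; each value encodes the agent index.
obs = torch.tensor([[[0.0], [1.0], [2.0]],
                    [[0.0], [1.0], [2.0]]])

# Inputs for the critic of agent 1: slot 0 is agent 1, then agents 0 and 2.
heads_for_agent_1 = reorder_for_agent(obs, agent_idx=1)

# One tensor per head, each of shape [batch, features].
head_1, head_2, head_3 = heads_for_agent_1.unbind(dim=-2)
```

The same permutation would have to be applied to both the observations and the actions so that each head receives a matching state/action pair.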

For this to work correctly, I need to know the exact ordering of the data passed to the critic's forward function.
I receive global_action through the keys sent by MASAC and construct global_state from the observations (following the MPE example pattern).
My critical question: does the data ordering remain consistent throughout the algorithm pipeline (algorithm → buffer → loss computation in TorchRL)?
Expected ordering:

[act0, act1, act2] for actions
[obs0, obs1, obs2] for observations

I attempted to test this ordering by creating a fixed-action actor and fixed-state task, but couldn't reach a definitive answer. Since this is critical for my project's correctness, I decided to ask the team directly.
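One way such a test could be made conclusive is to tag every agent's data with its own index and assert on that tag inside the critic. The sketch below is plain PyTorch and only simulates buffer sampling by shuffling the batch dimension; it does not show what BenchMARL or TorchRL actually do. The shape [batch, n_agents, features] and the helper names tag_with_agent_index and agent_order_is_intact are assumptions made for illustration.

```python
import torch

N_AGENTS = 3


def tag_with_agent_index(x: torch.Tensor) -> torch.Tensor:
    """Append the agent index as an extra trailing feature.

    Assumes x has shape [*batch, n_agents, features].
    """
    n_agents = x.shape[-2]
    ids = torch.arange(n_agents, dtype=x.dtype, device=x.device)
    ids = ids.view(n_agents, 1).expand(*x.shape[:-1], 1)
    return torch.cat([x, ids], dim=-1)


def agent_order_is_intact(x: torch.Tensor) -> bool:
    """Return True if the tag stored in agent slot i equals i everywhere."""
    n_agents = x.shape[-2]
    expected = torch.arange(n_agents, dtype=x.dtype, device=x.device)
    return bool((x[..., -1] == expected).all())


torch.manual_seed(0)
obs = tag_with_agent_index(torch.randn(8, N_AGENTS, 4))  # shape [8, 3, 5]

# Stand-in for replay-buffer sampling: shuffles the batch, not the agents.
sampled = obs[torch.randperm(obs.shape[0])]
order_ok = agent_order_is_intact(sampled)

# A deliberate swap of agents 0 and 1 should be caught by the check.
swapped = sampled[:, [1, 0, 2], :]
swap_detected = not agent_order_is_intact(swapped)
```

In a real run, the tag would come from the custom environment (for example, as one extra observation feature per agent), and agent_order_is_intact would be called at the top of the critic's forward on whatever tensor arrives there, during both collection and loss computation.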

Environment Details

Algorithm: MASAC
Environment: Custom continuous, centralized
Agents: 3 agents, single group
Critic: Individual critics (share_param_critic=False)
Architecture: Each critic processes global state + global actions

Specific Request
Can you confirm that the data ordering [agent0, agent1, agent2] remains consistent throughout the entire training pipeline, or does it change at any point during algorithm execution, buffer storage, or loss computation?
Thank you for your attention and cooperation.