Multi-agent environment with early agents termination #204

@ColdFrenzy

Hi, I'm working on a simple multi-agent navigation task in which each agent needs to reach a goal position.
Some context:
When an agent reaches its goal, its done flag becomes True. From then on, whenever actions are passed to the environment, I skip that agent and return a special observation filled with -1 and a reward of 0 for it.
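
To illustrate the per-agent handling described above, here is a minimal sketch in plain Python. It is not my actual environment code: `step_agents`, `step_fn`, and `obs_dim` are hypothetical names used only for illustration, and observations are plain lists rather than tensors.

```python
from typing import Callable, Dict, List, Tuple

Obs = List[float]


def step_agents(
    actions: Dict[str, int],
    done: Dict[str, bool],
    obs_dim: int,
    step_fn: Callable[[str, int], Tuple[Obs, float]],
) -> Tuple[Dict[str, Obs], Dict[str, float]]:
    """Step only the agents that are still active.

    Agents whose done flag is already True are skipped: their action is
    ignored and they receive a placeholder observation of all -1 with a
    reward of 0. `step_fn` is a hypothetical per-agent transition function.
    """
    obs: Dict[str, Obs] = {}
    rewards: Dict[str, float] = {}
    for agent, action in actions.items():
        if done[agent]:
            # Finished agent: placeholder observation and zero reward.
            obs[agent] = [-1.0] * obs_dim
            rewards[agent] = 0.0
        else:
            obs[agent], rewards[agent] = step_fn(agent, action)
    return obs, rewards
```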
I've noticed that the tensordict generated by a rollout contains the keys {a1, a2, ..., done, termination, truncation}.
Since I'm not setting the global done, termination, and truncation keys manually, I was wondering how they are set. In my case, I would like the global done to be True only when all the agents have reached the goal, i.e., {a1: {done == True}, a2: {done == True}, ...}.
Is this already the case?
If not, is it possible to change it? I would like the episode to terminate only when all the agents have reached the goal or when max_timestep is reached.
I'm asking because my metrics show that agents do reach the goal individually, but not once were they all at the goal at the same time.
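
To make the behaviour I'm after concrete, here is a minimal sketch in plain Python. It only illustrates the aggregation rule I would like for the global flags; `aggregate_done` and its arguments are hypothetical names, not part of the library's API, and I don't know whether the library computes the global keys this way.

```python
from typing import Dict


def aggregate_done(
    agent_done: Dict[str, bool],
    step_count: int,
    max_timestep: int,
) -> Dict[str, bool]:
    """Combine per-agent done flags into global episode flags.

    Hypothetical helper: the episode terminates only when every agent is
    done, and is truncated when the step limit is hit before that happens.
    """
    # Guard against an empty dict, for which all() would return True.
    terminated = bool(agent_done) and all(agent_done.values())
    truncated = (not terminated) and step_count >= max_timestep
    return {
        "terminated": terminated,
        "truncated": truncated,
        "done": terminated or truncated,
    }
```

With this rule, a single agent reaching its goal leaves the global done False, so the remaining agents keep acting until they also finish or the step limit is reached. If the global done were instead computed with `any` over the per-agent flags, the episode would end as soon as the first agent arrived, which would be consistent with what I see in my metrics.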
