Multi-agent environment with early agents termination

Hi, I'm trying to work on a simple multi-agent navigation task where agents need to reach a goal position.
I'm giving a little bit of context: 
Whenever an agent reaches the desired goal, its done flag becomes true, and whenever an action is passed, I just skip this particular agent and I return a special observation with all -1 and a 0 reward.
I've noticed that in the tensordict generated by a rollout, there are multiple keys {a1,a2,..., done, termination, truncation}. 
Since I'm not setting these global done, termination, and truncation keys manually, I was wondering how they are set. In my case, I would like done to be true only when all the agents reach the goal, i.e., {a1: {done==True},a2: {done==True} ...}
Is this already the case? 
If not, is it possible to change it? I would like the episode to terminate only when all the agents reach the goal or when the max_timestep is reached. 
I'm asking this because I saw that agents are able to reach the goal individually (by looking at my metrics), but they were not even once able to reach the goal at the same time


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Multi-agent environment with early agents termination #204

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Multi-agent environment with early agents termination #204

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions