
[Feature] Differentiable VMAS #80

Merged
matteobettini merged 7 commits into main from differentiable on Feb 6, 2024

Conversation

@matteobettini (Member) commented Feb 5, 2024

VMAS is now fully differentiable, and you can backpropagate through any of its scenarios.

To enable this, set grad_enabled=True at environment construction.

You can then do stuff like:

for step in steps:
    actions = []
    for agent in agents:
        action = ....
        action.requires_grad_(True)
        if step == 0:
            first_action = action
        actions.append(action)
    obs, rews, dones, info = env.step(actions)

loss = obs[-1].mean() + rews[-1].mean()
(grad,) = torch.autograd.grad(loss, first_action)  # grad() returns a tuple

This backpropagates a loss computed from observations and rewards through time, back to the input action at the first timestep.
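The snippet above elides the action computation and the VMAS environment itself, so here is a runnable sketch of the same backprop-through-time pattern using a hypothetical toy stand-in: toy_step, the goal at 1.0, and the constant 0.1 actions are all assumptions for illustration, not the VMAS API.

```python
import torch

# Hypothetical stand-in for one differentiable environment step (NOT the
# real VMAS API): the shared state advances by the sum of agent actions,
# the observation is the state itself, and the reward penalises squared
# distance to a goal at 1.0.
def toy_step(state, actions):
    new_state = state + torch.stack(actions).sum(0)
    obs = new_state
    rew = -(new_state - 1.0) ** 2
    return new_state, obs, rew

n_steps, n_agents = 3, 2
state = torch.zeros(1)
first_action = None
for step in range(n_steps):
    actions = []
    for _ in range(n_agents):
        action = torch.full((1,), 0.1)  # placeholder for a policy output
        action.requires_grad_(True)
        if first_action is None:       # remember the very first action
            first_action = action
        actions.append(action)
    state, obs, rew = toy_step(state, actions)

# Differentiate a loss on the final observation and reward with respect
# to the first action, through all three environment steps.
loss = obs.mean() + rew.mean()
(grad,) = torch.autograd.grad(loss, first_action)
print(grad)  # approximately tensor([1.8000])
```

With six actions of 0.1 the final state is 0.6, so the gradient is d(state)/d(action) = 1 flowing into d(loss)/d(state) = 1 - 2*(0.6 - 1) = 1.8, confirming the chain runs back through every step.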

@matteobettini matteobettini marked this pull request as ready for review February 6, 2024 12:26
@matteobettini matteobettini merged commit 7c453bf into main Feb 6, 2024
@matteobettini matteobettini deleted the differentiable branch February 6, 2024 16:00