all.environments

AtariEnvironment(name[, device])

A wrapper for Atari environments.

DuplicateEnvironment(envs[, device])

Turns a list of ALL Environment objects into a VectorEnvironment object

Environment()

A reinforcement learning Environment.

GymEnvironment(id[, device, name, ...])

A wrapper for OpenAI Gym environments (see: https://gymnasium.farama.org).

GymVectorEnvironment(vec_env, name[, device])

A wrapper for Gym's vector environments (see: https://github.com/openai/gym/blob/master/gym/vector/vector_env.py).

MultiagentAtariEnv(env_name[, device])

A wrapper for PettingZoo Atari environments (see: https://www.pettingzoo.ml/atari).

MultiagentEnvironment()

A multiagent reinforcement learning Environment.

MultiagentPettingZooEnv(zoo_env, name[, device])

A wrapper for general PettingZoo environments (see: https://www.pettingzoo.ml/).

MujocoEnvironment(id[, device, name, no_info])

A MuJoCo Environment.

PybulletEnvironment(name, **kwargs)

A wrapper for PyBullet environments.

VectorEnvironment()

A reinforcement learning vector Environment.

class all.environments.AtariEnvironment(name, device='cpu', **gym_make_kwargs)

Bases: Environment
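
No class docstring is given above; as a hedged construction sketch (not from the original docs, with 'Breakout' as an illustrative ROM name):

    from all.environments import AtariEnvironment

    # Build a preprocessed Atari environment on the CPU. 'Breakout' is an
    # illustrative name; anything accepted by the underlying gym make call
    # should work.
    env = AtariEnvironment('Breakout', device='cpu')
    state = env.reset()

    # duplicate(4) creates four independent copies of this environment.
    envs = env.duplicate(4)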

property action_space

The Space representing the range of possible actions.

Returns:

An object of type Space that represents possible actions the agent may take

Return type:

Space

close()

Clean up any extraneous environment objects.

property device

The torch device the environment lives on.

duplicate(n)

Create n copies of this environment.

property env
property name

The name of the environment.

property observation_space

Alias for Environment.state_space.

Returns:

An object of type Space that represents possible states the agent may observe

Return type:

Space

render(**kwargs)

Render the current environment state.

reset()

Reset the environment and return a new initial state.

Returns:

The initial state for the next episode.

Return type:

State

seed(seed)
property state

The State of the Environment at the current timestep.

property state_space

The Space representing the range of observable states.

Returns:

An object of type Space that represents possible states the agent may observe

Return type:

Space

step(action)

Apply an action and get the next state.

Parameters:

action (Action) – The action to apply at the current time step.

Returns:

  • all.environments.State – The State of the environment after the action is applied. This State object includes both the done flag and any additional “info”

  • float – The reward achieved by the previous action

class all.environments.DuplicateEnvironment(envs, device=torch.device('cpu'))

Bases: VectorEnvironment

Turns a list of ALL Environment objects into a VectorEnvironment object

This wrapper takes the list of States generated by the individual environments and outputs a StateArray object containing all of the environment states. Like all vector environments, the sub-environments are automatically reset when done.

Parameters:
  • envs – A list of ALL environments

  • device (optional) – the device on which tensors will be stored
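
As a hedged usage sketch (the CartPole id and the StateArray access are illustrative assumptions based on the description above):

    from all.environments import GymEnvironment, DuplicateEnvironment

    # Bundle four independent CartPole environments into one vector environment.
    envs = DuplicateEnvironment([GymEnvironment('CartPole-v1') for _ in range(4)])
    states = envs.reset()   # a StateArray with one entry per sub-environment
    print(envs.num_envs)    # 4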

property action_space

The Space representing the range of possible actions for each environment.

Returns:

An object of type Space that represents possible actions the agent may take

Return type:

Space

close()

Clean up any extraneous environment objects.

property device

The torch device the environment lives on.

property name

The name of the environment.

property num_envs

Number of environments in vector. This is the number of actions step() expects as input and the number of observations, dones, etc returned by the environment.

property observation_space

Alias for Environment.state_space.

Returns:

An object of type Space that represents possible states the agent may observe

Return type:

Space

reset(seed=None, **kwargs)

Reset the environment and return a new initial state.

Returns:

The initial state for the next episode.

Return type:

State

property state_array

A StateArray of the Environments at the current timestep.

property state_space

The Space representing the range of observable states for each environment.

Returns:

An object of type Space that represents possible states the agent may observe

Return type:

Space

step(actions)

Apply an action and get the next state.

Parameters:

actions – The actions to apply at the current time step, one per sub-environment.

Returns:

  • all.environments.State – The State of the environment after the action is applied. This State object includes both the done flag and any additional “info”

  • float – The reward achieved by the previous action

class all.environments.Environment

Bases: ABC

A reinforcement learning Environment.

In reinforcement learning, an Agent learns by interacting with an Environment. An Environment defines the dynamics of a particular problem: the states, the actions, the transitions between states, and the rewards given to the agent. Environments are often used to benchmark reinforcement learning agents, or to define real problems that the user hopes to solve using reinforcement learning.
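
As a hedged illustration of this interface, a minimal episode loop; it assumes step() returns a State carrying reward and done, as documented below, and that sampling from action_space is an acceptable stand-in for an Agent:

    def run_episode(env):
        # Interact with an Environment until the episode terminates.
        state = env.reset()
        returns = 0.0
        while not state.done:
            action = env.action_space.sample()  # stand-in for an Agent's action
            state = env.step(action)
            returns += state.reward
        return returns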

abstract property action_space

The Space representing the range of possible actions.

Returns:

An object of type Space that represents possible actions the agent may take

Return type:

Space

abstract close()

Clean up any extraneous environment objects.

abstract property device

The torch device the environment lives on.

abstract duplicate(n)

Create n copies of this environment.

abstract property name

The name of the environment.

property observation_space

Alias for Environment.state_space.

Returns:

An object of type Space that represents possible states the agent may observe

Return type:

Space

abstract render(**kwargs)

Render the current environment state.

abstract reset()

Reset the environment and return a new initial state.

Returns:

The initial state for the next episode.

Return type:

State

abstract property state

The State of the Environment at the current timestep.

abstract property state_space

The Space representing the range of observable states.

Returns:

An object of type Space that represents possible states the agent may observe

Return type:

Space

abstract step(action)

Apply an action and get the next state.

Parameters:

action (Action) – The action to apply at the current time step.

Returns:

  • all.environments.State – The State of the environment after the action is applied. This State object includes both the done flag and any additional “info”

  • float – The reward achieved by the previous action

class all.environments.GymEnvironment(id, device=torch.device('cpu'), name=None, legacy_gym=False, wrap_env=None, **gym_make_kwargs)

Bases: Environment

A wrapper for OpenAI Gym environments (see: https://gymnasium.farama.org).

This wrapper converts the output of the gym environment to PyTorch tensors, and wraps them in a State object that can be passed to an Agent. This constructor supports either a string, which will be passed to the gymnasium.make(id) function, or a preconstructed gym environment. Note that in the latter case, the name property is set to the name of the outermost wrapper on the environment.

Parameters:
  • id – Either a string or a preconstructed OpenAI gym environment

  • name (str, optional) – the name of the environment

  • device (str, optional) – the device on which tensors will be stored

  • legacy_gym (bool, optional) – If True, calls gym.make() instead of gymnasium.make()

  • wrap_env (function, optional) – A function that accepts an environment and returns a wrapped environment.

  • **gym_make_kwargs – kwargs passed to gymnasium.make(id, **gym_make_kwargs)
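
Two hedged construction sketches ('CartPole-v1' and the name 'cartpole' are illustrative, not from the original docs):

    import gymnasium
    from all.environments import GymEnvironment

    # From a string id, passed through to gymnasium.make:
    env = GymEnvironment('CartPole-v1', device='cpu')

    # From a preconstructed environment; without name=, the name property
    # falls back to the name of the outermost wrapper.
    env = GymEnvironment(gymnasium.make('CartPole-v1'), name='cartpole')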

property action_space

The Space representing the range of possible actions.

Returns:

An object of type Space that represents possible actions the agent may take

Return type:

Space

close()

Clean up any extraneous environment objects.

property device

The torch device the environment lives on.

duplicate(n)

Create n copies of this environment.

property env
property name

The name of the environment.

property observation_space

Alias for Environment.state_space.

Returns:

An object of type Space that represents possible states the agent may observe

Return type:

Space

render(**kwargs)

Render the current environment state.

reset(**kwargs)

Reset the environment and return a new initial state.

Returns:

The initial state for the next episode.

Return type:

State

seed(seed)
property state

The State of the Environment at the current timestep.

property state_space

The Space representing the range of observable states.

Returns:

An object of type Space that represents possible states the agent may observe

Return type:

Space

step(action)

Apply an action and get the next state.

Parameters:

action (Action) – The action to apply at the current time step.

Returns:

  • all.environments.State – The State of the environment after the action is applied. This State object includes both the done flag and any additional “info”

  • float – The reward achieved by the previous action

class all.environments.GymVectorEnvironment(vec_env, name, device=torch.device('cpu'))

Bases: VectorEnvironment

A wrapper for Gym’s vector environments (see: https://github.com/openai/gym/blob/master/gym/vector/vector_env.py).

This wrapper converts the output of the vector environment to PyTorch tensors, and wraps them in a StateArray object that can be passed to a Parallel Agent. This constructor accepts a preconstructed gym vector environment; the name property must be supplied explicitly.

Parameters:
  • vec_env – A preconstructed OpenAI gym vector environment

  • name (str) – the name of the environment

  • device (optional) – the device on which tensors will be stored
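
A hedged usage sketch, assuming gymnasium's SyncVectorEnv as the underlying vector environment:

    import gymnasium
    from all.environments import GymVectorEnvironment

    # Build a gymnasium vector environment, then wrap it for ALL.
    vec_env = gymnasium.vector.SyncVectorEnv(
        [lambda: gymnasium.make('CartPole-v1') for _ in range(4)]
    )
    env = GymVectorEnvironment(vec_env, name='cartpole', device='cpu')
    states = env.reset()    # a StateArray covering all four sub-environments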

property action_space

The Space representing the range of possible actions for each environment.

Returns:

An object of type Space that represents possible actions the agent may take

Return type:

Space

close()

Clean up any extraneous environment objects.

property device

The torch device the environment lives on.

property name

The name of the environment.

property num_envs

Number of environments in vector. This is the number of actions step() expects as input and the number of observations, dones, etc returned by the environment.

property observation_space

Alias for Environment.state_space.

Returns:

An object of type Space that represents possible states the agent may observe

Return type:

Space

reset(**kwargs)

Reset the environment and return a new initial state.

Returns:

The initial state for the next episode.

Return type:

State

property state_array

A StateArray of the Environments at the current timestep.

property state_space

The Space representing the range of observable states for each environment.

Returns:

An object of type Space that represents possible states the agent may observe

Return type:

Space

step(action)

Apply an action and get the next state.

Parameters:

action (Action) – The action to apply at the current time step.

Returns:

  • all.environments.State – The State of the environment after the action is applied. This State object includes both the done flag and any additional “info”

  • float – The reward achieved by the previous action

class all.environments.MujocoEnvironment(id, device=torch.device('cpu'), name=None, no_info=True, **gym_make_kwargs)

Bases: GymEnvironment

A MuJoCo Environment.
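
A hedged construction sketch; 'HalfCheetah-v4' is an illustrative id, and the reading of no_info as dropping the Gym info dict from states is an assumption, since it is not described above:

    from all.environments import MujocoEnvironment

    # no_info=True is the default; presumably it strips the Gym info dict
    # from the resulting State objects (an assumption, see the note above).
    env = MujocoEnvironment('HalfCheetah-v4', device='cpu')
    state = env.reset()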

property action_space

The Space representing the range of possible actions.

Returns:

An object of type Space that represents possible actions the agent may take

Return type:

Space

close()

Clean up any extraneous environment objects.

property device

The torch device the environment lives on.

duplicate(n)

Create n copies of this environment.

property env
property name

The name of the environment.

property observation_space

Alias for Environment.state_space.

Returns:

An object of type Space that represents possible states the agent may observe

Return type:

Space

render(**kwargs)

Render the current environment state.

reset(**kwargs)

Reset the environment and return a new initial state.

Returns:

The initial state for the next episode.

Return type:

State

seed(seed)
property state

The State of the Environment at the current timestep.

property state_space

The Space representing the range of observable states.

Returns:

An object of type Space that represents possible states the agent may observe

Return type:

Space

step(action)

Apply an action and get the next state.

Parameters:

action (Action) – The action to apply at the current time step.

Returns:

  • all.environments.State – The State of the environment after the action is applied. This State object includes both the done flag and any additional “info”

  • float – The reward achieved by the previous action

class all.environments.MultiagentAtariEnv(env_name, device='cuda', **pettingzoo_params)

Bases: MultiagentPettingZooEnv

A wrapper for PettingZoo Atari environments (see: https://www.pettingzoo.ml/atari).

This wrapper converts the output of the PettingZoo environment to PyTorch tensors, and wraps them in a State object that can be passed to an Agent.

Parameters:
  • env_name (string) – A string representing the name of the environment (e.g. pong_v1)

  • device (optional) – the device on which tensors will be stored
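
A hedged interaction sketch following PettingZoo's agent-iteration protocol; passing None for agents that are done follows the PettingZoo convention and is assumed to carry over to this wrapper:

    from all.environments import MultiagentAtariEnv

    env = MultiagentAtariEnv('pong_v1', device='cpu')
    env.reset()
    for agent in env.agent_iter():
        state = env.last()   # the MultiagentState for the current agent
        if state.done:
            env.step(None)   # done agents receive a None action
        else:
            env.step(env.action_space(agent).sample())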

action_space(agent_id)

The action space for the given agent.

agent_iter()

Create an iterable in which the next element is always the name of the agent whose turn it is to act.

Returns:

An Iterable over Agent strings.

property agent_selection
close()

Clean up any extraneous environment objects.

property device

The torch device the environment lives on.

duplicate(n)
is_done(agent)

Determine whether a given agent is done.

Parameters:

agent (str) – The name of the agent.

Returns:

A boolean representing whether the given agent is done.

last()

Get the MultiagentState object for the current agent.

Returns:

The all.core.MultiagentState object for the current agent.

property name

The name of the environment.

Type:

str

observation_space(agent_id)

Alias for MultiagentEnvironment.state_space(agent_id).

render(**kwargs)

Render the current environment state.

reset(**kwargs)

Reset the environment and return a new initial state for the first agent.

Returns:

all.core.MultiagentState: The initial state for the next episode.

seed(seed)
property state

The State for the current agent.

state_space(agent_id)

The state space for the given agent.

step(action)

Apply an action for the current agent and get the multiagent state for the next agent.

Parameters:

action – The Action for the current agent and timestep.

Returns:

The state for the next agent.

Return type:

all.core.MultiagentState

class all.environments.MultiagentEnvironment

Bases: ABC

A multiagent reinforcement learning Environment.

The Multiagent variant of the Environment object. An Environment defines the dynamics of a particular problem: the states, the actions, the transitions between states, and the rewards given to the agent. Environments are often used to benchmark reinforcement learning agents, or to define real problems that the user hopes to solve using reinforcement learning.

abstract action_space(agent_id)

The action space for the given agent.

abstract agent_iter()

Create an iterable in which the next element is always the name of the agent whose turn it is to act.

Returns:

An Iterable over Agent strings.

abstract close()

Clean up any extraneous environment objects.

abstract property device

The torch device the environment lives on.

abstract is_done(agent)

Determine whether a given agent is done.

Parameters:

agent (str) – The name of the agent.

Returns:

A boolean representing whether the given agent is done.

abstract last()

Get the MultiagentState object for the current agent.

Returns:

The all.core.MultiagentState object for the current agent.

abstract property name

The name of the environment.

Type:

str

observation_space(agent_id)

Alias for MultiagentEnvironment.state_space(agent_id).

abstract render(**kwargs)

Render the current environment state.

abstract reset()

Reset the environment and return a new initial state for the first agent.

Returns:

all.core.MultiagentState: The initial state for the next episode.

property state

The State for the current agent.

abstract state_space(agent_id)

The state space for the given agent.

abstract step(action)

Apply an action for the current agent and get the multiagent state for the next agent.

Parameters:

action – The Action for the current agent and timestep.

Returns:

The state for the next agent.

Return type:

all.core.MultiagentState

class all.environments.MultiagentPettingZooEnv(zoo_env, name, device='cuda')

Bases: MultiagentEnvironment

A wrapper for general PettingZoo environments (see: https://www.pettingzoo.ml/).

This wrapper converts the output of the PettingZoo environment to PyTorch tensors, and wraps them in a State object that can be passed to an Agent.

Parameters:
  • zoo_env (AECEnv) – A PettingZoo AECEnv environment (e.g. pettingzoo.mpe.simple_push_v2)

  • device (optional) – the device on which tensors will be stored
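
A hedged construction sketch using the simple_push environment mentioned above:

    from pettingzoo.mpe import simple_push_v2
    from all.environments import MultiagentPettingZooEnv

    # Wrap a PettingZoo AECEnv; name is a required constructor argument.
    env = MultiagentPettingZooEnv(
        simple_push_v2.env(), name='simple_push_v2', device='cpu'
    )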

action_space(agent_id)

The action space for the given agent.

agent_iter()

Create an iterable in which the next element is always the name of the agent whose turn it is to act.

Returns:

An Iterable over Agent strings.

property agent_selection
close()

Clean up any extraneous environment objects.

property device

The torch device the environment lives on.

duplicate(n)
is_done(agent)

Determine whether a given agent is done.

Parameters:

agent (str) – The name of the agent.

Returns:

A boolean representing whether the given agent is done.

last()

Get the MultiagentState object for the current agent.

Returns:

The all.core.MultiagentState object for the current agent.

property name

The name of the environment.

Type:

str

observation_space(agent_id)

Alias for MultiagentEnvironment.state_space(agent_id).

render(**kwargs)

Render the current environment state.

reset(**kwargs)

Reset the environment and return a new initial state for the first agent.

Returns:

all.core.MultiagentState: The initial state for the next episode.

seed(seed)
property state

The State for the current agent.

state_space(agent_id)

The state space for the given agent.

step(action)

Apply an action for the current agent and get the multiagent state for the next agent.

Parameters:

action – The Action for the current agent and timestep.

Returns:

The state for the next agent.

Return type:

all.core.MultiagentState

class all.environments.PybulletEnvironment(name, **kwargs)

Bases: GymEnvironment
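
No description is given above; as a hedged sketch, construction via the short_names mapping listed below (that full PyBullet gym ids are also accepted is an assumption):

    from all.environments import PybulletEnvironment

    # Abbreviated names resolve through the short_names mapping below:
    env = PybulletEnvironment('ant')   # -> 'AntBulletEnv-v0'

    # Full PyBullet gym ids are presumably accepted as well (assumption):
    env = PybulletEnvironment('HopperBulletEnv-v0')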

property action_space

The Space representing the range of possible actions.

Returns:

An object of type Space that represents possible actions the agent may take

Return type:

Space

close()

Clean up any extraneous environment objects.

property device

The torch device the environment lives on.

duplicate(n)

Create n copies of this environment.

property env
property name

The name of the environment.

property observation_space

Alias for Environment.state_space.

Returns:

An object of type Space that represents possible states the agent may observe

Return type:

Space

render(**kwargs)

Render the current environment state.

reset(**kwargs)

Reset the environment and return a new initial state.

Returns:

The initial state for the next episode.

Return type:

State

seed(seed)
short_names = {'ant': 'AntBulletEnv-v0', 'cheetah': 'HalfCheetahBulletEnv-v0', 'hopper': 'HopperBulletEnv-v0', 'humanoid': 'HumanoidBulletEnv-v0', 'walker': 'Walker2DBulletEnv-v0'}
property state

The State of the Environment at the current timestep.

property state_space

The Space representing the range of observable states.

Returns:

An object of type Space that represents possible states the agent may observe

Return type:

Space

step(action)

Apply an action and get the next state.

Parameters:

action (Action) – The action to apply at the current time step.

Returns:

  • all.environments.State – The State of the environment after the action is applied. This State object includes both the done flag and any additional “info”

  • float – The reward achieved by the previous action

class all.environments.VectorEnvironment

Bases: ABC

A reinforcement learning vector Environment.

Similar to a regular RL environment, except that many environments are stacked together in the observations, rewards, and dones, and step() expects one action per environment.

Also, since sub-environments finish at different times, you do not need to reset them manually; the vector environment automatically resets each sub-environment when it completes.
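
A hedged sketch of the vectorized loop; it assumes action_space is a per-environment Gym Space and that step() accepts a list of actions, one per sub-environment:

    def run_steps(vec_env, n_steps):
        # One action per sub-environment on every step; sub-environments
        # reset automatically when they finish, so no manual reset is needed.
        states = vec_env.reset()
        for _ in range(n_steps):
            actions = [vec_env.action_space.sample() for _ in range(vec_env.num_envs)]
            states = vec_env.step(actions)   # a StateArray of next states
        return states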

abstract property action_space

The Space representing the range of possible actions for each environment.

Returns:

An object of type Space that represents possible actions the agent may take

Return type:

Space

abstract close()

Clean up any extraneous environment objects.

abstract property device

The torch device the environment lives on.

abstract property name

The name of the environment.

abstract property num_envs

Number of environments in vector. This is the number of actions step() expects as input and the number of observations, dones, etc returned by the environment.

property observation_space

Alias for Environment.state_space.

Returns:

An object of type Space that represents possible states the agent may observe

Return type:

Space

abstract reset()

Reset the environment and return a new initial state.

Returns:

The initial state for the next episode.

Return type:

State

abstract property state_array

A StateArray of the Environments at the current timestep.

abstract property state_space

The Space representing the range of observable states for each environment.

Returns:

An object of type Space that represents possible states the agent may observe

Return type:

Space

abstract step(action)

Apply an action and get the next state.

Parameters:

action (Action) – The action to apply at the current time step.

Returns:

  • all.environments.State – The State of the environment after the action is applied. This State object includes both the done flag and any additional “info”

  • float – The reward achieved by the previous action