all.environments

Environment()

A reinforcement learning Environment.

MultiagentEnvironment()

A multiagent reinforcement learning Environment.

GymEnvironment(env[, device, name])

A wrapper for OpenAI Gym environments (see: https://gym.openai.com).

AtariEnvironment(name, *args, **kwargs)

A wrapper for Atari 2600 environments, built on GymEnvironment.

MultiagentAtariEnv(env_name[, device])

A wrapper for PettingZoo Atari environments (see: https://www.pettingzoo.ml/atari).

MultiagentPettingZooEnv(zoo_env, name[, device])

A wrapper for general PettingZoo environments (see: https://www.pettingzoo.ml/).

GymVectorEnvironment(vec_env, name[, device])

A wrapper for Gym’s vector environments (see: https://github.com/openai/gym/blob/master/gym/vector/vector_env.py).

DuplicateEnvironment(envs[, device])

Turns a list of ALL Environment objects into a VectorEnvironment object

PybulletEnvironment(name, **kwargs)

A wrapper for PyBullet environments (see: https://pybullet.org).

class all.environments.AtariEnvironment(name, *args, **kwargs)

Bases: all.environments.gym.GymEnvironment

A wrapper for Atari 2600 environments, built on GymEnvironment.

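For illustration, a minimal usage sketch ('Breakout' is an arbitrary game name, and the device argument is optional)::

    from all.environments import AtariEnvironment

    # Construct an Atari environment by game name.
    env = AtariEnvironment('Breakout', device='cpu')
    state = env.reset()
    # Apply a random action, for illustration.
    state = env.step(env.action_space.sample())
    env.close()
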
property action_space

The Space representing the range of possible actions.

Returns

An object of type Space that represents possible actions the agent may take

Return type

Space

close()

Clean up any extraneous environment objects.

property device

The torch device the environment lives on.

duplicate(n)

Create n copies of this environment.

property env

The underlying Gym environment.

property name

The name of the environment.

property observation_space

Alias for Environment.state_space.

Returns

An object of type Space that represents possible states the agent may observe

Return type

Space

render(**kwargs)

Render the current environment state.

reset()

Reset the environment and return a new initial state.

Returns

The initial state for the next episode.

Return type

State

seed(seed)

Set a random seed for the environment.

property state

The State of the Environment at the current timestep.

property state_space

The Space representing the range of observable states.

Returns

An object of type Space that represents possible states the agent may observe

Return type

Space

step(action)

Apply an action and get the next state.

Parameters

action (Action) – The action to apply at the current time step.

Returns

  • all.environments.State – The State of the environment after the action is applied. This State object includes both the done flag and any additional “info”

  • float – The reward achieved by the previous action

class all.environments.DuplicateEnvironment(envs, device=torch.device)

Bases: all.environments._vector_environment.VectorEnvironment

Turns a list of ALL Environment objects into a VectorEnvironment object

This wrapper just takes the list of States the environments generate and outputs a StateArray object containing all of the environment states. Like all vector environments, the sub environments are automatically reset when done.

Parameters
  • envs – A list of ALL environments

  • device (optional) – the device on which tensors will be stored

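For illustration, a sketch of batching several copies of the same environment (assuming the arbitrary 'CartPole-v0' Gym id, and that step() accepts one action per sub-environment)::

    from all.environments import GymEnvironment, DuplicateEnvironment

    # Four independent copies of the same environment, batched together.
    envs = [GymEnvironment('CartPole-v0') for _ in range(4)]
    vec_env = DuplicateEnvironment(envs)

    state_array = vec_env.reset()
    # One random action per sub-environment, for illustration.
    actions = [vec_env.action_space.sample() for _ in range(vec_env.num_envs)]
    state_array = vec_env.step(actions)
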
property action_space

The Space representing the range of possible actions for each environment.

Returns

An object of type Space that represents possible actions the agent may take

Return type

Space

close()

Clean up any extraneous environment objects.

property device

The torch device the environment lives on.

property name

The name of the environment.

property num_envs

The number of environments in the vector. This is the number of actions step() expects as input and the number of observations, dones, etc. returned by the environment.

property observation_space

Alias for Environment.state_space.

Returns

An object of type Space that represents possible states the agent may observe

Return type

Space

reset()

Reset the environment and return a new initial state.

Returns

The initial state for the next episode.

Return type

State

seed(seed)

Set a random seed for the environment.

property state_array

A StateArray of the Environments at the current timestep.

property state_space

The Space representing the range of observable states for each environment.

Returns

An object of type Space that represents possible states the agent may observe

Return type

Space

step(actions)

Apply actions to each sub-environment and get the next states.

Parameters

actions – The actions to apply at the current time step, one per sub-environment.

Returns

A StateArray containing the state of each sub-environment after the actions are applied, including rewards, done flags, and any additional “info”.

Return type

all.core.StateArray

class all.environments.Environment

Bases: abc.ABC

A reinforcement learning Environment.

In reinforcement learning, an Agent learns by interacting with an Environment. An Environment defines the dynamics of a particular problem: the states, the actions, the transitions between states, and the rewards given to the agent. Environments are often used to benchmark reinforcement learning agents, or to define real problems that the user hopes to solve using reinforcement learning.

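For illustration, a minimal interaction loop using the concrete GymEnvironment subclass documented below ('CartPole-v0' is an arbitrary Gym id; a random policy stands in for an Agent, and the state is assumed to expose done and reward attributes as in all.core.State)::

    from all.environments import GymEnvironment

    env = GymEnvironment('CartPole-v0')
    env.reset()
    returns = 0
    while not env.state.done:
        # A random action stands in for an Agent's choice.
        env.step(env.action_space.sample())
        returns += env.state.reward
    env.close()
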
abstract property action_space

The Space representing the range of possible actions.

Returns

An object of type Space that represents possible actions the agent may take

Return type

Space

abstract close()

Clean up any extraneous environment objects.

abstract property device

The torch device the environment lives on.

abstract duplicate(n)

Create n copies of this environment.

abstract property name

The name of the environment.

property observation_space

Alias for Environment.state_space.

Returns

An object of type Space that represents possible states the agent may observe

Return type

Space

abstract render(**kwargs)

Render the current environment state.

abstract reset()

Reset the environment and return a new initial state.

Returns

The initial state for the next episode.

Return type

State

abstract property state

The State of the Environment at the current timestep.

abstract property state_space

The Space representing the range of observable states.

Returns

An object of type Space that represents possible states the agent may observe

Return type

Space

abstract step(action)

Apply an action and get the next state.

Parameters

action (Action) – The action to apply at the current time step.

Returns

  • all.environments.State – The State of the environment after the action is applied. This State object includes both the done flag and any additional “info”

  • float – The reward achieved by the previous action

class all.environments.GymEnvironment(env, device=torch.device, name=None)

Bases: all.environments._environment.Environment

A wrapper for OpenAI Gym environments (see: https://gym.openai.com).

This wrapper converts the output of the gym environment to PyTorch tensors, and wraps them in a State object that can be passed to an Agent. This constructor supports either a string, which will be passed to the gym.make(name) function, or a preconstructed gym environment. Note that in the latter case, the name property is set to the name of the outermost wrapper on the environment.

Parameters
  • env – Either a string or an OpenAI gym environment

  • name (str, optional) – the name of the environment

  • device (str, optional) – the device on which tensors will be stored

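A usage sketch showing both construction styles ('CartPole-v0' is an arbitrary Gym id)::

    import gym
    from all.environments import GymEnvironment

    # From a string, which is passed to gym.make(name):
    env = GymEnvironment('CartPole-v0')

    # From a preconstructed gym environment, with an explicit name and device:
    env = GymEnvironment(gym.make('CartPole-v0'), name='cartpole', device='cpu')

    state = env.reset()
    state = env.step(env.action_space.sample())
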
property action_space

The Space representing the range of possible actions.

Returns

An object of type Space that represents possible actions the agent may take

Return type

Space

close()

Clean up any extraneous environment objects.

property device

The torch device the environment lives on.

duplicate(n)

Create n copies of this environment.

property env

The underlying Gym environment.

property name

The name of the environment.

property observation_space

Alias for Environment.state_space.

Returns

An object of type Space that represents possible states the agent may observe

Return type

Space

render(**kwargs)

Render the current environment state.

reset()

Reset the environment and return a new initial state.

Returns

The initial state for the next episode.

Return type

State

seed(seed)

Set a random seed for the environment.

property state

The State of the Environment at the current timestep.

property state_space

The Space representing the range of observable states.

Returns

An object of type Space that represents possible states the agent may observe

Return type

Space

step(action)

Apply an action and get the next state.

Parameters

action (Action) – The action to apply at the current time step.

Returns

  • all.environments.State – The State of the environment after the action is applied. This State object includes both the done flag and any additional “info”

  • float – The reward achieved by the previous action

class all.environments.GymVectorEnvironment(vec_env, name, device=torch.device)

Bases: all.environments._vector_environment.VectorEnvironment

A wrapper for Gym’s vector environments (see: https://github.com/openai/gym/blob/master/gym/vector/vector_env.py).

This wrapper converts the output of the vector environment to PyTorch tensors, and wraps them in a StateArray object that can be passed to a Parallel Agent. This constructor accepts a preconstructed gym vector environment and a name for the environment.

Parameters
  • vec_env – An OpenAI gym vector environment

  • name (str) – the name of the environment

  • device (optional) – the device on which tensors will be stored

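A usage sketch (gym.vector.make is one way to preconstruct a vector environment; 'CartPole-v0' and the name are arbitrary, and step() is assumed to accept one action per sub-environment)::

    import gym
    from all.environments import GymVectorEnvironment

    # Four copies of CartPole, batched by Gym's vector API.
    vec_env = gym.vector.make('CartPole-v0', num_envs=4)
    env = GymVectorEnvironment(vec_env, 'CartPole')

    state_array = env.reset()
    # One random action per sub-environment, for illustration.
    actions = [vec_env.single_action_space.sample() for _ in range(env.num_envs)]
    state_array = env.step(actions)
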
property action_space

The Space representing the range of possible actions for each environment.

Returns

An object of type Space that represents possible actions the agent may take

Return type

Space

close()

Clean up any extraneous environment objects.

property device

The torch device the environment lives on.

property name

The name of the environment.

property num_envs

The number of environments in the vector. This is the number of actions step() expects as input and the number of observations, dones, etc. returned by the environment.

property observation_space

Alias for Environment.state_space.

Returns

An object of type Space that represents possible states the agent may observe

Return type

Space

reset()

Reset the environment and return a new initial state.

Returns

The initial state for the next episode.

Return type

State

seed(seed)

Set a random seed for the environment.

property state_array

A StateArray of the Environments at the current timestep.

property state_space

The Space representing the range of observable states for each environment.

Returns

An object of type Space that represents possible states the agent may observe

Return type

Space

step(action)

Apply an action and get the next state.

Parameters

action (Action) – The action to apply at the current time step.

Returns

A StateArray containing the state of each sub-environment after the action is applied, including rewards, done flags, and any additional “info”.

Return type

all.core.StateArray

class all.environments.MultiagentAtariEnv(env_name, device='cuda', **pettingzoo_params)

Bases: all.environments.multiagent_pettingzoo.MultiagentPettingZooEnv

A wrapper for PettingZoo Atari environments (see: https://www.pettingzoo.ml/atari).

This wrapper converts the output of the PettingZoo environment to PyTorch tensors, and wraps them in a State object that can be passed to an Agent.

Parameters
  • env_name (string) – A string representing the name of the environment (e.g. pong-v1)

  • device (optional) – the device on which tensors will be stored

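A construction sketch (the exact environment name, e.g. pong_v1 versus pong-v1, depends on the installed PettingZoo version)::

    from all.environments import MultiagentAtariEnv

    env = MultiagentAtariEnv('pong_v1', device='cpu')
    env.reset()
    # One action space per agent, keyed by agent name.
    print(env.action_spaces)
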
property action_spaces

A dictionary of action spaces for each agent.

agent_iter()

Create an iterable in which the next element is always the name of the agent whose turn it is to act.

Returns

An Iterable over Agent strings.

property agent_selection

The name of the agent whose turn it is to act.

close()

Clean up any extraneous environment objects.

property device

The torch device the environment lives on.

duplicate(n)

Create n copies of this environment.

is_done(agent)

Determine whether a given agent is done.

Parameters

agent (str) – The name of the agent.

Returns

A boolean representing whether the given agent is done.

last()

Get the MultiagentState object for the current agent.

Returns

The all.core.MultiagentState object for the current agent.

property name

The name of the environment.

Type

str

property observation_spaces

Alias for MultiagentEnvironment.state_spaces.

render(mode='human')

Render the current environment state.

reset()

Reset the environment and return a new initial state for the first agent.

Returns

all.core.MultiagentState: The initial state for the next episode.

seed(seed)

Set a random seed for the environment.

property state

The State for the current agent.

property state_spaces

A dictionary of state spaces for each agent.

step(action)

Apply an action for the current agent and get the multiagent state for the next agent.

Parameters

action – The Action for the current agent and timestep.

Returns

The state for the next agent.

Return type

all.core.MultiagentState

class all.environments.MultiagentEnvironment

Bases: abc.ABC

A multiagent reinforcement learning Environment.

The Multiagent variant of the Environment object. An Environment defines the dynamics of a particular problem: the states, the actions, the transitions between states, and the rewards given to the agent. Environments are often used to benchmark reinforcement learning agents, or to define real problems that the user hopes to solve using reinforcement learning.

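For illustration, a turn-based interaction loop against this interface, using the MultiagentAtariEnv subclass documented above (random actions stand in for real Agents; passing None for an agent that is already done follows PettingZoo conventions, and the loop is assumed to end once every agent is done)::

    from all.environments import MultiagentAtariEnv

    env = MultiagentAtariEnv('pong_v1', device='cpu')
    env.reset()
    for agent in env.agent_iter():
        state = env.last()  # the MultiagentState for the current agent
        # Done agents receive None; live agents receive a random action.
        action = None if env.is_done(agent) else env.action_spaces[agent].sample()
        env.step(action)
    env.close()
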
abstract property action_spaces

A dictionary of action spaces for each agent.

abstract agent_iter()

Create an iterable in which the next element is always the name of the agent whose turn it is to act.

Returns

An Iterable over Agent strings.

abstract close()

Clean up any extraneous environment objects.

abstract property device

The torch device the environment lives on.

abstract is_done(agent)

Determine whether a given agent is done.

Parameters

agent (str) – The name of the agent.

Returns

A boolean representing whether the given agent is done.

abstract last()

Get the MultiagentState object for the current agent.

Returns

The all.core.MultiagentState object for the current agent.

abstract property name

The name of the environment.

Type

str

property observation_spaces

Alias for MultiagentEnvironment.state_spaces.

abstract render(**kwargs)

Render the current environment state.

abstract reset()

Reset the environment and return a new initial state for the first agent.

Returns

all.core.MultiagentState: The initial state for the next episode.

property state

The State for the current agent.

abstract property state_spaces

A dictionary of state spaces for each agent.

abstract step(action)

Apply an action for the current agent and get the multiagent state for the next agent.

Parameters

action – The Action for the current agent and timestep.

Returns

The state for the next agent.

Return type

all.core.MultiagentState

class all.environments.MultiagentPettingZooEnv(zoo_env, name, device='cuda')

Bases: all.environments._multiagent_environment.MultiagentEnvironment

A wrapper for general PettingZoo environments (see: https://www.pettingzoo.ml/).

This wrapper converts the output of the PettingZoo environment to PyTorch tensors, and wraps them in a State object that can be passed to an Agent.

Parameters
  • zoo_env (AECEnv) – A PettingZoo AECEnv environment (e.g. pettingzoo.mpe.simple_push_v2)

  • name (str) – the name of the environment

  • device (optional) – the device on which tensors will be stored

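A construction sketch, using the docstring's example environment (the version suffix depends on the installed PettingZoo release)::

    from pettingzoo.mpe import simple_push_v2
    from all.environments import MultiagentPettingZooEnv

    env = MultiagentPettingZooEnv(simple_push_v2.env(), 'simple_push', device='cpu')
    env.reset()
    state = env.last()  # the MultiagentState for the first agent
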
property action_spaces

A dictionary of action spaces for each agent.

agent_iter()

Create an iterable in which the next element is always the name of the agent whose turn it is to act.

Returns

An Iterable over Agent strings.

property agent_selection

The name of the agent whose turn it is to act.

close()

Clean up any extraneous environment objects.

property device

The torch device the environment lives on.

duplicate(n)

Create n copies of this environment.

is_done(agent)

Determine whether a given agent is done.

Parameters

agent (str) – The name of the agent.

Returns

A boolean representing whether the given agent is done.

last()

Get the MultiagentState object for the current agent.

Returns

The all.core.MultiagentState object for the current agent.

property name

The name of the environment.

Type

str

property observation_spaces

Alias for MultiagentEnvironment.state_spaces.

render(mode='human')

Render the current environment state.

reset()

Reset the environment and return a new initial state for the first agent.

Returns

all.core.MultiagentState: The initial state for the next episode.

seed(seed)

Set a random seed for the environment.

property state

The State for the current agent.

property state_spaces

A dictionary of state spaces for each agent.

step(action)

Apply an action for the current agent and get the multiagent state for the next agent.

Parameters

action – The Action for the current agent and timestep.

Returns

The state for the next agent.

Return type

all.core.MultiagentState

class all.environments.PybulletEnvironment(name, **kwargs)

Bases: all.environments.gym.GymEnvironment

A wrapper for PyBullet environments (see: https://pybullet.org). The name argument accepts either a full environment id or one of the short names defined in short_names below.

property action_space

The Space representing the range of possible actions.

Returns

An object of type Space that represents possible actions the agent may take

Return type

Space

close()

Clean up any extraneous environment objects.

property device

The torch device the environment lives on.

duplicate(n)

Create n copies of this environment.

property env

The underlying Gym environment.

property name

The name of the environment.

property observation_space

Alias for Environment.state_space.

Returns

An object of type Space that represents possible states the agent may observe

Return type

Space

render(**kwargs)

Render the current environment state.

reset()

Reset the environment and return a new initial state.

Returns

The initial state for the next episode.

Return type

State

seed(seed)

Set a random seed for the environment.

short_names = {'ant': 'AntBulletEnv-v0', 'cheetah': 'HalfCheetahBulletEnv-v0', 'hopper': 'HopperBulletEnv-v0', 'humanoid': 'HumanoidBulletEnv-v0', 'walker': 'Walker2DBulletEnv-v0'}
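
For example, the short name 'ant' resolves to 'AntBulletEnv-v0' (a sketch; the pybullet package must be installed)::

    from all.environments import PybulletEnvironment

    # Equivalent to PybulletEnvironment('AntBulletEnv-v0').
    env = PybulletEnvironment('ant')
    state = env.reset()
    state = env.step(env.action_space.sample())
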
property state

The State of the Environment at the current timestep.

property state_space

The Space representing the range of observable states.

Returns

An object of type Space that represents possible states the agent may observe

Return type

Space

step(action)

Apply an action and get the next state.

Parameters

action (Action) – The action to apply at the current time step.

Returns

  • all.environments.State – The State of the environment after the action is applied. This State object includes both the done flag and any additional “info”

  • float – The reward achieved by the previous action