all.environments

- AtariEnvironment
- DuplicateEnvironment – Turns a list of ALL Environment objects into a VectorEnvironment object.
- Environment – A reinforcement learning Environment.
- GymEnvironment – A wrapper for OpenAI Gym environments (see: https://gymnasium.openai.com).
- GymVectorEnvironment – A wrapper for Gym's vector environments (see: https://github.com/openai/gym/blob/master/gym/vector/vector_env.py).
- MujocoEnvironment – A MuJoCo Environment.
- MultiagentAtariEnv – A wrapper for PettingZoo Atari environments (see: https://www.pettingzoo.ml/atari).
- MultiagentEnvironment – A multiagent reinforcement learning Environment.
- MultiagentPettingZooEnv – A wrapper for general PettingZoo environments (see: https://www.pettingzoo.ml/).
- PybulletEnvironment
- VectorEnvironment – A reinforcement learning vector Environment.
- class all.environments.AtariEnvironment(name, device='cpu', **gym_make_kwargs)
Bases:
Environment
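A rough construction sketch (the "Breakout" ROM name and the device are illustrative; the preprocessing applied by the wrapper is not shown):

    from all.environments import AtariEnvironment

    # construct an Atari environment by ROM name; **gym_make_kwargs are
    # forwarded to the underlying gym factory
    env = AtariEnvironment("Breakout", device="cpu")
    state = env.reset()  # the initial State of the episode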
- property action_space
The Space representing the range of possible actions.
- Returns:
An object of type Space that represents possible actions the agent may take
- Return type:
Space
- close()
Clean up any extraneous environment objects.
- property device
The torch device the environment lives on.
- duplicate(n)
Create n copies of this environment.
- property env
- property name
The name of the environment.
- property observation_space
Alias for Environment.state_space.
- Returns:
An object of type Space that represents possible states the agent may observe
- Return type:
Space
- render(**kwargs)
Render the current environment state.
- reset()
Reset the environment and return a new initial state.
- Returns:
The initial state for the next episode.
- Return type:
State
- seed(seed)
Set a random seed for the environment.
- property state
The State of the Environment at the current timestep.
- property state_space
The Space representing the range of observable states.
- Returns:
An object of type Space that represents possible states the agent may observe
- Return type:
Space
- step(action)
Apply an action and get the next state.
- Parameters:
action (Action) – The action to apply at the current time step.
- Returns:
all.environments.State – The State of the environment after the action is applied. This State object includes both the done flag and any additional “info”
float – The reward achieved by the previous action
- class all.environments.DuplicateEnvironment(envs, device=torch.device)
Bases:
VectorEnvironment
Turns a list of ALL Environment objects into a VectorEnvironment object
This wrapper simply takes the list of States the environments generate and outputs a StateArray object containing all of the environment states. Like all vector environments, the sub-environments are automatically reset when done.
- Parameters:
envs – A list of ALL environments
device (optional) – the device on which tensors will be stored
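As an illustration, a minimal sketch that vectorizes several independently constructed environments (the "CartPole-v1" id is only an example):

    from all.environments import DuplicateEnvironment, GymEnvironment

    # four independent copies of the same environment, stepped as a batch
    envs = [GymEnvironment("CartPole-v1") for _ in range(4)]
    vec_env = DuplicateEnvironment(envs)
    states = vec_env.reset()  # a StateArray with vec_env.num_envs entries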
- property action_space
The Space representing the range of possible actions for each environment.
- Returns:
An object of type Space that represents possible actions the agent may take
- Return type:
Space
- close()
Clean up any extraneous environment objects.
- property device
The torch device the environment lives on.
- property name
The name of the environment.
- property num_envs
Number of environments in the vector. This is the number of actions step() expects as input and the number of observations, dones, etc. returned by the environment.
- property observation_space
Alias for Environment.state_space.
- Returns:
An object of type Space that represents possible states the agent may observe
- Return type:
Space
- reset(seed=None, **kwargs)
Reset the environment and return a new initial state.
- Returns:
The initial state for the next episode.
- Return type:
StateArray
- property state_array
A StateArray of the Environments at the current timestep.
- property state_space
The Space representing the range of observable states for each environment.
- Returns:
An object of type Space that represents possible states the agent may observe
- Return type:
Space
- step(actions)
Apply an action and get the next state.
- Parameters:
actions (Action) – The actions to apply at the current time step, one per sub-environment.
- Returns:
all.environments.StateArray – The StateArray of the environments after the actions are applied. This StateArray object includes both the done flags and any additional “info”
torch.Tensor – The rewards achieved by the previous actions
- class all.environments.Environment
Bases:
ABC
A reinforcement learning Environment.
In reinforcement learning, an Agent learns by interacting with an Environment. An Environment defines the dynamics of a particular problem: the states, the actions, the transitions between states, and the rewards given to the agent. Environments are often used to benchmark reinforcement learning agents, or to define real problems that the user hopes to solve using reinforcement learning.
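The members below define the standard interaction loop. A minimal sketch, assuming the returned State exposes done and reward attributes (as in all.core.State):

    def run_episode(env):
        """Run one episode with a random policy and return the total reward."""
        state = env.reset()
        episode_return = 0.0
        while not state.done:
            action = env.action_space.sample()  # substitute an agent's policy here
            state = env.step(action)
            episode_return += state.reward
        env.close()
        return episode_return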
- abstract property action_space
The Space representing the range of possible actions.
- Returns:
An object of type Space that represents possible actions the agent may take
- Return type:
Space
- abstract close()
Clean up any extraneous environment objects.
- abstract property device
The torch device the environment lives on.
- abstract duplicate(n)
Create n copies of this environment.
- abstract property name
The name of the environment.
- property observation_space
Alias for Environment.state_space.
- Returns:
An object of type Space that represents possible states the agent may observe
- Return type:
Space
- abstract render(**kwargs)
Render the current environment state.
- abstract reset()
Reset the environment and return a new initial state.
- Returns:
The initial state for the next episode.
- Return type:
State
- abstract property state
The State of the Environment at the current timestep.
- abstract property state_space
The Space representing the range of observable states.
- Returns:
An object of type Space that represents possible states the agent may observe
- Return type:
Space
- abstract step(action)
Apply an action and get the next state.
- Parameters:
action (Action) – The action to apply at the current time step.
- Returns:
all.environments.State – The State of the environment after the action is applied. This State object includes both the done flag and any additional “info”
float – The reward achieved by the previous action
- class all.environments.GymEnvironment(id, device=torch.device, name=None, legacy_gym=False, wrap_env=None, **gym_make_kwargs)
Bases:
Environment
A wrapper for OpenAI Gym environments (see: https://gymnasium.openai.com).
This wrapper converts the output of the gym environment to PyTorch tensors, and wraps them in a State object that can be passed to an Agent. The constructor accepts either a string id, which is passed to the gymnasium.make(id) function, or a preconstructed gym environment. In the latter case, the name property is set to the name of the outermost wrapper on the environment.
- Parameters:
id – Either a string id or a preconstructed OpenAI gym environment
name (str, optional) – the name of the environment
device (str, optional) – the device on which tensors will be stored
legacy_gym (bool, optional) – If true, calls gym.make() instead of gymnasium.make()
wrap_env (function, optional) – A function that accepts an environment and returns a wrapped environment.
**gym_make_kwargs – kwargs passed to gymnasium.make(id, **gym_make_kwargs)
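A sketch of the two construction styles described above (the "CartPole-v1" id is illustrative):

    import gymnasium
    from all.environments import GymEnvironment

    # 1. from a string id, forwarded to gymnasium.make()
    env = GymEnvironment("CartPole-v1", device="cpu")

    # 2. from a preconstructed gym environment; the name property is taken
    # from the outermost wrapper
    raw_env = gymnasium.make("CartPole-v1")
    env = GymEnvironment(raw_env)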
- property action_space
The Space representing the range of possible actions.
- Returns:
An object of type Space that represents possible actions the agent may take
- Return type:
Space
- close()
Clean up any extraneous environment objects.
- property device
The torch device the environment lives on.
- duplicate(n)
Create n copies of this environment.
- property env
- property name
The name of the environment.
- property observation_space
Alias for Environment.state_space.
- Returns:
An object of type Space that represents possible states the agent may observe
- Return type:
Space
- render(**kwargs)
Render the current environment state.
- reset(**kwargs)
Reset the environment and return a new initial state.
- Returns:
The initial state for the next episode.
- Return type:
State
- seed(seed)
Set a random seed for the environment.
- property state
The State of the Environment at the current timestep.
- property state_space
The Space representing the range of observable states.
- Returns:
An object of type Space that represents possible states the agent may observe
- Return type:
Space
- step(action)
Apply an action and get the next state.
- Parameters:
action (Action) – The action to apply at the current time step.
- Returns:
all.environments.State – The State of the environment after the action is applied. This State object includes both the done flag and any additional “info”
float – The reward achieved by the previous action
- class all.environments.GymVectorEnvironment(vec_env, name, device=torch.device)
Bases:
VectorEnvironment
A wrapper for Gym’s vector environments (see: https://github.com/openai/gym/blob/master/gym/vector/vector_env.py).
This wrapper converts the output of the vector environment to PyTorch tensors, and wraps them in a StateArray object that can be passed to a Parallel Agent. The constructor accepts a preconstructed gym vector environment; the name property is set to the name passed to the constructor.
- Parameters:
vec_env – An OpenAI gym vector environment
name (str) – the name of the environment
device (optional) – the device on which tensors will be stored
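A construction sketch, assuming gymnasium's SyncVectorEnv as the underlying vector environment (any Gym-style vector environment should work the same way):

    import gymnasium
    from all.environments import GymVectorEnvironment

    # four synchronous copies of the same environment
    vec_env = gymnasium.vector.SyncVectorEnv(
        [lambda: gymnasium.make("CartPole-v1") for _ in range(4)]
    )
    env = GymVectorEnvironment(vec_env, "CartPole-v1", device="cpu")
    states = env.reset()  # a StateArray with one entry per sub-environment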
- property action_space
The Space representing the range of possible actions for each environment.
- Returns:
An object of type Space that represents possible actions the agent may take
- Return type:
Space
- close()
Clean up any extraneous environment objects.
- property device
The torch device the environment lives on.
- property name
The name of the environment.
- property num_envs
Number of environments in the vector. This is the number of actions step() expects as input and the number of observations, dones, etc. returned by the environment.
- property observation_space
Alias for Environment.state_space.
- Returns:
An object of type Space that represents possible states the agent may observe
- Return type:
Space
- reset(**kwargs)
Reset the environment and return a new initial state.
- Returns:
The initial state for the next episode.
- Return type:
StateArray
- property state_array
A StateArray of the Environments at the current timestep.
- property state_space
The Space representing the range of observable states for each environment.
- Returns:
An object of type Space that represents possible states the agent may observe
- Return type:
Space
- step(action)
Apply an action and get the next state.
- Parameters:
action – The actions to apply at the current time step, one per sub-environment.
- Returns:
all.environments.StateArray – The StateArray of the environments after the actions are applied. This StateArray object includes both the done flags and any additional “info”
torch.Tensor – The rewards achieved by the previous actions
- class all.environments.MujocoEnvironment(id, device=torch.device, name=None, no_info=True, **gym_make_kwargs)
Bases:
GymEnvironment
A MuJoCo Environment.
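A construction sketch (the "HalfCheetah-v4" id is an example; the exact effect of no_info is an assumption based on the parameter name):

    from all.environments import MujocoEnvironment

    # no_info=True (the default) presumably drops the gym info dict from states
    env = MujocoEnvironment("HalfCheetah-v4", device="cpu")
    state = env.reset()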
- property action_space
The Space representing the range of possible actions.
- Returns:
An object of type Space that represents possible actions the agent may take
- Return type:
Space
- close()
Clean up any extraneous environment objects.
- property device
The torch device the environment lives on.
- duplicate(n)
Create n copies of this environment.
- property env
- property name
The name of the environment.
- property observation_space
Alias for Environment.state_space.
- Returns:
An object of type Space that represents possible states the agent may observe
- Return type:
Space
- render(**kwargs)
Render the current environment state.
- reset(**kwargs)
Reset the environment and return a new initial state.
- Returns:
The initial state for the next episode.
- Return type:
State
- seed(seed)
Set a random seed for the environment.
- property state
The State of the Environment at the current timestep.
- property state_space
The Space representing the range of observable states.
- Returns:
An object of type Space that represents possible states the agent may observe
- Return type:
Space
- step(action)
Apply an action and get the next state.
- Parameters:
action (Action) – The action to apply at the current time step.
- Returns:
all.environments.State – The State of the environment after the action is applied. This State object includes both the done flag and any additional “info”
float – The reward achieved by the previous action
- class all.environments.MultiagentAtariEnv(env_name, device='cuda', **pettingzoo_params)
Bases:
MultiagentPettingZooEnv
A wrapper for PettingZoo Atari environments (see: https://www.pettingzoo.ml/atari).
This wrapper converts the output of the PettingZoo environment to PyTorch tensors, and wraps them in a State object that can be passed to an Agent.
- Parameters:
env_name (string) – A string representing the name of the environment (e.g. pong-v1)
device (optional) – the device on which tensors will be stored
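A construction sketch (the exact environment name and the required Atari ROMs depend on the installed PettingZoo version):

    from all.environments import MultiagentAtariEnv

    # recent PettingZoo releases use underscore names such as "pong_v1"
    env = MultiagentAtariEnv("pong_v1", device="cpu")
    env.reset()
    state = env.last()  # the MultiagentState for the first agent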
- action_space(agent_id)
The action space for the given agent.
- agent_iter()
Create an iterable in which the next element is always the name of the agent whose turn it is to act.
- Returns:
An Iterable over Agent strings.
- property agent_selection
- close()
Clean up any extraneous environment objects.
- property device
The torch device the environment lives on.
- duplicate(n)
- is_done(agent)
Determine whether a given agent is done.
- Parameters:
agent (str) – The name of the agent.
- Returns:
A boolean representing whether the given agent is done.
- last()
Get the MultiagentState object for the current agent.
- Returns:
The all.core.MultiagentState object for the current agent.
- property name
The name of the environment.
- Type:
str
- observation_space(agent_id)
Alias for MultiagentEnvironment.state_space(agent_id).
- render(**kwargs)
Render the current environment state.
- reset(**kwargs)
Reset the environment and return a new initial state for the first agent.
- Returns:
The initial state for the next episode.
- Return type:
all.core.MultiagentState
- seed(seed)
Set a random seed for the environment.
- property state
The State for the current agent.
- state_space(agent_id)
The state space for the given agent.
- step(action)
Apply an action for the current agent and get the multiagent state for the next agent.
- Parameters:
action – The Action for the current agent and timestep.
- Returns:
The state for the next agent.
- Return type:
all.core.MultiagentState
- class all.environments.MultiagentEnvironment
Bases:
ABC
A multiagent reinforcement learning Environment.
The Multiagent variant of the Environment object. An Environment defines the dynamics of a particular problem: the states, the actions, the transitions between states, and the rewards given to the agent. Environments are often used to benchmark reinforcement learning agents, or to define real problems that the user hopes to solve using reinforcement learning.
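The methods below define a turn-based protocol: agent_iter() yields the acting agent, last() fetches that agent's state, and step() applies its action. A minimal loop sketch (select_action is a hypothetical policy function; the None action for done agents follows the PettingZoo convention):

    def run_turns(env, select_action):
        """Drive one turn-based rollout of a MultiagentEnvironment."""
        env.reset()
        for agent in env.agent_iter():
            state = env.last()  # MultiagentState for the current agent
            # done agents take a None action, per the PettingZoo convention
            action = None if env.is_done(agent) else select_action(agent, state)
            env.step(action)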
- abstract action_space(agent_id)
The action space for the given agent.
- abstract agent_iter()
Create an iterable in which the next element is always the name of the agent whose turn it is to act.
- Returns:
An Iterable over Agent strings.
- abstract close()
Clean up any extraneous environment objects.
- abstract property device
The torch device the environment lives on.
- abstract is_done(agent)
Determine whether a given agent is done.
- Parameters:
agent (str) – The name of the agent.
- Returns:
A boolean representing whether the given agent is done.
- abstract last()
Get the MultiagentState object for the current agent.
- Returns:
The all.core.MultiagentState object for the current agent.
- abstract property name
The name of the environment.
- Type:
str
- observation_space(agent_id)
Alias for MultiagentEnvironment.state_space(agent_id).
- abstract render(**kwargs)
Render the current environment state.
- abstract reset()
Reset the environment and return a new initial state for the first agent.
- Returns:
The initial state for the next episode.
- Return type:
all.core.MultiagentState
- property state
The State for the current agent.
- abstract state_space(agent_id)
The state space for the given agent.
- abstract step(action)
Apply an action for the current agent and get the multiagent state for the next agent.
- Parameters:
action – The Action for the current agent and timestep.
- Returns:
The state for the next agent.
- Return type:
all.core.MultiagentState
- class all.environments.MultiagentPettingZooEnv(zoo_env, name, device='cuda')
Bases:
MultiagentEnvironment
A wrapper for general PettingZoo environments (see: https://www.pettingzoo.ml/).
This wrapper converts the output of the PettingZoo environment to PyTorch tensors, and wraps them in a State object that can be passed to an Agent.
- Parameters:
zoo_env (AECEnv) – A PettingZoo AECEnv environment (e.g. pettingzoo.mpe.simple_push_v2)
device (optional) – the device on which tensors will be stored
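A construction sketch using the docstring's example module (the module path and version suffix depend on the installed PettingZoo release):

    from pettingzoo.mpe import simple_push_v2
    from all.environments import MultiagentPettingZooEnv

    zoo_env = simple_push_v2.env()  # a PettingZoo AECEnv
    env = MultiagentPettingZooEnv(zoo_env, "simple_push_v2", device="cpu")
    env.reset()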
- action_space(agent_id)
The action space for the given agent.
- agent_iter()
Create an iterable in which the next element is always the name of the agent whose turn it is to act.
- Returns:
An Iterable over Agent strings.
- property agent_selection
- close()
Clean up any extraneous environment objects.
- property device
The torch device the environment lives on.
- duplicate(n)
Create n copies of this environment.
- is_done(agent)
Determine whether a given agent is done.
- Parameters:
agent (str) – The name of the agent.
- Returns:
A boolean representing whether the given agent is done.
- last()
Get the MultiagentState object for the current agent.
- Returns:
The all.core.MultiagentState object for the current agent.
- property name
The name of the environment.
- Type:
str
- observation_space(agent_id)
Alias for MultiagentEnvironment.state_space(agent_id).
- render(**kwargs)
Render the current environment state.
- reset(**kwargs)
Reset the environment and return a new initial state for the first agent.
- Returns:
The initial state for the next episode.
- Return type:
all.core.MultiagentState
- seed(seed)
Set a random seed for the environment.
- property state
The State for the current agent.
- state_space(agent_id)
The state space for the given agent.
- step(action)
Apply an action for the current agent and get the multiagent state for the next agent.
- Parameters:
action – The Action for the current agent and timestep.
- Returns:
The state for the next agent.
- Return type:
all.core.MultiagentState
- class all.environments.PybulletEnvironment(name, **kwargs)
Bases:
GymEnvironment
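A construction sketch using one of the short names from the short_names table documented below (that the constructor resolves short names this way is an assumption based on the table):

    from all.environments import PybulletEnvironment

    # "cheetah" resolves to "HalfCheetahBulletEnv-v0" per short_names
    env = PybulletEnvironment("cheetah")
    state = env.reset()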
- property action_space
The Space representing the range of possible actions.
- Returns:
An object of type Space that represents possible actions the agent may take
- Return type:
Space
- close()
Clean up any extraneous environment objects.
- property device
The torch device the environment lives on.
- duplicate(n)
Create n copies of this environment.
- property env
- property name
The name of the environment.
- property observation_space
Alias for Environment.state_space.
- Returns:
An object of type Space that represents possible states the agent may observe
- Return type:
Space
- render(**kwargs)
Render the current environment state.
- reset(**kwargs)
Reset the environment and return a new initial state.
- Returns:
The initial state for the next episode.
- Return type:
State
- seed(seed)
Set a random seed for the environment.
- short_names = {'ant': 'AntBulletEnv-v0', 'cheetah': 'HalfCheetahBulletEnv-v0', 'hopper': 'HopperBulletEnv-v0', 'humanoid': 'HumanoidBulletEnv-v0', 'walker': 'Walker2DBulletEnv-v0'}
- property state
The State of the Environment at the current timestep.
- property state_space
The Space representing the range of observable states.
- Returns:
An object of type Space that represents possible states the agent may observe
- Return type:
Space
- step(action)
Apply an action and get the next state.
- Parameters:
action (Action) – The action to apply at the current time step.
- Returns:
all.environments.State – The State of the environment after the action is applied. This State object includes both the done flag and any additional “info”
float – The reward achieved by the previous action
- class all.environments.VectorEnvironment
Bases:
ABC
A reinforcement learning vector Environment.
Similar to a regular RL environment, except that the observations, rewards, and dones of many environments are stacked together, and step() expects one action per sub-environment.
Also, since sub-environments finish at different times, you do not need to reset them manually; the vector environment automatically resets each sub-environment when its episode is done.
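A rollout sketch, assuming a discrete action space with an n attribute and that step() accepts one action per sub-environment:

    import torch

    def rollout(env, steps=100):
        """Step a VectorEnvironment with random discrete actions."""
        states = env.reset()
        for _ in range(steps):
            # one random action per sub-environment; substitute a parallel agent
            actions = torch.randint(0, env.action_space.n, (env.num_envs,))
            states = env.step(actions)
        return states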
- abstract property action_space
The Space representing the range of possible actions for each environment.
- Returns:
An object of type Space that represents possible actions the agent may take
- Return type:
Space
- abstract close()
Clean up any extraneous environment objects.
- abstract property device
The torch device the environment lives on.
- abstract property name
The name of the environment.
- abstract property num_envs
Number of environments in the vector. This is the number of actions step() expects as input and the number of observations, dones, etc. returned by the environment.
- property observation_space
Alias for Environment.state_space.
- Returns:
An object of type Space that represents possible states the agent may observe
- Return type:
Space
- abstract reset()
Reset the environment and return a new initial state.
- Returns:
The initial state for the next episode.
- Return type:
StateArray
- abstract property state_array
A StateArray of the Environments at the current timestep.
- abstract property state_space
The Space representing the range of observable states for each environment.
- Returns:
An object of type Space that represents possible states the agent may observe
- Return type:
Space
- abstract step(action)
Apply an action and get the next state.
- Parameters:
action – The actions to apply at the current time step, one per sub-environment.
- Returns:
all.environments.StateArray – The StateArray of the environments after the actions are applied. This StateArray object includes both the done flags and any additional “info”
torch.Tensor – The rewards achieved by the previous actions