Getting Started


The Autonomous Learning Library requires a recent version of PyTorch (at least v2.2.0 is recommended). Additionally, TensorBoard is required in order to enable logging. We also strongly recommend using a machine with a fast GPU with at least 11 GB of VRAM (a GTX 1080 Ti or better is preferred).


The autonomous-learning-library can be installed from PyPI using pip:

pip install autonomous-learning-library

This will only install the core library. If you want to install all included environments, run:

pip install autonomous-learning-library[all]

You can also install only a subset of the environments. For the list of optional dependencies, take a look at the

An alternative approach that may be useful when following this tutorial is to install by cloning the GitHub repository:

git clone
cd autonomous-learning-library
pip install -e .[dev]

The dev extra installs all of the optional dependencies for developers of the repo, such as unit test dependencies, as well as all environments. If you chose to clone the repository, you can test your installation by running the unit test suite:

make test

This should also tell you whether CUDA (NVIDIA's GPU computing platform, which PyTorch uses for GPU acceleration) is available.
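If you installed from PyPI and have no test suite to run, CUDA availability can also be checked directly from Python. The following is a minimal sketch, not part of the library: the helper name describe_torch_install is hypothetical, and it relies only on the standard torch.cuda.is_available() and torch.cuda.get_device_name() calls from PyTorch.

```python
import importlib.util


def describe_torch_install():
    """Return a one-line summary of the local PyTorch/CUDA setup.

    Hypothetical helper for illustration; not part of the library.
    """
    # Check for PyTorch first so a missing install does not raise ImportError.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"

    import torch

    if torch.cuda.is_available():
        # Report the name of the first visible GPU.
        gpu_name = torch.cuda.get_device_name(0)
        return f"PyTorch {torch.__version__}, CUDA available: {gpu_name}"
    return f"PyTorch {torch.__version__}, CUDA not available (CPU only)"


if __name__ == "__main__":
    print(describe_torch_install())
```

If this reports that CUDA is not available on a machine with a GPU, the usual suspects are a CPU-only PyTorch build or a driver mismatch.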

Running a Preset Agent

The goal of the Autonomous Learning Library is to provide components for building new agents. However, the library also includes a number of “preset” agent configurations for easy benchmarking and comparison, as well as some useful scripts. For example, an a2c agent can be run on CartPole as follows:

all-classic CartPole-v0 a2c

The results will be written to runs/a2c_CartPole-v0_<DATETIME>, where <DATETIME> is generated by the library. You can view these results and other information through TensorBoard:

tensorboard --logdir runs

By opening your browser to http://localhost:6006, you should see a dashboard that looks something like the following (you may need to adjust the “smoothing” parameter):


If you want to compare agents in a nicer format, you can use the plot script:

all-plot --logdir runs

This should give you a plot similar to the following:


In this plot, one point is drawn every 100 episodes, and each point represents the average episodic return over the preceding 100 episodes. The shaded region represents the standard deviation of the returns over that same interval.
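To make the aggregation concrete, here is a small sketch of how such points could be computed from a list of episodic returns. It is an illustration only, not the library's plotting code: the function name summarize_returns is hypothetical, the use of the population standard deviation is an assumption, and the returns in the example are made up.

```python
from statistics import mean, pstdev


def summarize_returns(returns, window=100):
    """Summarize returns in consecutive windows.

    Returns a list of (episode, mean, std) tuples, one per full window.
    Hypothetical helper; any trailing partial window is ignored.
    """
    points = []
    for end in range(window, len(returns) + 1, window):
        chunk = returns[end - window:end]
        # Assumption: population standard deviation over the window.
        points.append((end, mean(chunk), pstdev(chunk)))
    return points


if __name__ == "__main__":
    # Made-up returns for illustration only (200 episodes).
    fake_returns = [10.0] * 50 + [20.0] * 50 + [30.0] * 100
    for episode, avg, std in summarize_returns(fake_returns):
        print(f"episode {episode}: mean={avg:.1f}, std={std:.1f}")
```

Each tuple corresponds to one plotted point: the mean gives the height of the line and the standard deviation gives the half-width of the shaded band.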

Finally, to watch the trained model in action, we provide a watch script for each preset module:

all-watch-classic CartPole-v0 runs/a2c_CartPole-v0_<DATETIME>/

You need to find the <DATETIME> value for your run by checking the runs directory.
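Looking up the run directory can also be scripted. The sketch below assumes only the naming pattern shown above (a directory under runs whose name starts with a2c_CartPole-v0_) and picks the most recently modified match. The helper name latest_run_dir is hypothetical and not part of the library.

```python
from pathlib import Path


def latest_run_dir(runs_root="runs", prefix="a2c_CartPole-v0_"):
    """Return the most recently modified run directory matching prefix.

    Hypothetical helper; returns None if no matching directory exists.
    """
    root = Path(runs_root)
    if not root.is_dir():
        return None
    # Keep only subdirectories whose names start with the agent/environment prefix.
    candidates = [p for p in root.iterdir() if p.is_dir() and p.name.startswith(prefix)]
    if not candidates:
        return None
    # Pick the directory with the latest modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    print(latest_run_dir())
```

The printed path can then be passed as the last argument to the watch script.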

Each of these scripts can be found in the scripts directory of the main repository. Be sure to check out the all-atari and all-mujoco scripts for more fun!