To set up a Python environment (use the dev tools of your choice; in our workflow we use conda and Python 3.8), install all the requirements:

```bash
pip install -r requirements.txt
```
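For example, with conda (as in our workflow) the environment can be set up like this; the environment name is arbitrary:

```bash
# Create and activate an isolated environment with Python 3.8, then install deps
conda create -n rebrac python=3.8 -y
conda activate rebrac
pip install -r requirements.txt
```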
However, in this setup you must install the mujoco210 binaries by hand. This is not always straightforward, but the following recipe can help:

```bash
mkdir -p /root/.mujoco \
    && wget https://mujoco.org/download/mujoco210-linux-x86_64.tar.gz -O mujoco.tar.gz \
    && tar -xf mujoco.tar.gz -C /root/.mujoco \
    && rm mujoco.tar.gz
export LD_LIBRARY_PATH=/root/.mujoco/mujoco210/bin:${LD_LIBRARY_PATH}
```
You may also need to install additional dependencies for mujoco_py. We recommend following the official guide from mujoco_py.
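On a fresh Ubuntu image, mujoco_py typically also needs a few system packages; the list below is a common set taken from the mujoco_py documentation, and the exact names may differ on other distributions:

```bash
# System libraries commonly required to build mujoco_py (Ubuntu/Debian)
apt-get update && apt-get install -y \
    libgl1-mesa-dev libgl1-mesa-glx libglew-dev \
    libosmesa6-dev libglfw3 patchelf
```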
We also provide a more straightforward option: a Dockerfile that is already set up to work. All you have to do is build and run it :)

```bash
docker build -t rebrac .
```
To run, mount the current directory:

```bash
docker run -it \
    --gpus=all \
    --rm \
    --volume "<PATH_TO_THE_REPO>:/workspace/" \
    --name rebrac \
    rebrac bash
```
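For example, when launching from the repository root, `$(pwd)` can be substituted for the placeholder:

```bash
# Mount the current checkout into the container and drop into a shell
docker run -it --gpus=all --rm --volume "$(pwd):/workspace/" --name rebrac rebrac bash
```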
To reproduce the V-D4RL results, you need to download the corresponding datasets. The easiest way is probably to run the `download_vd4rl.sh` script we provide. You can also do it manually with the following links to the dataset archives. Note that the provided links contain only the datasets reported in the paper, without the distracting and multitask variants.
After downloading the datasets, you must put the data into the `vd4rl` directory.
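A minimal sketch of this step, assuming you run everything from the repository root (how the script stores the data may differ, so adjust the final move accordingly):

```bash
# Fetch the V-D4RL datasets used in the paper via the provided script
bash download_vd4rl.sh

# If you downloaded the archives manually, place the extracted data under vd4rl/
mkdir -p vd4rl
# mv <downloaded_dataset_files> vd4rl/   # keep the original file names
```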
Configs for the main experiments are stored in `configs/rebrac/<task_type>` and `configs/rebrac-vis/<task_type>`.
All available hyperparameters are listed in `src/algorithms/rebrac.py` for D4RL and `src/algorithms/rebrac_torch_vis.py` for V-D4RL.
For example, to start the ReBRAC training process with the D4RL `halfcheetah-medium-v2` dataset, run the following:

```bash
PYTHONPATH=. python3 src/algorithms/rebrac.py --config_path="configs/rebrac/halfcheetah/halfcheetah_medium.yaml"
```
For the V-D4RL `walker_walk-expert-v2` dataset, run the following:

```bash
PYTHONPATH=. python3 src/algorithms/rebrac_torch_vis.py --config_path="configs/rebrac-vis/walker_walk/expert.yaml"
```
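If the scripts parse their config with pyrallis (which the `--config_path` flag suggests), individual fields can usually be overridden from the command line on top of the YAML file; the field name below is illustrative, check the config dataclass in `rebrac.py` for the exact names:

```bash
# Hypothetical single-field override on top of a YAML config (pyrallis-style CLI)
PYTHONPATH=. python3 src/algorithms/rebrac.py \
    --config_path="configs/rebrac/halfcheetah/halfcheetah_medium.yaml" \
    --train_seed=42  # illustrative field name
```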
For better transparency and replication, we release all the experiments (5k+) in the form of Weights & Biases reports.
If you want to replicate results from our work, you can use the configs for Weights & Biases Sweeps provided in the `configs/sweeps` directory. Note that we do not provide the code for IQL and SAC-RND; in our work we relied upon these implementations: IQL (CORL), SAC-RND (original implementation).
| Paper element | Sweeps to run from `configs/sweeps/` |
|---|---|
| Tables 2, 3, 4 | `eval/rebrac_d4rl_sweep.yaml`, `eval/td3_bc_d4rl_sweep.yaml` |
| Table 5 | `eval/rebrac_visual_sweep.yaml` |
| Table 6 | All sweeps from `ablations` |
| Figure 2 | All sweeps from `network_sizes` |
| Hyperparameters tuning | All sweeps from `tuning` |
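These are standard Weights & Biases sweep configs, so, assuming you are logged in to W&B, a sweep can typically be created and run like this (entity, project, and sweep ID are placeholders):

```bash
# Register the sweep with W&B; this prints a sweep ID
wandb sweep configs/sweeps/eval/rebrac_d4rl_sweep.yaml
# Start an agent that pulls runs from that sweep
wandb agent <entity>/<project>/<sweep_id>
```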
We also provide a notebook for reconstructing the graphs in our paper, `eop/ReBRAC_ploting.ipynb`, including performance profiles, probability of improvement, and expected online performance. For your convenience, we repacked the results into `.pickle` files, so you can reuse them for further research and head-to-head comparisons.
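As a rough sketch of how the repacked results can be loaded (the file name below is hypothetical; see the notebook for the actual paths and data layout):

```bash
# Peek into one of the repacked result files (illustrative path and file name)
python3 -c "
import pickle
with open('eop/rebrac_results.pickle', 'rb') as f:  # hypothetical file name
    results = pickle.load(f)
print(type(results))
"
```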
If you use this code for your research, please consider citing the following bibtex:

```bibtex
@article{tarasov2023revisiting,
  title={Revisiting the Minimalist Approach to Offline Reinforcement Learning},
  author={Denis Tarasov and Vladislav Kurenkov and Alexander Nikulin and Sergey Kolesnikov},
  journal={arXiv preprint arXiv:2305.09836},
  year={2023}
}
```