Conversation
dosssman
left a comment
Quite useful addition. Will try to properly use it for the next benchmarks.
Added more documentation. Going to merge this soon.
import argparse
import shlex
import subprocess


def parse_args():
    # fmt: off
    parser = argparse.ArgumentParser()
    parser.add_argument("--env-ids", nargs="+", default=["CartPole-v1", "Acrobot-v1", "MountainCar-v0"],
        help="the ids of the environments to benchmark")
    parser.add_argument("--command", type=str, default="poetry run python cleanrl/ppo.py",
        help="the command to run")
    parser.add_argument("--num-seeds", type=int, default=3,
        help="the number of random seeds")
    parser.add_argument("--workers", type=int, default=0,
        help="the number of eval workers to run benchmark experiments (skips evaluation when set to 0)")
    args = parser.parse_args()
    # fmt: on
    return args


def run_experiment(command: str):
    command_list = shlex.split(command)
    print(f"running {command}")
    fd = subprocess.Popen(command_list)
    return_code = fd.wait()
    assert return_code == 0


if __name__ == "__main__":
    args = parse_args()
    commands = []
    for seed in range(1, args.num_seeds + 1):
        for env_id in args.env_ids:
            commands += [" ".join([args.command, "--env-id", env_id, "--seed", str(seed)])]

    print(commands)

    if args.workers > 0:
        from concurrent.futures import ThreadPoolExecutor

        executor = ThreadPoolExecutor(max_workers=args.workers, thread_name_prefix="cleanrl-benchmark-worker-")
        for command in commands:
            executor.submit(run_experiment, command)
        executor.shutdown(wait=True, cancel_futures=False)
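The nested loop in the script expands the base command into one invocation per (seed, env-id) pair. A minimal standalone illustration (not part of the PR itself, using the script's default values):

```python
# Sketch of the command expansion done by cleanrl_utils.benchmark:
# with 3 seeds and 3 env ids, we get 3 x 3 = 9 full commands.
base_command = "poetry run python cleanrl/ppo.py"
env_ids = ["CartPole-v1", "Acrobot-v1", "MountainCar-v0"]
num_seeds = 3

commands = []
for seed in range(1, num_seeds + 1):
    for env_id in env_ids:
        commands.append(" ".join([base_command, "--env-id", env_id, "--seed", str(seed)]))

print(len(commands))   # -> 9
print(commands[0])     # -> poetry run python cleanrl/ppo.py --env-id CartPole-v1 --seed 1
```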
If it simply spawns multiple processes, why don't we instead write a bash script?
I’m open to using a bash script, but the main benefit here is that the Python code is a bit easier to read and also lets me set a maximum number of workers. I don’t know if I can set a maximum number of workers with bash.
I think you can use `poetry run python cleanrl/ppo.py &` to spawn a new process and run it in the background.
True. The issue is setting a maximum number of workers. If I have 60 commands to run, `poetry run python cleanrl/ppo.py &` will just oversubscribe my CPU.
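For context, the concurrency cap being discussed is exactly what `ThreadPoolExecutor(max_workers=N)` provides: queued tasks wait until a worker frees up. A toy demonstration of this property (not the PR code; the sleep duration and worker count are illustrative):

```python
import time
from concurrent.futures import ThreadPoolExecutor
from threading import Lock

# Track how many tasks run concurrently to show the pool never
# exceeds max_workers, unlike launching every command with `&`.
active, peak = 0, 0
lock = Lock()

def task(_):
    global active, peak
    with lock:
        active += 1
        peak = max(peak, active)
    time.sleep(0.05)  # stand-in for a long-running training command
    with lock:
        active -= 1

with ThreadPoolExecutor(max_workers=3) as executor:
    for i in range(10):
        executor.submit(task, i)

print(peak)  # never exceeds max_workers (3)
```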
Merging now.
Description
This PR introduces utilities that help us run the benchmark experiments more smoothly.
Previously, we relied on a very simple mechanism for running benchmark experiments, such as
cleanrl/benchmark/sac/mujoco.sh
Lines 3 to 7 in 443bb14
Such bash script usage is straightforward but lacks flexibility and configurability. This PR introduces a command such as
OMP_NUM_THREADS=1 python -m cleanrl_utils.benchmark \
    --env-ids CartPole-v1 Acrobot-v1 MountainCar-v0 \
    --command "poetry run python cleanrl/ppo.py --track --cuda False" \
    --num-seeds 3 \
    --workers 3

which will automatically run the experiments with workers=3 subprocesses. A full example can be seen here:
Types of changes
Checklist:
- pre-commit run --all-files passes (required).
- Documentation previewed via mkdocs serve.