Slurm Search

Slurm Search is a library for expressing, running, and managing distributed experiments on a cluster. It comprises a storage system, a job management tool, an experiment management system, hyperparameter tuning algorithms, and an integration with a reinforcement learning library.

Installation

Install autonomous-learning-library and PyTorch before installing this package. Then run the following commands from the repository root:

pip install .
mkdir -p ~/hyperparameters/locks # Locks assume their directory already exists.

Usage

Experiments are specified in slurm_search/experiments/. Using the run_exp command, you may:

# Launch an experiment with a series of override parameters.
run_exp hp_tuning_effects --agent=atari:a2c --env=atari:Breakout --search:threads=12

# Resume an interrupted experiment. This also reruns the result analysis, e.g. after the analysis code has been updated.
run_exp hp_tuning_effects --agent=atari:a2c --env=atari:Breakout --search:threads=12 --resume=exp:my_exp_name

# Display the experiment AST visually.
run_exp hp_tuning_effects --display-ast=true # Regular
run_exp hp_tuning_effects --display-ast=abstract # Abstract equations

Components

This package has the following parts:

A pickle-based object database that supports safe concurrent access across a cluster.
A session management system that integrates with the Slurm job scheduler to execute tasks in parallel across a cluster.
Hyperopt integrations for parallel hyperparameter tuning searches.
A Python experiment DSL, with tools for configuring, running, and resuming experiments and for analyzing their results.
An integration with the Autonomous Learning Library for running reinforcement learning experiments.

Files

Core:

experiments/: Experiment specifications, domain-of-interest integrations, and analysis tools.
display_experiment.py: Experiment management CLI tool.
experiment.py: Experiment DSL.
slurm_search.py: Slurm integration and the 'ssearch' CLI tool for session and state management.
search_session.py: Parallelized search session interface.
session_state.py: Transactional string-identified state store (uses locking and pickle).
locking.py: String-identified mutex (uses file locks).
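
To illustrate how a string-identified file lock and a pickle-backed state store can fit together, here is a minimal sketch. It is not the code in locking.py or session_state.py: the names named_lock and update_state, the ".lock" and ".pkl" suffixes, and the directory arguments are all hypothetical, and the sketch assumes a POSIX system where fcntl.flock is available. Whether flock is reliable across nodes depends on the cluster's shared filesystem.

```python
import fcntl
import os
import pickle
import tempfile
from contextlib import contextmanager


@contextmanager
def named_lock(name, lock_dir):
    """Hold an exclusive lock identified by `name` (hypothetical helper)."""
    # The lock directory must already exist, as with ~/hyperparameters/locks.
    path = os.path.join(lock_dir, name + ".lock")
    with open(path, "a") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # Blocks until the lock is free.
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def update_state(name, update, state_dir, lock_dir, default=None):
    """Read, modify, and write back the pickled object stored under `name`."""
    path = os.path.join(state_dir, name + ".pkl")
    with named_lock(name, lock_dir):
        if os.path.exists(path):
            with open(path, "rb") as f:
                state = pickle.load(f)
        else:
            state = default
        state = update(state)
        # Write to a temporary file and rename it into place, so a reader
        # never observes a partially written pickle.
        fd, tmp_path = tempfile.mkstemp(dir=state_dir)
        with os.fdopen(fd, "wb") as f:
            pickle.dump(state, f)
        os.replace(tmp_path, path)
    return state
```

Because the whole read-modify-write cycle runs while the lock is held, two workers that both call update_state("counter", lambda n: n + 1, ...) cannot lose each other's increments.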

Utilities:

params.py: Tools for handling parameter dict trees.
random_phrase.py: Generate random phrases / names for object identifiers.
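
As a rough illustration of what a "parameter dict tree" could look like, the sketch below turns override flags such as --search:threads=12 into a nested dict and merges them over defaults. The helper names parse_overrides and merge_trees are hypothetical, and reading the colon in a key as a path separator is an assumption inferred from the usage examples; params.py may behave differently.

```python
def parse_overrides(args):
    """Turn flags like '--search:threads=12' into a nested dict (hypothetical helper)."""
    tree = {}
    for arg in args:
        key, _, value = arg.lstrip("-").partition("=")
        # Assumption: colons in the key separate levels of the tree.
        *parents, leaf = key.split(":")
        node = tree
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value  # Values stay strings; no type inference is attempted.
    return tree


def merge_trees(base, overrides):
    """Recursively merge `overrides` into a copy of `base`."""
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_trees(merged[key], value)
        else:
            merged[key] = value
    return merged


overrides = parse_overrides(["--agent=atari:a2c", "--search:threads=12"])
# overrides == {"agent": "atari:a2c", "search": {"threads": "12"}}
```

Only the key is split on colons, so a value such as atari:a2c is kept intact.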
