Commit

Merge pull request #10 from JGCRI/develop
Releasing v0.6.0
sash19 authored Nov 13, 2024
2 parents 75023a5 + 74f96f0 commit 4bc8b12
Showing 30 changed files with 682 additions and 429 deletions.
Empty file modified .gitattributes
100644 → 100755
Empty file.
27 changes: 27 additions & 0 deletions .github/workflows/documentation.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
name: documentation

on: [push, pull_request, workflow_dispatch]

permissions:
  contents: write

jobs:
  docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
      - name: Install dependencies
        run: |
          pip install sphinx sphinx_rtd_theme myst_parser
      - name: Sphinx build
        run: |
          sphinx-build docs _build
      - name: Deploy to GitHub Pages
        uses: peaceiris/actions-gh-pages@v3
        if: ${{ github.event_name == 'push' && github.ref == 'refs/heads/main' }}
        with:
          publish_branch: gh-pages
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: _build/
          force_orphan: true
1 change: 1 addition & 0 deletions .gitignore
100644 → 100755
@@ -8,5 +8,6 @@ communicator/src/communicator
containers/
config_dict.yaml
cache/
docs/_build/
logs/
.vscode/
22 changes: 22 additions & 0 deletions .readthedocs.yaml
@@ -0,0 +1,22 @@
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Set the OS, Python version, and other tools you might need
build:
  os: ubuntu-22.04
  tools:
    python: "3.12"

# Build documentation in the "docs/" directory with Sphinx
sphinx:
  configuration: docs/conf.py

# Optionally, but recommended,
# declare the Python requirements required to build your documentation
# See https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html
python:
  install:
    - requirements: docs/requirements.txt
Empty file modified LICENSE.md
100644 → 100755
Empty file.
14 changes: 7 additions & 7 deletions README.md
100644 → 100755
@@ -1,5 +1,5 @@
# Scalable
[v0.5.7](https://github.com/JGCRI/scalable/tree/0.5.7)
[v0.6.0](https://github.com/JGCRI/scalable/tree/0.6.0)

Scalable is a Python library that helps run complex workflows on HPC systems by orchestrating multiple containers, requesting appropriate jobs from the HPC scheduler, and providing a Python environment for distributed computing. It is designed primarily for use with JGCRI climate models but can be easily adapted to other uses.

@@ -80,9 +80,9 @@ cluster.add_container(tag="osiris", cpus=8, memory="20G", dirs={"/rcfs/projects/
Before launching workers, the worker (container) targets need to be configured. Each container to be launched as a worker must first be added by specifying its tag, the number of CPU cores it needs, the amount of memory it needs, and the directories on the HPC host to bind into the container so that they are accessible from inside it.

```python
cluster.add_worker(n=3, tag="gcam")
cluster.add_worker(n=2, tag="stitches")
cluster.add_worker(n=3, tag="osiris")
cluster.add_workers(n=3, tag="gcam")
cluster.add_workers(n=2, tag="stitches")
cluster.add_workers(n=3, tag="osiris")
```

Workers are launched simply by adding them to the cluster. The call succeeds only if a container with the same tag has been added beforehand. Removing workers is just as easy.
@@ -156,12 +156,12 @@ def func3(param):
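
The ordering rule described above (a container must be registered under a tag before workers with that tag can be added) can be illustrated with a toy model. This is only a sketch: `ToyCluster` is a hypothetical stand-in and not the library's `SlurmCluster`, and raising `KeyError` for an unknown tag is an assumption made for the illustration, not documented behavior.

```python
class ToyCluster:
    """Hypothetical stand-in that only models the tag bookkeeping."""

    def __init__(self):
        self.containers = {}  # tag -> requested resources
        self.workers = {}     # tag -> number of workers currently added

    def add_container(self, tag, cpus, memory):
        # Register the resources a worker with this tag would request.
        self.containers[tag] = {"cpus": cpus, "memory": memory}
        self.workers.setdefault(tag, 0)

    def add_workers(self, n, tag):
        # Assumption: an unknown tag is treated as an error.
        if tag not in self.containers:
            raise KeyError(f"no container registered for tag {tag!r}")
        self.workers[tag] += n

    def remove_workers(self, n, tag):
        if tag not in self.containers:
            raise KeyError(f"no container registered for tag {tag!r}")
        # Never go below zero workers.
        self.workers[tag] = max(0, self.workers[tag] - n)


cluster = ToyCluster()
cluster.add_container(tag="gcam", cpus=10, memory="20G")
cluster.add_workers(n=3, tag="gcam")
cluster.remove_workers(n=1, tag="gcam")
```

In this toy model, calling `cluster.add_workers(n=2, tag="stitches")` before a matching `add_container` call would fail, which mirrors the requirement stated in the paragraph above.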

```

In the example above, the functions wait 5, 3, and 10 seconds the first time they are computed. Because of the decorator, their results are cached, so if the functions are run again with the same arguments, the results are returned from memory and the functions do not sleep. Arguments can also be given directly to the cacheable decorator. **It is always recommended to specify the return type and the type of arguments for each use.** This ensures that the module functions as expected and that caching is correct. --TODO--
In the example above, the functions wait 5, 3, and 10 seconds the first time they are computed. Because of the decorator, their results are cached, so if the functions are run again with the same arguments, the results are returned from memory and the functions do not sleep. Arguments can also be given directly to the cacheable decorator. **It is always recommended to specify the return type and the type of arguments for each use.** This ensures that the module functions as expected and that caching is correct.
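
The caching behavior described above (slow on the first call, immediate on a repeat call with the same arguments) can be sketched with a plain in-memory memoizing decorator. This is a generic illustration under stated assumptions, not the implementation of `cacheable`: the `memoize` decorator and `slow_square` function are hypothetical names, and the sketch ignores argument and return types entirely.

```python
import functools
import time


def memoize(func):
    """Hypothetical in-memory cache keyed on positional arguments."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            # First call with these arguments: run the function and store the result.
            cache[args] = func(*args)
        return cache[args]

    return wrapper


@memoize
def slow_square(x):
    time.sleep(0.2)  # stands in for an expensive computation
    return x * x


start = time.perf_counter()
first = slow_square(4)
first_elapsed = time.perf_counter() - start

start = time.perf_counter()
second = slow_square(4)  # same argument: served from the cache, no sleep
second_elapsed = time.perf_counter() - start
```

Here `first_elapsed` includes the sleep, while `second_elapsed` covers only a dictionary lookup.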

## Contact

For any contribution, questions, or requests, please feel free to [open an issue](https://github.com/JGCRI/scalable/issues) or contact us directly:
**Shashank Lamba** [shashank.lamba@pnnl.gov](mailto:shashank.lamba@pnnl.gov)
For any contribution, questions, or requests, please feel free to [open an issue](https://github.com/JGCRI/scalable/issues) or contact us directly:\
**Shashank Lamba** [shashank.lamba@pnnl.gov](mailto:shashank.lamba@pnnl.gov)\
**Pralit Patel** [pralit.patel@pnnl.gov](mailto:pralit.patel@pnnl.gov)

## [License](https://github.com/JGCRI/scalable/blob/master/LICENSE.md)
29 changes: 19 additions & 10 deletions communicator/src/communicator.go
100644 → 100755
@@ -17,26 +17,28 @@ import (
// Changing CONNECTION_TYPE is not recommended

const (
DEFAULT_HOST = "0.0.0.0"
DEFAULT_PORT = "1919"
CONNECTION_TYPE = "tcp"
DEFAULT_HOST = "0.0.0.0"
DEFAULT_PORT = "1919"
CONNECTION_TYPE = "tcp"
NUM_PORT_RETRIES = 5
)

var BUFFER_LEN = 5120

func main() {
arguments := os.Args[1:]
listen_port := DEFAULT_PORT
if len(arguments) > 1 {
listen_port = arguments[1]
} else if len(arguments) == 0 {
fmt.Println("Either -s or -c option needed")
argslen := len(arguments)
if argslen == 0 {
fmt.Println("Either -s or -c option needed. Use -h for help.")
gracefulExit()
} else if argslen > 1 {
listen_port = arguments[1]
}
if arguments[0] == "-s" {
loop := 0
server, err := net.Listen(CONNECTION_TYPE, DEFAULT_HOST+":"+listen_port)
for err != nil && loop < 5 && len(arguments) <= 1 {
for err != nil && loop < NUM_PORT_RETRIES && argslen <= 1 {
listen_port = strconv.Itoa(rand.Intn(40000-2000) + 2000)
server, err = net.Listen(CONNECTION_TYPE, DEFAULT_HOST+":"+listen_port)
loop++
@@ -119,6 +121,13 @@ func main() {
received += read
}
fmt.Print(output.String())
} else if arguments[0] == "-h" {
fmt.Println("Usage: communicator [OPTION] [PORT]")
fmt.Println("Options:")
fmt.Println(" -s\t\tStart server")
fmt.Println(" -c\t\tStart client")
fmt.Println(" -h\t\tShow help")
fmt.Println("PORT is optional and defaults to 1919 for server use.")
} else {
fmt.Println("Invalid option")
gracefulExit()
@@ -141,7 +150,7 @@ func handleRequest(client net.Conn) {
buffer := make([]byte, BUFFER_LEN)
received := 0
flag := 0
for received < len(lenBuffer) {
for received < len(lenBuffer) && received < BUFFER_LEN {
read, err := client.Read(lenBuffer[received:])
if err != nil {
clientClose(err, client)
@@ -230,7 +239,7 @@ func handleRequest(client net.Conn) {
return
}
sent = 0
for sent < len(lastOutput.String()) {
for sent < len(lastOutput.String()) && sent < BUFFER_LEN {
wrote, err := client.Write(([]byte(lastOutput.String()))[sent:])
if err != nil {
clientClose(err, client)
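
The server port selection in the `communicator.go` hunks above (try the requested or default port, and fall back to up to `NUM_PORT_RETRIES` random ports in the range 2000 to 39999 only when no port was passed explicitly) can be sketched in Python as follows. This is a sketch of the control flow only: `choose_port` and the injected `try_listen` callable are hypothetical names introduced for illustration, and no real socket is opened.

```python
import random

DEFAULT_PORT = 1919
NUM_PORT_RETRIES = 5


def choose_port(try_listen, explicit_port=None, rng=random):
    """Return a port that try_listen accepted, or None if every attempt failed.

    try_listen is a hypothetical callable: port -> bool (True if listening worked).
    """
    port = explicit_port if explicit_port is not None else DEFAULT_PORT
    ok = try_listen(port)
    retries = 0
    # Random fallback applies only when the caller did not ask for a specific port.
    while not ok and retries < NUM_PORT_RETRIES and explicit_port is None:
        port = rng.randrange(2000, 40000)  # same range as rand.Intn(40000-2000) + 2000
        ok = try_listen(port)
        retries += 1
    return port if ok else None


busy = {1919}  # pretend the default port is already taken
fallback_port = choose_port(lambda p: p not in busy)
explicit_result = choose_port(lambda p: p not in busy, explicit_port=1919)
```

With the default port busy, `fallback_port` is a random port from the fallback range, while `explicit_result` is `None` because an explicitly requested port is never replaced.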
20 changes: 20 additions & 0 deletions docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build

# Put it first so that "make" without argument is like "make help".
help:
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
Empty file added docs/_static/custom.css
Empty file.
17 changes: 17 additions & 0 deletions docs/caching.rst
@@ -0,0 +1,17 @@
Caching
=======

.. autofunction:: scalable.cacheable

.. autoclass:: scalable.GenericType
   :exclude-members: __init__

.. autoclass:: scalable.FileType

.. autoclass:: scalable.DirType

.. autoclass:: scalable.ValueType

.. autoclass:: scalable.ObjectType

.. autoclass:: scalable.UtilityType
45 changes: 45 additions & 0 deletions docs/conf.py
@@ -0,0 +1,45 @@
# Configuration file for the Sphinx documentation builder.
#
# For the full list of built-in configuration values, see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

import os
import sys

sys.path.insert(0, os.path.abspath('..'))
sys.path.insert(0, os.path.abspath('../scalable'))

# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information

project = 'Scalable'
copyright = '2024, Joint Global Change Research Institute'
author = 'Shashank Lamba, Pralit Patel'
release = '0.6.0'

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration

extensions = ["sphinx.ext.autodoc", "sphinx.ext.todo", "sphinx.ext.viewcode", "sphinx.ext.napoleon"]

templates_path = ['_templates']
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']

autodoc_default_options = {
    'members': True,
    'undoc-members': True,
    'private-members': False,
    'special-members': '__init__',
    'inherited-members': False,
    'show-inheritance': False,
    'no-index': True,
}

# add_module_names = False

# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

html_theme = 'sphinx_rtd_theme'
html_static_path = ['_static']
html_css_files = ['custom.css']
15 changes: 15 additions & 0 deletions docs/functions.rst
@@ -0,0 +1,15 @@
Submitting Functions
====================

.. autoclass:: scalable.ScalableClient
   :exclude-members: submit, map, get_versions, cancel, close

.. autofunction:: scalable.ScalableClient.submit

.. autofunction:: scalable.ScalableClient.map

.. autofunction:: scalable.ScalableClient.get_versions

.. autofunction:: scalable.ScalableClient.cancel

.. autofunction:: scalable.ScalableClient.close
Binary file added docs/images/scalable_architecture.png
48 changes: 48 additions & 0 deletions docs/index.rst
@@ -0,0 +1,48 @@
.. Scalable documentation master file, created by
   sphinx-quickstart on Thu Aug 22 10:55:42 2024.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Scalable Documentation
======================

Scalable is a Python library for running complex workflows on HPC systems
efficiently and with minimal manual intervention. It uses a dask backend and a
range of custom programs to achieve this. The figure below shows the general
architecture of Scalable.

.. image:: images/scalable_architecture.png
   :align: center

These questions can help you decide whether Scalable would be useful for you:

* Does your workflow run on an HPC system and take a significant amount of time?
* Does your workflow involve pipelines, where outputs from certain functions or
  models are passed as inputs to other functions or models?
* Do you want the hardware allocation to be done automatically?


Scalable could be useful if the answer to one or more of the above questions
is yes. To run functions under different environments, Docker containers can
be used. A Dockerfile with multiple targets can be used to make multiple
containers, each with different installed libraries and models. When adding
workers to the cluster, you can specify how many workers of each type should
be added.

Contents:
---------

.. toctree::
   :maxdepth: 1

   workers

.. toctree::
   :maxdepth: 1

   caching

.. toctree::
   :maxdepth: 1

   functions
35 changes: 35 additions & 0 deletions docs/make.bat
@@ -0,0 +1,35 @@
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=.
set BUILDDIR=_build

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
echo.installed, then set the SPHINXBUILD environment variable to point
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.https://www.sphinx-doc.org/
exit /b 1
)

if "%1" == "" goto help

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd
2 changes: 2 additions & 0 deletions docs/requirements.txt
@@ -0,0 +1,2 @@
sphinx-rtd-theme
scalable
16 changes: 16 additions & 0 deletions docs/workers.rst
@@ -0,0 +1,16 @@
Worker Management
=================


.. autoclass:: scalable.SlurmCluster
   :exclude-members: close, job_cls, set_default_request_quantity

.. autofunction:: scalable.SlurmCluster.add_container

.. autofunction:: scalable.SlurmCluster.add_workers

.. autofunction:: scalable.SlurmCluster.remove_workers

.. autofunction:: scalable.SlurmCluster.close

.. autofunction:: scalable.SlurmCluster.set_default_request_quantity
3 changes: 2 additions & 1 deletion pyproject.toml
100644 → 100755
@@ -25,6 +25,8 @@ dependencies = [
"joblib >= 1.3.2",
"xxhash >= 3.4.1",
"versioneer >= 0.29",
"numpy >= 1.26.4",
"pandas >= 2.2.3"
]
classifiers = [
"Development Status :: 4 - Beta",
@@ -48,7 +50,6 @@ test = [

[project.urls]
"Github" = "https://github.com/JGCRI/scalable/tree/master/scalable"
"Homepage" = "https://www.pnnl.gov"

[project.scripts]
scalable_bootstrap = "scalable.utilities:run_bootstrap"