diff --git a/README.md b/README.md index 07ed41d55..de428061c 100644 --- a/README.md +++ b/README.md @@ -1,9 +1,9 @@ # Hera - +Hera mascot -Hera is a Python framework for constructing and submitting Argo Workflows. The main goal of Hera is to make the Argo -ecosystem accessible by simplifying workflow construction and submission. +Hera makes Python code easy to orchestrate on Argo Workflows through native Python integrations. It lets you construct and +submit your Workflows entirely in Python. [See the Quick Start guide](https://hera.readthedocs.io/en/stable/walk-through/quick-start/) to start using Hera to orchestrate your Argo Workflows! @@ -13,9 +13,9 @@ The Argo was constructed by the shipwright Argus, and its crew were specially protected by the goddess Hera. ``` -### PyPi stats +### PyPI stats -[![Pypi](https://img.shields.io/pypi/v/hera.svg)](https://pypi.python.org/pypi/hera) +[![PyPI](https://img.shields.io/pypi/v/hera.svg)](https://pypi.python.org/pypi/hera) [![Versions](https://img.shields.io/pypi/pyversions/hera.svg)](https://github.com/argoproj-labs/hera) [![Downloads](https://static.pepy.tech/badge/hera)](https://pepy.tech/project/hera) @@ -128,15 +128,14 @@ w.create() ## Installation +| Source | Command | +|------------------------------------------------------|---------------------------------------------------------------------------------------------------------| +| [PyPI](https://pypi.org/project/hera/) | `pip install hera` | +| [GitHub repo](https://github.com/argoproj-labs/hera) | `python -m pip install git+https://github.com/argoproj-labs/hera --ignore-installed` | + > **Note** Hera went through a name change - from `hera-workflows` to `hera`. This is reflected in the published Python -> package. If you'd like to install versions prior to `5.0.0`, you have to use `hera-workflows`. Hera currently -> publishes releases to both `hera` and `hera-workflows` for backwards compatibility purposes. - -| Source | Command | -|----------------------------------------------------------|------------------------------------------------------------------------------------------------------| -| [PyPi](https://pypi.org/project/hera/) | `pip install hera` | -| [PyPi](https://pypi.org/project/hera-workflows/) | `pip install hera-workflows` | -| [GitHub repo](https://github.com/argoproj-labs/hera) | `python -m pip install git+https://github.com/argoproj-labs/hera --ignore-installed; pip install .` | +> package. If you'd like to install versions prior to `5.0.0`, you should do `pip install hera-workflows<5`. Hera +> currently publishes releases to both `hera` and `hera-workflows` for backwards compatibility purposes. ### Optional dependencies diff --git a/docs/README.md b/docs/README.md deleted file mode 100644 index 6d636668c..000000000 --- a/docs/README.md +++ /dev/null @@ -1,180 +0,0 @@ -# Hera - - - -Hera is a Python framework for constructing and submitting Argo Workflows. The main goal of Hera is to make the Argo -ecosystem accessible by simplifying workflow construction and submission. - -[See the Quick Start guide](https://hera.readthedocs.io/en/stable/walk-through/quick-start/) to start using Hera to -orchestrate your Argo Workflows! - -```text -The Argo was constructed by the shipwright Argus, -and its crew were specially protected by the goddess Hera. 
-``` - -### PyPi stats - -[![Pypi](https://img.shields.io/pypi/v/hera.svg)](https://pypi.python.org/pypi/hera) -[![Versions](https://img.shields.io/pypi/pyversions/hera.svg)](https://github.com/argoproj-labs/hera) - -[![Downloads](https://static.pepy.tech/badge/hera)](https://pepy.tech/project/hera) -[![Downloads/month](https://static.pepy.tech/badge/hera/month)](https://pepy.tech/project/hera) -[![Downloads/week](https://static.pepy.tech/badge/hera/week)](https://pepy.tech/project/hera) - -### Repo information - -[![License: Apache-2.0](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/license/apache-2-0/) -[![CICD](https://github.com/argoproj-labs/hera/actions/workflows/cicd.yaml/badge.svg)](https://github.com/argoproj-labs/hera/actions/workflows/cicd.yaml) -[![Docs](https://readthedocs.org/projects/hera/badge/?version=latest)](https://hera.readthedocs.io/en/latest/?badge=latest) -[![codecov](https://codecov.io/gh/argoproj-labs/hera/branch/main/graph/badge.svg?token=x4tvsQRKXP)](https://codecov.io/gh/argoproj-labs/hera) - -#### Explore the code - -[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/argoproj-labs/hera) - -[![Open in Gitpod](https://gitpod.io/button/open-in-gitpod.svg)](https://gitpod.io/#https://github.com/argoproj-labs/hera) - -## Hera at a glance - -### Steps diamond - -```python -from hera.workflows import Steps, Workflow, script - - -@script() -def echo(message: str): - print(message) - - -with Workflow( - generate_name="single-script-", - entrypoint="steps", -) as w: - with Steps(name="steps") as s: - echo(name="A", arguments={"message": "I'm a step"}) - with s.parallel(): - echo(name="B", arguments={"message": "We're steps"}) - echo(name="C", arguments={"message": "in parallel!"}) - echo(name="D", arguments={"message": "I'm another step!"}) - -w.create() -``` - -### DAG diamond - -```python -from hera.workflows import DAG, Workflow, script - - -@script() -def echo(message: str): - print(message) - - -with Workflow( - generate_name="dag-diamond-", - entrypoint="diamond", -) as w: - with DAG(name="diamond"): - A = echo(name="A", arguments={"message": "A"}) - B = echo(name="B", arguments={"message": "B"}) - C = echo(name="C", arguments={"message": "C"}) - D = echo(name="D", arguments={"message": "D"}) - A >> [B, C] >> D - -w.create() -``` - -See the [examples](./examples/workflows-examples.md) for a collection of Argo workflow construction and submission via -Hera! - -## Requirements - -Hera requires an Argo server to be deployed to a Kubernetes cluster. Currently, Hera assumes that the Argo server sits -behind an authentication layer that can authenticate workflow submission requests by using the Bearer token on the -request. To learn how to deploy Argo to your own Kubernetes cluster you can follow -the [Argo Workflows](https://argoproj.github.io/argo-workflows/quick-start/) guide! - -Another option for workflow submission without the authentication layer is using port forwarding to your Argo server -deployment and submitting workflows to `localhost:2746` (2746 is the default, but you are free to change it). Please -refer to the documentation of [Argo Workflows](https://argoproj.github.io/argo-workflows/quick-start/) to see the -command for port forward! 
- -> **Note** Since the deprecation of tokens being automatically created for ServiceAccounts and Argo using Bearer tokens -> in place, it is necessary to use `--auth=server` and/or `--auth=client` when setting up Argo Workflows on Kubernetes -> v1.24+ in order for hera to communicate to the Argo Server. - -### Authenticating in Hera - -There are a few ways to authenticate in Hera - read more in the -[authentication walk through](https://hera.readthedocs.io/en/stable/walk-through/authentication/) - for now, with the -`argo` cli tool installed, this example will get you up and running: - -```py -from hera.workflows import Workflow, Container -from hera.shared import global_config -from hera.auth import ArgoCLITokenGenerator - -global_config.host = "http://localhost:2746" -global_config.token = ArgoCLITokenGenerator - -with Workflow(generate_name="local-test-", entrypoint="c") as w: - Container(name="c", image="docker/whalesay", command=["cowsay", "hello"]) - -w.create() -``` - -## Installation - -> **Note** Hera went through a name change - from `hera-workflows` to `hera`. This is reflected in the published Python -> package. If you'd like to install versions prior to `5.0.0`, you have to use `hera-workflows`. Hera currently -> publishes releases to both `hera` and `hera-workflows` for backwards compatibility purposes. - -| Source | Command | -|----------------------------------------------------------|------------------------------------------------------------------------------------------------------| -| [PyPi](https://pypi.org/project/hera/) | `pip install hera` | -| [PyPi](https://pypi.org/project/hera-workflows/) | `pip install hera-workflows` | -| [GitHub repo](https://github.com/argoproj-labs/hera) | `python -m pip install git+https://github.com/argoproj-labs/hera --ignore-installed; pip install .` | - -### Optional dependencies - -#### `yaml` - -- Install via `hera[yaml]` -- [PyYAML](https://pypi.org/project/PyYAML/) is required for the `yaml` output format, which is accessible via - `hera.workflows.Workflow.to_yaml(*args, **kwargs)`. This enables GitOps practices and easier debugging. - -#### `cli` - -- Install via `hera[cli]`. The `[cli]` option installs the extra dependency [Cappa](https://github.com/DanCardin/cappa) - required for the CLI -- The CLI aims to enable GitOps practices, - easier debugging, and a more seamless experience with Argo Workflows. -- **_The CLI is an experimental feature and subject to change!_** At the moment it only supports generating YAML files - from workflows via `hera generate yaml`. See `hera generate yaml --help` for more information. - -#### `experimental` - - Install via `hera[experimental]`. The `[experimental]` option adds dependencies required for experimental features that have not yet graduated into stable features. 
- -## Presentations - -- [KubeCon/ArgoCon EU 2024 - Orchestrating Python Functions Natively in Argo Using Hera](https://www.youtube.com/watch?v=4G3Q6VMBvfI&list=PLj6h78yzYM2NA4NbSC6_mQNza2r3WV87h&index=4) -- [CNCF TAG App-Delivery @ KubeCon NA 2023 - Automating the Deployment of Data Workloads to Kubernetes with ArgoCD, Argo Workflows, and Hera](https://www.youtube.com/watch?v=NZCmYRVziGY&t=12481s&ab_channel=CNCFTAGAppDelivery) -- [KubeCon/ArgoCon NA 2023 - How to Train an LLM with Argo Workflows and Hera](https://www.youtube.com/watch?v=nRYf3GkKpss&ab_channel=CNCF%5BCloudNativeComputingFoundation%5D) - - [Featured code](https://github.com/flaviuvadan/kubecon_na_23_llama2_finetune) -- [KubeCon/ArgoCon EU 2023 - Scaling gene therapy with Argo Workflows and Hera](https://www.youtube.com/watch?v=h2TEw8kd1Ds) -- [DoKC Town Hall #2 - Unsticking ourselves from Glue - Migrating PayIt's Data Pipelines to Argo Workflows and Hera](https://youtu.be/sSLFVIIEKcE?t=2088) -- [Argo Workflows and Events Community Meeting 15 June 2022 - Hera project update](https://youtu.be/sdkBDPOdQ-g?t=231) -- [Argo Workflows and Events Community Meeting 20 Oct 2021 - Hera introductory presentation](https://youtu.be/QETfzfVV-GY?t=181) - -## Blogs - -- [Data Validation with Great Expectations and Argo Workflows](https://towardsdatascience.com/data-validation-with-great-expectations-and-argo-workflows-b8e3e2da2fcc) -- [Hera introduction and motivation](https://www.dynotx.com/hera-the-missing-argo-workflows-python-sdk/) -- [Dyno is scaling gene therapy research with cloud-native tools like Argo Workflows and Hera](https://www.dynotx.com/argo-workflows-hera/) - -## Contributing - -See the [contributing guide](./CONTRIBUTING.md)! diff --git a/docs/README.md b/docs/README.md new file mode 120000 index 000000000..32d46ee88 --- /dev/null +++ b/docs/README.md @@ -0,0 +1 @@ +../README.md \ No newline at end of file diff --git a/docs/contributing/history.md b/docs/contributing/history.md index 89a9b4dcb..6f737a9b7 100644 --- a/docs/contributing/history.md +++ b/docs/contributing/history.md @@ -12,8 +12,9 @@ There have been other libraries available for structuring and submitting Argo Wo While the aforementioned libraries provided amazing functionality for Argo workflow construction and submission, they required an advanced understanding of Argo concepts. When [Dyno Therapeutics](https://dynotx.com) started using Argo Workflows, it was challenging to construct and submit experimental machine learning workflows. Scientists and engineers -at [Dyno Therapeutics](https://dynotx.com) used a lot of time for workflow definition rather than the implementation of -the atomic unit of execution - the Python function - that performed, for instance, model training. +were using a lot of time for workflow definition rather than the implementation of the atomic unit of execution - the +Python function - that performed, for instance, model training. Hera was created to help focus on the unit of execution, +and was released as an open-source library for the benefit of Python Argo Workflows users. Hera presents an intuitive Python interface to the underlying API of Argo, with custom classes making use of context managers and callables, empowering users to focus on their own executable payloads rather than workflow setup. @@ -146,8 +147,6 @@ class DagDiamond(Workflow): ## Hera V5 vs V4 -_Reserving here for Bloomberg history with Argo/Hera._ - Hera v5 is a major release that introduces breaking changes from v4. 
The main reason for this is that v5 is a complete rewrite of the library, and is now based on the OpenAPI specification of Argo Workflows. This allows us to provide a more intuitive interface to the Argo API, while also providing full feature parity with Argo Workflows. This means that diff --git a/docs/user-guides/expr.md b/docs/user-guides/expr.md index ee29e2260..ff485568e 100644 --- a/docs/user-guides/expr.md +++ b/docs/user-guides/expr.md @@ -1,13 +1,17 @@ -# Hera Python -> expr transpiler +# Python → expr transpiler -[**Expr**](https://github.com/antonmedv/expr/blob/master/docs/Language-Definition.md) is an expression evaluation language used by [**Argo**](https://argoproj.github.io/argo-workflows/variables/#expression). +[Expr](https://expr-lang.org/) is a Go-centric expression language used by +[Argo Workflows](https://argoproj.github.io/argo-workflows/variables/#expression). -Hera provides an easy way to construct `expr` expressions in `Python`. It supports the full language definition of `expr` including the enhancements added by `Argo`. +Hera provides an easy way to construct `expr` expressions in Python. It supports the full language definition of `expr` +including the enhancements added by Argo. ## Usage -The recommended way of using the `hera.expr` module is to construct the expression in Python. Once your expressions is ready to be used, -you may call `str()` to convert it to an appropriate `expr` expression. `hera` also supports formatting expressions such that they are surrounded by braces which is useful in Argo when substituting variables.. You can do this via Python string format literals and by adding `$` as a format string. +The recommended way of using the `hera.expr` module is to construct the expression in Python. Once your expressions is +ready to be used, you may call `str()` to convert it to an appropriate `expr` expression. Hera also supports +formatting expressions such that they are surrounded by braces which is useful in Argo when substituting variables. You +can do this via Python string format literals and by adding `$` as a format string. Example: diff --git a/docs/user-guides/script-annotations.md b/docs/user-guides/script-annotations.md index e2a844c36..b897f77e3 100644 --- a/docs/user-guides/script-annotations.md +++ b/docs/user-guides/script-annotations.md @@ -1,9 +1,9 @@ # Script Annotations -Annotation syntax is an experimental feature using `typing.Annotated` for `Parameter`s and `Artifact`s to declare inputs -and outputs for functions decorated as `scripts`. They use `Annotated` as the type in the function parameters and allow -us to simplify writing scripts with parameters and artifacts that require additional fields such as a `description` or -alternative `name`. +Annotation syntax is an experimental feature that uses `typing.Annotated` to declare `Parameters` and `Artifacts` as +metadata on the input and output types of a `script` function. This simplifies script functions with parameters and +artifacts that require additional fields such as a `description`, and allows Hera to automatically infer fields such as +`name` and `path`. This feature must be enabled by setting the `experimental_feature` flag `script_annotations` on the global config. @@ -29,8 +29,8 @@ def echo_all(an_int=1, a_bool=True, a_string="a"): print(a_string) ``` -Notice how the `name` and `default` values are duplicated for each `Parameter`. 
Using annotations, we can rewrite this -as: +Notice how the `name` and `default` values are duplicated for each `Parameter` as Python function parameters. Using +annotations, we can rewrite this as: ```python @script() @@ -44,13 +44,13 @@ def echo_all( print(a_string) ``` -The fields allowed in the `Parameter` annotations are: `name`, `enum`, and `description`. +The fields allowed in the `Parameter` annotations are: `name`, `enum`, and `description`, `name` will be set to the +variable name if not provided (when exporting to YAML or viewing in the Argo UI, the `name` variable will be used). ## Artifacts > Note: `Artifact` annotations are only supported when used with the `RunnerScriptConstructor`. - The feature is even more powerful for `Artifact`s. In Hera we are currently able to specify `Artifact`s in `inputs`, but the given path is not programmatically linked to the code within the function unless defined outside the scope of the function: @@ -70,73 +70,84 @@ def read_artifact(): print(a_file.read()) ``` -By using annotations we can avoid repeating the `path` of the file, and the function can use the variable directly as a -`Path` object, with its value already set to the given path: +By using annotations we can avoid repeating the `path` of the file, and even let let Hera automatically infer the +Artifact's name and create a path for us! (We can still set a custom name and path if we want.) The function can then +use the variable directly as a `Path` object: ```python @script(constructor="runner") -def read_artifact(an_artifact: Annotated[Path, Artifact(name="my-artifact", path="/tmp/file")]): +def read_artifact(an_artifact: Annotated[Path, Artifact(name="my-artifact-name", path="/tmp/my-custom-file-path")]): print(an_artifact.read_text()) ``` -The fields allowed in the `Artifact` annotations are: `name`, `path`, and `loader`. +The fields allowed in the `Artifact` annotations are: `name`, `path`, and `loader`. You are also able to use artifact +repository types such as `S3Artifact` (which are subclasses of `Artifact`) to first fetch the artifact from storage and +mount it to the container at the inferred path (or your custom path). ## Artifact Loaders -In case you want to load an object directly from the `path` of the `Artifact`, we allow two types of loaders besides the -default `Path` behaviour used when no loader is specified. The `ArtifactLoader` enum provides `file` and `json` loaders. +Artifact loaders specify how the Hera Runner should load the Artifact into the Python variable. There are three +different ways that the Hera Runner can set the variable: as the Path to the Artifact, as the string contents of the +Artifact, or as the deserialized JSON object stored in the Artifact. ### `None` loader -With `None` set as the loader (which is by default) in the Artifact annotation, the `path` attribute of `Artifact` is -extracted and used to provide a `pathlib.Path` object for the given argument, which can be used directly in the function -body. The following example is the same as above except for explicitly setting the loader to `None`: + +With `None` set as the loader (which is by default) in the Artifact annotation, the function parameter must be of `Path` +type. The `path` attribute of the `Artifact` is extracted and used to provide the `pathlib.Path` object for the given +argument, which can be used directly in the function body. 
The following example is the same as above except for +explicitly setting the loader to `None`, and letting Hera infer the name and path for us: ```python @script(constructor="runner") -def read_artifact( - an_artifact: Annotated[Path, Artifact(name="my-artifact", path="/tmp/file", loader=None)] -): +def read_artifact(an_artifact: Annotated[Path, Artifact(loader=None)]): print(an_artifact.read_text()) ``` ### `file` loader -When the loader is set to `file`, the function parameter type should be `str`, and will contain the contents string -representation of the file stored at `path` (essentially performing `path.read_text()` automatically): +When the loader is set to `file`, the function parameter type must be of `str` type. The variable will then contain the +contents string representation of the file stored at `path` (essentially performing `path.read_text()` automatically): ```python @script(constructor="runner") -def read_artifact( - an_artifact: Annotated[str, Artifact(name="my-artifact", path="/tmp/file", loader=ArtifactLoader.file)] -) -> str: - return an_artifact +def read_artifact(a_file_artifact: Annotated[str, Artifact(loader=ArtifactLoader.file)]) -> str: + return a_file_artifact ``` -This loads the contents of the file at `"/tmp/file"` to the argument `an_artifact` and subsequently can be used as a -string inside the function. +This loads the contents of the file to the argument `a_file_artifact` and subsequently can be used as a string inside the +function. ### `json` loader When the loader is set to `json`, the contents of the file at `path` are read and parsed to a dictionary via `json.load` -(essentially performing `json.load(path.open())` automatically). By specifying a Pydantic type, this dictionary can even -be automatically parsed to that type: +(essentially performing `json.load(path.open())` automatically). + +```python +@script(constructor="runner") +def read_dict_artifact(dict_artifact: Annotated[dict, Artifact(loader=ArtifactLoader.json)]) -> str: + return dict_artifact["my-key"] +``` + +A dictionary artifact would have no validation on its contents, so having safe code relies on you knowing or manually +validating the keys that exist in it. Instead, by specifying a Pydantic type, the dictionary can be automatically +validated and parsed to that type: ```python class MyArtifact(BaseModel): - a = "a" - b = "b" + a = "hello " + b = "world" @script(constructor="runner") -def read_artifact( - an_artifact: Annotated[MyArtifact, Artifact(name="my-artifact", path="/tmp/file", loader=ArtifactLoader.json)] -) -> str: - return an_artifact.a + an_artifact.b +def read_artifact(my_artifact: Annotated[MyArtifact, Artifact(loader=ArtifactLoader.json)]) -> str: + return my_artifact.a + my_artifact.b ``` -Here, we have a json representation of `MyArtifact` such as `{"a": "hello ", "b": "world"}` stored at `"/tmp/file"`. We -can load it with `ArtifactLoader.json` and then use `an_artifact` as an instance of `MyArtifact` inside the function, so -the function will return `"hello world"`. +Under the hood, this function receives an Artifact with a JSON representation of `MyArtifact`, such as +`{"a": "hello ", "b": "world"}`. We can tell Hera to `json.load` it by setting the `loader` to `ArtifactLoader.json`, +and as the type of `my_artifact` is a `BaseModel` subclass, Hera will try to create an object from the dictionary. Then +we can use `my_artifact` as normal Python inside the function, so the function will return `"hello world"`, which will +be printed to stdout. 
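+To make the loading step concrete, the sketch below mimics what the `json` loader does for a Pydantic-annotated
+parameter: read the artifact file, parse the JSON, and validate the resulting dictionary into the annotated type. The
+helper is purely illustrative, not Hera's internal implementation:
+
+```python
+import json
+from pathlib import Path
+from typing import Type, TypeVar
+
+from pydantic import BaseModel
+
+T = TypeVar("T", bound=BaseModel)
+
+
+class MyArtifact(BaseModel):
+    a: str = "hello "
+    b: str = "world"
+
+
+def load_json_artifact(path: Path, model_type: Type[T]) -> T:
+    # Roughly what ArtifactLoader.json does: json-parse the file's contents,
+    # then validate the dictionary into the annotated Pydantic type.
+    data = json.loads(path.read_text())
+    return model_type.parse_obj(data)  # Pydantic v1-style validation
+
+
+# e.g. for a file containing {"a": "hello ", "b": "world"}:
+# load_json_artifact(Path("/tmp/file"), MyArtifact) == MyArtifact(a="hello ", b="world")
+```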
### Function parameter name aliasing @@ -158,6 +169,7 @@ the function should return a value or tuple. An example can be seen [here](../examples/workflows/experimental/script_annotations_outputs.md). For a simple hello world output artifact example we currently have: + ```python @script(outputs=Artifact(name="hello-artifact", path="/tmp/hello_world.txt")) def hello_world(): @@ -166,13 +178,14 @@ def hello_world(): ``` The new approach allows us to avoid duplication of the path, which is now optional, and results in more readable code: + ```python @script() def hello_world() -> Annotated[str, Artifact(name="hello-artifact")]: return "Hello, world!" ``` -For `Parameter`s we have a similar syntax: +For `Parameter` we have a similar syntax: ```python @script() @@ -181,8 +194,8 @@ def hello_world() -> Annotated[str, Parameter(name="hello-param")]: ``` The returned values will be automatically saved in files within the Argo container according to this schema: -* `/hera/outputs/parameters/` -* `/hera/outputs/artifacts/` +* `/tmp/hera-outputs/parameters/` +* `/tmp/hera-outputs/artifacts/` These outputs are also exposed in the `outputs` section of the template in YAML. @@ -203,33 +216,40 @@ def hello_world() -> Annotated[str, Parameter(name="hello-param", value_from={"p return "Hello, world!" ``` -For multiple outputs, the return type should be a `Tuple` of arbitrary Pydantic types with individual +For multiple outputs, the return type should be a `Tuple` of Pydantic types with individual `Parameter`/`Artifact` annotations, and the function must return a tuple from the function matching these types: + ```python @script() def func(...) -> Tuple[ - Annotated[arbitrary_pydantic_type_a, Artifact], - Annotated[arbitrary_pydantic_type_b, Parameter], - Annotated[arbitrary_pydantic_type_c, Parameter], - ...]: + Annotated[pydantic_type_a, Artifact(name="a", ...)], + Annotated[pydantic_type_b, Parameter(name="b", ...)], + Annotated[pydantic_type_c, Parameter(name="c", ...)], +]: return output_a, output_b, output_c ``` +You may prefer to use the [Script Runner IO](script-runner-io.md#script-outputs-using-output) classes instead to avoid +long return Tuples, as return values can be set by name, rather than position. + ### Input-Output function parameters -Hera also allows output `Parameter`/`Artifact`s as part of the function signature when specified as a `Path` type, -allowing users to write to the path as an output, without needing an explicit return. They require an additional field -`output=True` to distinguish them from the input parameters and must have an underlying `Path` type (or another type -that will write to disk). +To allow users to write arbitrary `bytes` to disk, Hera allows `Parameter`/`Artifact` output to be declared as part of +the function inputs when specified as a `Path` type, allowing users to write their output to the path, rather than using +a return value. They require an additional field `output=True` to distinguish them from the input parameters and must +have an underlying `Path` type. You can use Input-Outputs alongside standard function-return outputs. ```python @script() -def func(..., output_param: Annotated[Path, Parameter(output=True, global_name="...", name="")]) -> Annotated[arbitrary_pydantic_type, OutputItem]: - output_param.write_text("...") - return output +def func( + output_param: Annotated[Path, Parameter(output=True, name="my-output")] +) -> Annotated[int, Parameter(name="my-other-output", ...)]: + output_param.write_bytes(...) 
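+    # Bytes written to `output_param` are captured as the "my-output" output parameter,
+    # while the returned int is saved as the "my-other-output" parameter.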
+ + return 42 ``` -The parent outputs directory, `/hera/outputs` by default, can be set by the user. This is done by adding: +The outputs directory, `/tmp/hera-outputs` by default, can be set by the user. This is done by adding: ```python global_config.set_class_defaults(RunnerScriptConstructor, outputs_directory="user/chosen/outputs") diff --git a/docs/user-guides/script-basics.md b/docs/user-guides/script-basics.md index a20f5de47..73c165ea2 100644 --- a/docs/user-guides/script-basics.md +++ b/docs/user-guides/script-basics.md @@ -37,28 +37,25 @@ with Workflow(generate_name="dag-diamond-", entrypoint="diamond") as w: A >> [B, C] >> D ``` -> **For advanced users**: the exact mechanism of the `script` decorator is to prepare a `Script` object within the -> decorator, so that when your function is invoked under a Hera context, the call is redirected to the `Script.__call__` -> function. This takes the kwargs of a `Step` or `Task` depending on whether the context manager is a `Steps` or a -> `DAG`. Under a Workflow itself, your function is not expected to take arguments, so the call will add the function as -> a template. +> **How it works**: the exact mechanism of the `script` decorator is to prepare a `Script` object within the decorator, +> so that when your function is invoked under a Hera context, the call is redirected to the `Script.__call__` function. +> This takes the kwargs of a `Step` or `Task` depending on whether the context manager is a `Steps` or a `DAG`. Under a +> Workflow itself, your function is not expected to take arguments, so the call will add the function as a template. -Alternatively, you can specify your DAG using `Task` directly: +This works as syntactic sugar for the alternative of using `Script` and `Task` directly to construct the Workflow and +DAG: ```py with Workflow(generate_name="dag-diamond-", entrypoint="diamond") as w: + echo_template = Script(name="echo", source=echo, image="python:3.11", resources=Resources(memory_request="5Gi")) with DAG(name="diamond"): - A = Task(name="A", source=echo, arguments={"message": "A"}) - B = Task(name="B", source=echo, arguments={"message": "B"}, when=f"{A.result == 'A'}") - C = Task(name="C", source=echo, arguments={"message": "C"}, when=f"{A.result != 'A'}") - D = Task(name="D", source=echo, arguments={"message": "D"}) + A = Task(name="A", source=echo_template, arguments={"message": "A"}) + B = Task(name="B", source=echo_template, arguments={"message": "B"}, when=f"{A.result == 'A'}") + C = Task(name="C", source=echo_template, arguments={"message": "C"}, when=f"{A.result != 'A'}") + D = Task(name="D", source=echo_template, arguments={"message": "D"}) A >> [B, C] >> D ``` -> **Note** in the `DAG` above, `D` will still run, even though `C` will be skipped. This is because of the `depends` logic -> resolving to `C.Succeeded || C.Skipped || C.Daemoned` due to Argo's default -> [depends logic](https://argoproj.github.io/argo-workflows/enhanced-depends-logic/#depends). - ## Script Constructors ### InlineScriptConstructor @@ -150,7 +147,7 @@ as the Hera Runner runs the function by referencing it as an entrypoint of your should be built from the source code package itself and its dependencies, so that the source code's functions, dependencies, and Hera itself are available to run. 
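+As a quick sketch of what opting in to the Runner can look like in practice (the global `set_class_defaults` call on
+`Script` below is an assumed usage of Hera's global config API, shown for illustration only):
+
+```python
+from hera.shared import global_config
+from hera.workflows import Script, script
+
+
+# Per-function: select the Hera Runner on the decorator itself.
+@script(constructor="runner")
+def double(x: int) -> int:
+    return x * 2
+
+
+# Global (assumed call shape): default every Script template to the Runner.
+global_config.set_class_defaults(Script, constructor="runner")
+```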
-A function can set its `constructor` to `"runner"` to use the `RunnerScriptConstructor`, or use the +A function can set its `constructor` to `"runner"` to use a default `RunnerScriptConstructor`, or use the `global_config.set_class_defaults` function to set it once for all script-decorated functions. We can write a script template function using Pydantic objects such as: diff --git a/docs/user-guides/script-runner-io.md b/docs/user-guides/script-runner-io.md index 076faced0..cc898bdea 100644 --- a/docs/user-guides/script-runner-io.md +++ b/docs/user-guides/script-runner-io.md @@ -1,7 +1,8 @@ # Script Runner IO -> ⚠️ The `RunnerInput` and `RunnerOutput` classes are deprecated since `v5.16.0`, please use `Input` and `Output` for -> equivalent functionality. They will be removed in `v5.17.0`. +> ⚠️ The `RunnerInput` and `RunnerOutput` classes have been renamed to align on the [decorators](decorators.md) feature +> and are deprecated since `v5.16.0`, please use `Input` and `Output` for equivalent functionality. The "Runner*" type +> aliases will be removed in `v5.17.0`. Hera provides the `Input` and `Output` Pydantic classes which can be used to more succinctly write your script function inputs and outputs, and requires use of the Hera Runner. Use of these classes also requires the diff --git a/docs/walk-through/pydantic-support.md b/docs/walk-through/pydantic-support.md index 733f289e7..825e6a9ac 100644 --- a/docs/walk-through/pydantic-support.md +++ b/docs/walk-through/pydantic-support.md @@ -2,8 +2,8 @@ ## The why and what -As Argo deals with YAML objects, which are actually a subset of json, Pydantic support is almost built-in to Hera -through Pydantic's serialization (to/from json) features. Using Pydantic objects (instead of dictionaries) in Script +As Argo deals with YAML objects, which are actually a subset of JSON, Pydantic support is practically built-in to Hera +through Pydantic's serialization (to/from JSON) features. Using Pydantic objects (instead of dictionaries) in Script templates makes them less error-prone, and easier to write! Using Pydantic classes yourself is as simple as inheriting from Pydantic's `BaseModel`. [Read more about Pydantic models here](https://docs.pydantic.dev/latest/usage/models/). @@ -15,5 +15,6 @@ de-serializing features of Pydantic when running on Argo. Your functions can ret to another `Step` as a string argument, and then de-serialized in another function. This flow can be seen in [the callable scripts example](../examples/workflows/scripts/callable_script.md). -The new experimental Runner IO feature provides a way to specify composite inputs using the class fields, which become the -template's inputs. Read more in the [Script Runner IO guide](../user-guides/script-runner-io.md). +The Script Runner IO experimental feature provides a way to specify template inputs and outputs using the class fields +of the special `Input` and `Output` classes in Hera. Read more in the +[Script Runner IO guide](../user-guides/script-runner-io.md). diff --git a/pyproject.toml b/pyproject.toml index b315edfab..d49999988 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -2,7 +2,7 @@ name = "hera" # project-name # The version is automatically substituted by the CI version = "0.0.0-dev" -description = "Hera is a Python framework for constructing and submitting Argo Workflows. The main goal of Hera is to make Argo Workflows more accessible by abstracting away some setup that is typically necessary for constructing Argo workflows." 
+description = "Hera makes Python code easy to orchestrate on Argo Workflows through native Python integrations. It lets you construct and submit your Workflows entirely in Python." authors = ["Flaviu Vadan ", "Sambhav Kothari ", "Elliot Gunton "] maintainers = ["Flaviu Vadan ", "Sambhav Kothari ", "Elliot Gunton "] license = "Apache-2.0" diff --git a/src/hera/__init__.py b/src/hera/__init__.py index 6de69dcf0..cb08458ce 100644 --- a/src/hera/__init__.py +++ b/src/hera/__init__.py @@ -1,4 +1,4 @@ -"""Hera is a Python framework for constructing and submitting Argo Workflows. +"""Hera makes Python code easy to orchestrate on Argo Workflows through native Python integrations. The main goal of Hera is to make the Argo ecosystem accessible by simplifying workflow construction and submission. Hera presents an intuitive Python interface