Merged
57 changes: 35 additions & 22 deletions docs/guides/artifacts/intro.md
@@ -12,47 +12,60 @@ import { CTAButtons } from '@site/src/components/CTAButtons/CTAButtons.tsx';

<CTAButtons productLink="https://wandb.ai/wandb/arttest/artifacts/model/iv3_trained/5334ab69740f9dda4fed/lineage" colabLink="https://colab.research.google.com/github/wandb/examples/blob/master/colabs/wandb-artifacts/Pipeline_Versioning_with_W%26B_Artifacts.ipynb"/>

Use W&B Artifacts to track and version any serialized data as the inputs and outputs of your [W&B Runs](../runs/intro.md). For example, a model training run might take in a dataset as input and trained model as output. In addition to logging hyper-parameters and metadata to a run, you can use an artifact to log the dataset used to train the model as input and the resulting model checkpoints as outputs. You will always be able to answer the question “what version of my dataset was this model trained on”.
Use W&B Artifacts to track and version data as the inputs and outputs of your [W&B Runs](../runs/intro.md). For example, a model training run might take in a dataset as input and produce a trained model as output. In addition to logging hyperparameters, metadata and metrics to a run, you can use an artifact to log, track and version the dataset used to train the model as input and another artifact for the resulting model checkpoints as outputs.

In summary, with W&B Artifacts, you can:
* [View where a model came from, including data it was trained on](./explore-and-traverse-an-artifact-graph.md).
* [Version every dataset change or model checkpoint](./create-a-new-artifact-version.md).
* [Easily reuse models and datasets across your team](./download-and-use-an-artifact.md).
## Use cases
You can use artifacts throughout your entire ML workflow as inputs and outputs of [runs](../runs/intro.md). You can use datasets, models, or even other artifacts as inputs for processing.

![](/images/artifacts/artifacts_landing_page2.png)

| Use Case | Input | Output |
|------------------------|-----------------------------|------------------------------|
| Model Training | Dataset (training and validation data) | Trained [Model](../models.md) |
| Dataset Pre-Processing | Dataset (raw data) | Dataset (pre-processed data) |
| Model Evaluation | Model + Dataset (test data) | [W&B Table](../tables/intro.md) |
| Model Optimization | Model | Optimized Model |

The diagram above demonstrates how you can use artifacts throughout your entire ML workflow as inputs and outputs of [runs](../runs/intro.md).
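As one illustration of the "Dataset Pre-Processing" row in the table, the sketch below shows a run that takes one dataset artifact as input and logs another as output. The artifact names `raw_data` and `clean_data`, the `preprocess-dataset` job type, and the `preprocess` transform are hypothetical placeholders, not part of this guide. The W&B calls are wrapped in a function with a lazy import so that nothing is uploaded until you call it, and it assumes the `wandb` package is installed and you are logged in.

```python
from pathlib import Path


def preprocess(raw_dir: Path, out_dir: Path) -> Path:
    """Hypothetical transform: strip whitespace and drop blank lines in each .txt file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(raw_dir.glob("*.txt")):
        lines = [ln.strip() for ln in src.read_text().splitlines() if ln.strip()]
        (out_dir / src.name).write_text("\n".join(lines) + "\n")
    return out_dir


def run_preprocessing(project: str = "artifacts-example") -> None:
    """Use a raw dataset artifact as input and log a pre-processed one as output."""
    import wandb  # imported lazily; assumes the W&B SDK is installed and logged in

    run = wandb.init(project=project, job_type="preprocess-dataset")

    # Input: assumes an artifact named "raw_data" was logged to this project earlier.
    raw = run.use_artifact("raw_data:latest")
    raw_dir = Path(raw.download())

    # Output: a new dataset artifact built from the transformed files.
    clean_dir = preprocess(raw_dir, Path("./clean_data"))
    clean = wandb.Artifact(name="clean_data", type="dataset")
    clean.add_dir(local_path=str(clean_dir))
    run.log_artifact(clean)

    run.finish()
```

Because the run both uses and logs an artifact, the two datasets should appear connected through that run when you explore the artifact graph.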

## How it works
## Create an artifact

Create an artifact with four lines of code:
1. Create a [W&B run](../runs/intro.md).
1. Create a [W&B Run](../runs/intro.md).
2. Create an artifact object with the [`wandb.Artifact`](../../ref/python/artifact.md) API.
3. Add one or more files, such as a model file or dataset, to your artifact object.
3. Add one or more files, such as a model file or dataset, to your artifact object. In this example, you'll add a single file.
4. Log your artifact to W&B.


```python showLineNumbers
```python
run = wandb.init(project="artifacts-example", job_type="add-dataset")
artifact = wandb.Artifact(name="my_data", type="dataset")
artifact.add_dir(local_path="./dataset.h5") # Add dataset directory to artifact
artifact.add_file(local_path="./dataset.h5") # Add dataset file to artifact
run.log_artifact(artifact) # Logs the artifact version "my_data:v0"
```
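For reference, here is a self-contained variant of the snippet above, written as a minimal sketch. It adds the `import wandb` line, checks that the file exists before starting a run, and calls `run.finish()` at the end. The function name `log_dataset_artifact` and the up-front existence check are illustrative additions, not part of the W&B API. It assumes the `wandb` package is installed and you are logged in.

```python
from pathlib import Path


def log_dataset_artifact(dataset_path, project: str = "artifacts-example") -> None:
    """Log a single file as version of the "my_data" dataset artifact."""
    path = Path(dataset_path)
    # Fail early so a typo in the path does not create an empty run.
    if not path.is_file():
        raise FileNotFoundError(f"Expected a file at {path}")

    import wandb  # imported lazily; assumes the W&B SDK is installed and logged in

    run = wandb.init(project=project, job_type="add-dataset")
    artifact = wandb.Artifact(name="my_data", type="dataset")
    artifact.add_file(local_path=str(path))  # use add_dir() instead for a directory
    run.log_artifact(artifact)
    run.finish()  # marks the run as finished
```

You would call it as `log_dataset_artifact("./dataset.h5")`. Note that `add_file` expects a path to a file, while `add_dir` expects a path to a directory.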

:::tip
The preceding code snippet, and the [colab linked on this page](https://colab.research.google.com/github/wandb/examples/blob/master/colabs/wandb-artifacts/Artifacts_Quickstart_with_W&B.ipynb), show how to track files by uploading them to W&B. See the [track external files](./track-external-files.md) page for information on how to add references to files or directories that are stored in external object storage (for example, in an Amazon S3 bucket).
See the [track external files](./track-external-files.md) page for information on how to add references to files or directories stored in external object storage, like an Amazon S3 bucket.
:::

## How to get started
## Download an artifact
Indicate the artifact you want to mark as input to your run with the [`use_artifact`](../../ref/python/run.md#use_artifact) method, which returns an artifact object:

Depending on your use case, explore the following resources to get started with W&B Artifacts:
```python
artifact = run.use_artifact("my_data:latest")  # returns an artifact object for "my_data" and marks it as an input to the run
```

Then, use the returned object to download all contents of the artifact:

```python
datadir = artifact.download()  # downloads the full "my_data" artifact to the default directory
```

:::tip
You can pass a custom path into the `root` [parameter](../../ref/python/artifact.md) to download an artifact to a specific directory. For alternate ways to download artifacts and to see additional parameters, see the guide on [downloading and using artifacts](./download-and-use-an-artifact.md).
:::
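The two download steps and the `root` parameter can be combined as in the following sketch. The function names `fetch_dataset` and `list_files`, and the `use-dataset` job type, are hypothetical. It assumes the `wandb` package is installed, you are logged in, and an artifact named `my_data` already exists in the project.

```python
from pathlib import Path


def list_files(directory) -> list:
    """Return the relative paths of all files under a directory, sorted."""
    root = Path(directory)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())


def fetch_dataset(root=None, project: str = "artifacts-example") -> Path:
    """Mark "my_data:latest" as an input of a new run and download its contents."""
    import wandb  # imported lazily; assumes the W&B SDK is installed and logged in

    run = wandb.init(project=project, job_type="use-dataset")
    artifact = run.use_artifact("my_data:latest")
    # With root=None the artifact is downloaded to the default directory.
    datadir = artifact.download(root=root)
    run.finish()
    return Path(datadir)
```

For example, `list_files(fetch_dataset(root="./data"))` would download the artifact into `./data` and list what was retrieved.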

* If this is your first time using W&B Artifacts, we recommend you go through the [Artifacts Colab notebook](https://colab.research.google.com/github/wandb/examples/blob/master/colabs/wandb-artifacts/Artifacts_Quickstart_with_W&B.ipynb#scrollTo=fti9TCdjOfHT).
* Read the [artifacts walkthrough](./artifacts-walkthrough.md) for a step-by-step outline of the W&B Python SDK commands you could use to create, track, and use a dataset artifact.
* Explore this chapter to learn how to:
* [Construct an artifact](./construct-an-artifact.md) or a [new artifact version](./create-a-new-artifact-version.md)
* [Update an artifact](./update-an-artifact.md)
* [Download and use an artifact](./download-and-use-an-artifact.md).
* [Delete artifacts](./delete-artifacts.md).
* Explore the [Python SDK Artifact APIs](../../ref/python/artifact.md) and [Artifact CLI Reference Guide](../../ref/cli/wandb-artifact/README.md).
## Next steps
* Learn how to [version](./create-a-new-artifact-version.md), [update](./update-an-artifact.md), or [delete](./delete-artifacts.md) artifacts.
* Learn how to trigger downstream workflows in response to changes to your artifacts with [artifact automation](./project-scoped-automations.md).
* Learn about the [model registry](../model_registry/intro.md), a space that houses trained models.
* Explore the [Python SDK](../../ref/python/artifact.md) and [CLI](../../ref/cli/wandb-artifact/README.md) reference guides.
Binary file modified static/images/artifacts/artifacts_landing_page2.png