
Commit

add pipeline execution lc
jvntra committed Sep 17, 2024
1 parent ec4cdd2 commit b634348
Showing 48 changed files with 2,604 additions and 0 deletions.
259 changes: 259 additions & 0 deletions 30_Pipeline_Execution_on_Cloud/AWS Overview.md


238 changes: 238 additions & 0 deletions 30_Pipeline_Execution_on_Cloud/Pipeline Execution.ipynb
@@ -0,0 +1,238 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Pipeline Execution\n",
"\n",
"##### [Source](https://docs.aws.amazon.com/sagemaker/latest/dg/run-pipeline.html)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"After you’ve created a pipeline definition, you can submit it to SageMaker to start your execution. This notebook shows how to submit a pipeline, start an execution, examine the results of that execution, and delete your pipeline.\n",
"\n",
"**Note:** *Add these lines of code to the SageMaker [notebook used for pipeline definition](https://github.com/flatiron-school/DS-Deloitte-07062022-Architecting-Pipelines-with-AWS/blob/main/Pipeline%20Creation.ipynb).* "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Workflow\n",
"\n",
"- Submit the pipeline definition to the SageMaker Pipelines service to create a pipeline if it doesn't exist, or update the pipeline if it does. The role passed in is used by SageMaker Pipelines to create all of the jobs defined in the steps."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"pipeline.upsert(role_arn=role)"
]
},
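{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note:** *`pipeline` and `role` come from the pipeline-definition notebook linked above. If you need to resolve the role yourself, here is a minimal sketch, assuming the code runs inside a SageMaker notebook or Studio environment:*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import sagemaker\n",
"\n",
"# Assumption: running inside SageMaker, where an execution role is attached;\n",
"# elsewhere, pass an explicit IAM role ARN instead.\n",
"role = sagemaker.get_execution_role()"
]
},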
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Start pipeline execution."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"execution = pipeline.start()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Describe the pipeline execution status to ensure that it has been created and started successfully."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"execution.describe()"
]
},
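{
"cell_type": "markdown",
"metadata": {},
"source": [
"*`describe()` returns the `DescribePipelineExecution` response as a dict; a minimal sketch of checking the status field it contains:*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# PipelineExecutionStatus is one of: Executing, Stopping, Stopped,\n",
"# Failed, or Succeeded.\n",
"status = execution.describe()[\"PipelineExecutionStatus\"]\n",
"print(status)"
]
},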
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Wait for the pipeline execution to finish."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"execution.wait()"
]
},
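{
"cell_type": "markdown",
"metadata": {},
"source": [
"*For long-running pipelines the default waiter can time out; below is a sketch of widening the polling window. The `delay`/`max_attempts` parameters are from the SageMaker Python SDK waiter and worth verifying against your SDK version:*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Poll every 60 seconds, up to 120 attempts (roughly two hours).\n",
"execution.wait(delay=60, max_attempts=120)"
]
},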
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- List the execution steps and their status."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"execution.list_steps()"
]
},
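{
"cell_type": "markdown",
"metadata": {},
"source": [
"*`list_steps()` returns one dict per step, mirroring the `ListPipelineExecutionSteps` API response; a minimal sketch that prints a name-and-status summary, assuming the `StepName` and `StepStatus` keys from that response:*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Each entry mirrors one PipelineExecutionSteps element from the API.\n",
"for step in execution.list_steps():\n",
"    print(f\"{step['StepName']}: {step['StepStatus']}\")"
]
},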
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note:** *You can run additional executions of the pipeline by specifying different pipeline parameters to override the defaults.*"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### To override default parameters"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Create the pipeline execution. This starts another pipeline execution with the model approval status override set to \"Approved\". This means that the model package version generated by the RegisterModel step is automatically ready for deployment through CI/CD pipelines, such as with SageMaker Projects. For more info, see [Automate MLOps with SageMaker Projects.](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-projects.html)."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"execution = pipeline.start(\n",
" parameters=dict(\n",
" ModelApprovalStatus=\"Approved\",\n",
" )\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- After your pipeline execution is complete, download the resulting `evaluation.json` file from Amazon S3 to examine the report."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"evaluation_json = sagemaker.s3.S3Downloader.read_file(\"{}/evaluation.json\".format(\n",
" step_eval.arguments[\"ProcessingOutputConfig\"][\"Outputs\"][0][\"S3Output\"][\"S3Uri\"]\n",
"))\n",
"\n",
"json.loads(evaluation_json)"
]
},
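{
"cell_type": "markdown",
"metadata": {},
"source": [
"*A sketch of pulling a single metric out of the loaded report. The `regression_metrics`/`mse` layout below is an assumption based on the abalone example's evaluation script; adjust the keys to match whatever your evaluation step actually writes:*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"report = json.loads(evaluation_json)\n",
"\n",
"# Assumption: the evaluation script wrote a regression report with an MSE entry.\n",
"mse = report[\"regression_metrics\"][\"mse\"][\"value\"]\n",
"print(f\"Test MSE: {mse}\")"
]
},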
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### To Stop and Delete Pipeline Execution\n",
"\n",
"When you're finished with your pipeline, you can stop any ongoing executions and delete the pipeline."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Stop the pipeline execution."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"execution.stop()"
]
},
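{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Stopping an execution that has already finished raises an error from the underlying `StopPipelineExecution` call, so it can help to guard the call; a hedged sketch using boto3's standard error type:*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import botocore.exceptions\n",
"\n",
"# Only executions still in the Executing state can be stopped.\n",
"try:\n",
"    execution.stop()\n",
"except botocore.exceptions.ClientError as err:\n",
"    print(f\"Could not stop execution: {err}\")"
]
},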
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Delete the pipeline."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"pipeline.delete()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conclusion\n",
"\n",
"This completes our walkthrough of how to execute a pre-defined pipeline on AWS! In [the next part of the lecture](https://github.com/flatiron-school/DS-Deloitte-07062022-Pipeline-Execution-on-AWS), we'll create an inference endpoint using an XGBoost model trained on batch-transformed abalone data to predict age from various features (i.e, sex, length, diameter, etc.). We'll then conclude with some remarks on model explainability.\n",
"\n",
"![](images/aws-model-inference-options-2.png)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}