
Commit

add pipeline execution lc
jvntra committed Sep 17, 2024
1 parent ec4cdd2 commit b634348
Showing 48 changed files with 2,604 additions and 0 deletions.
259 changes: 259 additions & 0 deletions 30_Pipeline_Execution_on_Cloud/AWS Overview.md


238 changes: 238 additions & 0 deletions 30_Pipeline_Execution_on_Cloud/Pipeline Execution.ipynb
@@ -0,0 +1,238 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Pipeline Execution\n",
"\n",
"##### [Source](https://docs.aws.amazon.com/sagemaker/latest/dg/run-pipeline.html)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"After you’ve created a pipeline definition, you can submit it to SageMaker to start your execution. This notebook shows how to submit a pipeline, start an execution, examine the results of that execution, and delete your pipeline.\n",
"\n",
"**Note:** *Add these lines of code to the SageMaker [notebook used for pipeline definition](https://github.com/flatiron-school/DS-Deloitte-07062022-Architecting-Pipelines-with-AWS/blob/main/Pipeline%20Creation.ipynb).* "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Workflow\n",
"\n",
"- Submit the pipeline definition to the SageMaker Pipelines service to create a pipeline if it doesn't exist, or update the pipeline if it does. The role passed in is used by SageMaker Pipelines to create all of the jobs defined in the steps."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"pipeline.upsert(role_arn=role)"
]
},
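{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note:** *`pipeline` and `role` come from the pipeline-definition notebook linked above. If you need to resolve the role yourself, here is a minimal sketch, assuming the code runs inside a SageMaker notebook or Studio environment:*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import sagemaker\n",
"\n",
"# Assumption: running inside SageMaker, where an execution role is attached;\n",
"# elsewhere, pass an explicit IAM role ARN instead.\n",
"role = sagemaker.get_execution_role()"
]
},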
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Start pipeline execution."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"execution = pipeline.start()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Describe the pipeline execution status to ensure that it has been created and started successfully."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"execution.describe()"
]
},
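{
"cell_type": "markdown",
"metadata": {},
"source": [
"*`describe()` returns the `DescribePipelineExecution` response as a dict; a minimal sketch of checking the status field it contains:*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# PipelineExecutionStatus is one of: Executing, Stopping, Stopped,\n",
"# Failed, or Succeeded.\n",
"status = execution.describe()[\"PipelineExecutionStatus\"]\n",
"print(status)"
]
},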
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Wait for the pipeline execution to finish."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"execution.wait()"
]
},
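{
"cell_type": "markdown",
"metadata": {},
"source": [
"*For long-running pipelines the default waiter can time out; below is a sketch of widening the polling window. The `delay`/`max_attempts` parameters are from the SageMaker Python SDK waiter and worth verifying against your SDK version:*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Poll every 60 seconds, up to 120 attempts (roughly two hours).\n",
"execution.wait(delay=60, max_attempts=120)"
]
},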
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- List the execution steps and their status."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"execution.list_steps()"
]
},
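{
"cell_type": "markdown",
"metadata": {},
"source": [
"*`list_steps()` returns one dict per step, mirroring the `ListPipelineExecutionSteps` API response; a minimal sketch that prints a name-and-status summary, assuming the `StepName` and `StepStatus` keys from that response:*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Each entry mirrors one PipelineExecutionSteps element from the API.\n",
"for step in execution.list_steps():\n",
"    print(f\"{step['StepName']}: {step['StepStatus']}\")"
]
},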
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note:** *You can run additional executions of the pipeline by specifying different pipeline parameters to override the defaults.*"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### To override default parameters"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Create the pipeline execution. This starts another pipeline execution with the model approval status override set to \"Approved\". This means that the model package version generated by the RegisterModel step is automatically ready for deployment through CI/CD pipelines, such as with SageMaker Projects. For more info, see [Automate MLOps with SageMaker Projects.](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-projects.html)."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"execution = pipeline.start(\n",
" parameters=dict(\n",
" ModelApprovalStatus=\"Approved\",\n",
" )\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- After your pipeline execution is complete, download the resulting `evaluation.json` file from Amazon S3 to examine the report."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"evaluation_json = sagemaker.s3.S3Downloader.read_file(\"{}/evaluation.json\".format(\n",
" step_eval.arguments[\"ProcessingOutputConfig\"][\"Outputs\"][0][\"S3Output\"][\"S3Uri\"]\n",
"))\n",
"\n",
"json.loads(evaluation_json)"
]
},
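{
"cell_type": "markdown",
"metadata": {},
"source": [
"*A sketch of pulling a single metric out of the loaded report. The `regression_metrics`/`mse` layout below is an assumption based on the abalone example's evaluation script; adjust the keys to match whatever your evaluation step actually writes:*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"report = json.loads(evaluation_json)\n",
"\n",
"# Assumption: the evaluation script wrote a regression report with an MSE entry.\n",
"mse = report[\"regression_metrics\"][\"mse\"][\"value\"]\n",
"print(f\"Test MSE: {mse}\")"
]
},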
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### To Stop and Delete Pipeline Execution\n",
"\n",
"When you're finished with your pipeline, you can stop any ongoing executions and delete the pipeline."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Stop the pipeline execution."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"execution.stop()"
]
},
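{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Stopping an execution that has already finished raises an error from the underlying `StopPipelineExecution` call, so it can help to guard the call; a hedged sketch using boto3's standard error type:*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import botocore.exceptions\n",
"\n",
"# Only executions still in the Executing state can be stopped.\n",
"try:\n",
"    execution.stop()\n",
"except botocore.exceptions.ClientError as err:\n",
"    print(f\"Could not stop execution: {err}\")"
]
},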
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Delete the pipeline."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"pipeline.delete()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conclusion\n",
"\n",
"This completes our walkthrough of how to execute a pre-defined pipeline on AWS! In [the next part of the lecture](https://github.com/flatiron-school/DS-Deloitte-07062022-Pipeline-Execution-on-AWS), we'll create an inference endpoint using an XGBoost model trained on batch-transformed abalone data to predict age from various features (i.e, sex, length, diameter, etc.). We'll then conclude with some remarks on model explainability.\n",
"\n",
"![](images/aws-model-inference-options-2.png)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}