This repository contains my code solutions to DeepLearning.AI's Practical Data Science on the AWS Cloud Specialization. The labs below are grouped by course, so lab numbering restarts with each course.
- Lab 1: Register and visualize dataset
List and access the Women's Clothing Reviews dataset files hosted in an S3 bucket. Install and import AWS Data Wrangler. Create an AWS Glue Catalog database and list all Glue Catalog databases. Register the dataset files with the AWS Glue Catalog. Write SQL queries to answer specific questions about your dataset and run them with Amazon Athena. Produce and select different plots and visualizations that address your questions.
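Since Athena speaks standard SQL, the lab's queries can be prototyped locally before running them at scale. A minimal sketch using the standard library's `sqlite3` on a tiny synthetic table; the table and column names (`reviews`, `product_category`, `sentiment`) are assumptions for illustration, not the lab's exact schema:

```python
import sqlite3

# Tiny in-memory stand-in for the registered Glue Catalog table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (product_category TEXT, sentiment INTEGER)")
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?)",
    [("Dresses", 1), ("Dresses", -1), ("Blouses", 1), ("Pants", 0)],
)

# The same shape of query the lab hands to Amazon Athena, e.g. via
# wr.athena.read_sql_query(sql, database=...)  # requires AWS credentials
sql = """
SELECT product_category, COUNT(*) AS n
FROM reviews
GROUP BY product_category
ORDER BY n DESC
"""
rows = conn.execute(sql).fetchall()
print(rows[0])  # ('Dresses', 2)
```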
- Lab 2: Detect data bias with Amazon SageMaker Clarify
Download and save the raw, unbalanced dataset. Analyze bias with the open-source SageMaker Clarify library. Balance the dataset. Analyze bias at scale with an Amazon SageMaker processing job and Clarify. Compare the bias reports from before and after balancing the dataset.
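The "balance the dataset" step can be sketched with only the standard library: undersample every sentiment class to the size of the smallest one. The three-class labels (-1, 0, 1) follow the lab's sentiment convention; the counts below are made up:

```python
import random
from collections import defaultdict

def balance(rows, label_key="sentiment", seed=42):
    """Undersample each class down to the minority-class count."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[label_key]].append(row)
    n_min = min(len(g) for g in groups.values())
    rng = random.Random(seed)  # seeded for reproducibility
    balanced = []
    for g in groups.values():
        balanced.extend(rng.sample(g, n_min))
    return balanced

reviews = (
    [{"sentiment": 1}] * 50 + [{"sentiment": 0}] * 20 + [{"sentiment": -1}] * 5
)
balanced = balance(reviews)
print(len(balanced))  # 15 -> 5 per class
```

In the lab this is done with pandas on the real reviews dataframe; the grouping-and-sampling logic is the same.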
- Lab 3: Train a model with Amazon SageMaker Autopilot
Review the dataset. Configure the Autopilot job. Launch the Autopilot job. Track Autopilot job progress through feature engineering and model training and tuning. Review all outputs. Deploy and test the best candidate model.
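The Autopilot job configuration boils down to one request payload. Built here as a plain dict so its shape can be inspected without AWS access; the job name, bucket paths, target column, and role ARN are placeholders, not the lab's values:

```python
auto_ml_request = {
    "AutoMLJobName": "automl-reviews-demo",  # hypothetical job name
    "InputDataConfig": [{
        "DataSource": {"S3DataSource": {
            "S3DataType": "S3Prefix",
            "S3Uri": "s3://my-bucket/autopilot/train/",  # placeholder bucket
        }},
        "TargetAttributeName": "sentiment",  # column Autopilot should predict
    }],
    "OutputDataConfig": {"S3OutputPath": "s3://my-bucket/autopilot/output/"},
    "AutoMLJobObjective": {"MetricName": "Accuracy"},
    "AutoMLJobConfig": {"CompletionCriteria": {"MaxCandidates": 3}},
    "RoleArn": "arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder
}

# With credentials in place, this is what gets launched:
# boto3.client("sagemaker").create_auto_ml_job(**auto_ml_request)
```

`MaxCandidates` caps how many candidate pipelines Autopilot explores, which keeps lab runs short.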
- Lab 4: Train a text classifier using Amazon SageMaker BlazingText built-in algorithm
Prepare dataset. Train the model with Amazon SageMaker BlazingText. Deploy the model. Test the model.
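BlazingText's supervised mode expects one `__label__<label> <tokens>` line per training example, so the "prepare dataset" step is mostly a formatting pass. A minimal formatter; the lowercase-and-split tokenization is a simplification of the lab's preprocessing:

```python
def to_blazingtext(label, text):
    """Format one example in BlazingText supervised-mode input format."""
    tokens = " ".join(text.lower().split())
    return f"__label__{label} {tokens}"

line = to_blazingtext(1, "Love this dress!")
print(line)  # __label__1 love this dress!
```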
- Lab 1: Feature transformation with Amazon SageMaker processing job and Feature Store
Configure the SageMaker Feature Store. Transform the dataset. Inspect the transformed data. Query the Feature Store.
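A SageMaker Feature Store record is a list of `{FeatureName, ValueAsString}` pairs, with every value serialized to a string. A small sketch of that transformation; the feature names are assumptions for illustration:

```python
def to_feature_record(row):
    """Serialize a row dict into the record shape Feature Store ingests."""
    return [{"FeatureName": k, "ValueAsString": str(v)} for k, v in row.items()]

record = to_feature_record(
    {"review_id": "r-001", "sentiment": 1, "event_time": "2021-01-01T00:00:00Z"}
)

# With AWS access, this record would be ingested with:
# boto3.client("sagemaker-featurestore-runtime").put_record(
#     FeatureGroupName="reviews-feature-group", Record=record)
```

The `event_time` feature is required by Feature Store to version records over time.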
- Lab 2: Train a review classifier with BERT and Amazon SageMaker
Configure dataset. Configure model hyper-parameters. Set up evaluation metrics, debugger and profiler. Train model. Analyze debugger results. Deploy and test the model.
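The "set up evaluation metrics" step works by registering regexes that SageMaker scrapes against the training job's log stream. A sketch of such metric definitions; the log-line format below is a hypothetical example, not the lab's exact training output:

```python
import re

# Regexes SageMaker applies to the job's logs to extract metric values.
metric_definitions = [
    {"Name": "validation:accuracy", "Regex": r"val_acc: ([0-9.]+)"},
    {"Name": "validation:loss", "Regex": r"val_loss: ([0-9.]+)"},
]

# Simulate scraping one (made-up) log line locally:
log_line = "epoch 3 val_acc: 0.8125 val_loss: 0.4421"
acc = float(re.search(metric_definitions[0]["Regex"], log_line).group(1))
print(acc)  # 0.8125
```

Getting these regexes right matters: the same captured metrics drive tuning-job objectives and CloudWatch charts.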
- Lab 3: SageMaker pipelines to train a BERT-Based text classifier
Configure dataset and processing step. Configure training step. Configure model-evaluation step. Configure register model step. Create model for deployment step. Check accuracy condition step. Create and start pipeline. List pipeline artifacts. Approve and deploy model.
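The accuracy-condition step gates the register and deploy steps on the evaluation report. Its logic reduces to a threshold check, sketched here in plain Python; the 0.70 threshold and the report layout are assumptions:

```python
MIN_ACCURACY = 0.70  # assumed threshold, set when the pipeline is defined

def should_register(evaluation_report):
    """Mirror the pipeline's greater-than-or-equal condition on accuracy."""
    return evaluation_report["metrics"]["accuracy"]["value"] >= MIN_ACCURACY

print(should_register({"metrics": {"accuracy": {"value": 0.82}}}))  # True
print(should_register({"metrics": {"accuracy": {"value": 0.51}}}))  # False
```

In the real pipeline this is expressed declaratively (a condition step reading a property file produced by the evaluation step) rather than as inline Python.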
- Lab 1: Optimize models using Automatic Model Tuning
Configure dataset. Configure and run hyper-parameter tuning job. Evaluate the results.
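Evaluating a tuning job ends with picking the training job that scored best on the objective metric. A local sketch over hypothetical results (the learning rates and accuracies are made up):

```python
# One entry per completed training job in the tuning run.
results = [
    {"learning_rate": 1e-5, "validation:accuracy": 0.74},
    {"learning_rate": 5e-5, "validation:accuracy": 0.81},
    {"learning_rate": 1e-4, "validation:accuracy": 0.78},
]

# Maximize the objective metric to find the best candidate.
best = max(results, key=lambda r: r["validation:accuracy"])
print(best["learning_rate"])  # 5e-05
```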
- Lab 2: A/B testing, traffic shifting and autoscaling
Configure and create a REST endpoint with multiple production variants. Test the model. Show the metrics for each variant. Shift all traffic to one variant. Configure one variant to autoscale.
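Traffic between A/B variants is split in proportion to each variant's weight in the endpoint config, and "shift all traffic" is just a weight update. A local sketch of that arithmetic; the variant names and weights are placeholders:

```python
variants = [
    {"VariantName": "VariantA", "InitialVariantWeight": 50},
    {"VariantName": "VariantB", "InitialVariantWeight": 50},
]

def traffic_split(variants):
    """Fraction of traffic each variant receives: weight / sum of weights."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total
            for v in variants}

print(traffic_split(variants))  # {'VariantA': 0.5, 'VariantB': 0.5}

# Shifting all traffic to VariantB is a weight change; on a live endpoint
# this is applied with update_endpoint_weights_and_capacities.
variants[0]["InitialVariantWeight"] = 0
variants[1]["InitialVariantWeight"] = 100
print(traffic_split(variants))  # {'VariantA': 0.0, 'VariantB': 1.0}
```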
- Lab 3: Data labeling and human-in-the-loop pipelines with Amazon Augmented AI (A2I)
Set up a private workforce and Cognito user pool. Create the Human Task UI using a Worker Task Template. Create a Flow Definition. Start and check the status of the human loop. Verify its completion. View the labels and prepare the data for training.
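Human loops are typically started only for predictions the model is unsure about, so humans review the hard cases while confident predictions pass straight through. A minimal sketch of that routing rule; the 0.9 threshold and the prediction shape are assumptions:

```python
CONFIDENCE_THRESHOLD = 0.9  # assumed cutoff for routing to human review

def needs_human_review(prediction):
    """Route low-confidence predictions into the A2I human loop."""
    return prediction["confidence"] < CONFIDENCE_THRESHOLD

preds = [{"label": 1, "confidence": 0.97}, {"label": 0, "confidence": 0.55}]
to_review = [p for p in preds if needs_human_review(p)]
print(len(to_review))  # 1
```

In the lab, each item selected this way becomes a `start_human_loop` call against the Flow Definition, and the completed loops yield the human labels used to retrain.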