A hands-on dive into AWS Batch.
- AWS CLI and credentials for an AWS account with permissions to create resources
- Terraform
- Docker
- Python
1. Clone this repository.
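For example (the clone URL below is a placeholder for this repository's URL):

```bash
# Clone the repo and move into it; substitute the real clone URL and directory name
git clone <repository-url>
cd <repository-name>
```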
2. Deploy the infrastructure using Terraform.
2.1 Navigate to the infrastructure directory:

`cd src/infrastructure/`
2.2 You need an existing VPC and subnets. Take note of the VPC ID and subnet IDs so you can pass them to the plan command. Use the `prefix` variable to create custom names for the resources.
`terraform init`

`terraform plan -var prefix=<prefix> -var subnet_ids='["<subnet-1>", "<subnet-2>"]' -var vpc_id=<vpc-id> -out tfplan`
2.3 Verify the plan and apply it if it looks good. Resources that are to be created:
- S3 Bucket that will be used as source and destination for the batch jobs
- ECR Repository to store the docker image
- Security Group for the ECS task hosting the Docker image / Batch job
- IAM Roles and Policies for the Batch Job and ECS Task
- AWS Batch Compute Environment
- AWS Batch Job Queue
- AWS Batch Job Definition

Apply the plan:

`terraform apply tfplan`
2.4 Take a look at the resources that have been created in the AWS Console.
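If you prefer the CLI over the console, you can also list the Batch resources Terraform just created (a quick sketch, assuming your default AWS CLI profile and region point at the account you deployed to):

```bash
# Names of the compute environments, job queues, and active job definitions
aws batch describe-compute-environments --query 'computeEnvironments[].computeEnvironmentName'
aws batch describe-job-queues --query 'jobQueues[].jobQueueName'
aws batch describe-job-definitions --status ACTIVE --query 'jobDefinitions[].jobDefinitionName'
```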
3. Build the Docker image and push it to ECR.

3.1 Login to ECR
`aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <account-id>.dkr.ecr.<region>.amazonaws.com`

The registry URL is of the form `<account-id>.dkr.ecr.<region>.amazonaws.com`.
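If you don't have the account ID at hand, you can look it up with the CLI:

```bash
# Print the ID of the account your current credentials belong to
aws sts get-caller-identity --query Account --output text
```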
3.2 Build Image
Change directory to the location of the Dockerfile:

`cd ../python/single/`
On an Apple Silicon (M1) Mac, you need the `--platform linux/amd64` flag to build the image for the correct architecture:

`docker build --platform linux/amd64 -t <ecr-name> .`

Otherwise, just use:

`docker build -t <ecr-name> .`
3.3 Tag Image
`docker tag <ecr-name>:latest <ecr-uri>:latest`

The image URI is of the form `<account-id>.dkr.ecr.<region>.amazonaws.com/<ecr-name>`.
3.4 Push Image
`docker push <ecr-uri>:latest`
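To confirm the push worked, you can list the image tags in the repository (here `<ecr-name>` is assumed to be the ECR repository name created by Terraform):

```bash
# Show the tags of all images stored in the repository
aws ecr describe-images --repository-name <ecr-name> --query 'imageDetails[].imageTags'
```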
4. Unpack the data and upload to S3.
4.1 Change the directory:

`cd ../../data/`
4.2 Unpack the data with the command below or any other tool of your choice.

`unzip data.zip`
4.3 Upload the data to the S3 bucket that has been created by Terraform.

`aws s3 cp data s3://<bucket-name>/source --recursive`
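A quick way to double-check the upload (the `source` prefix matches the destination of the copy above):

```bash
# List the first few objects that landed under the source prefix
aws s3 ls s3://<bucket-name>/source/ --recursive | head
```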
5. Submit a job to AWS Batch that runs the Python script on the data you uploaded to S3. After the job has finished, you will find the results under the destination prefix in the S3 bucket. The source and destination are passed to the job as environment variables.
5.1 Command to submit the job:
`aws batch submit-job --job-name <job-name> --job-queue <job-queue> --job-definition <job-definition> --container-overrides '{"command": ["python", "script.py"], "environment": [{"name": "BUCKET", "value": "<bucket-name>"}, {"name": "PREFIX", "value": "source"}, {"name": "OUTPUT_PREFIX", "value": "output"}]}'`
5.2 View the job in the AWS Batch Console
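`submit-job` also prints a `jobId`; if you prefer the CLI over the console, you can poll the job with it (assuming your default AWS CLI profile and region):

```bash
# Current state of the job, e.g. RUNNABLE, STARTING, RUNNING, SUCCEEDED, FAILED
aws batch describe-jobs --jobs <job-id> --query 'jobs[0].status' --output text
```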
5.3 Check the ECS cluster where the job is being executed (ECS Console). A task will spin up and execute the job.
5.4 Check the S3 bucket for the results (S3 Console).
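Or from the CLI (the `output` prefix matches the `OUTPUT_PREFIX` passed in 5.1):

```bash
# List the result objects and optionally download them for inspection
aws s3 ls s3://<bucket-name>/output/ --recursive
aws s3 cp s3://<bucket-name>/output/ ./results --recursive
```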
6. Change to the infrastructure directory and run the destroy command.
`cd ../infrastructure/`

`terraform plan -var prefix=<prefix> -var 'subnet_ids=["<subnet-1>", "<subnet-2>"]' -var vpc_id=<vpc-id> -destroy -out tfplan`
Verify the plan and apply it if it looks good.
`terraform apply tfplan`
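Note: if the destroy fails because the S3 bucket still contains objects (whether that happens depends on how the bucket is configured in the Terraform code), empty it first and re-run the destroy:

```bash
# Remove all objects from the bucket so Terraform can delete it
aws s3 rm s3://<bucket-name> --recursive
```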