Region
- It consists of AZs (Availability Zones)
Availability Zones
- Each AZ is made up of one or more discrete data centers. The AZs of a region are separated from each other but lie within about 100 km of one another. (Usually 3, min 2, max 6)
Edge Location (Point of Presence)
- CDN endpoints placed as close to users as possible for low latency
- AWS has 216 points of presence (205 edge locations & 11 regional caches)
- IAM - Identity & Access Management
- IAM is a global service
- First account created in AWS is the root user
- Users are individual people using your AWS account
- User Group is a way of organizing multiple users under same group
- Policy is a json document attached to a group specifying what actions the group members can perform
- AWS follows the principle of least privilege: grant a user only the permissions they need
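As a rough illustration of a least-privilege policy document, the sketch below builds a policy that allows a single read action on a single bucket. The bucket name `example-bucket` and the helper `read_only_policy` are hypothetical and only serve the example.

```python
import json


def read_only_policy(bucket_name: str) -> dict:
    """Build a policy document that allows only s3:GetObject on one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


# "example-bucket" is a made-up name used only for illustration.
policy = read_only_policy("example-bucket")
print(json.dumps(policy, indent=2))
```

Anything not explicitly allowed by a statement like this stays denied, which is the point of least privilege.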
- Password Policy
- Minimum length
- Specific Character
- Password Reuse
- Changing own password
- Password Expiration
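To make the password policy settings concrete, here is a minimal local sketch of how such rules could be checked. It does not call IAM; the class `PasswordPolicy`, its default values, and the function `violations` are assumptions made for the example.

```python
import string
from dataclasses import dataclass


@dataclass
class PasswordPolicy:
    # Illustrative defaults, not the values AWS uses.
    min_length: int = 8
    require_uppercase: bool = True
    require_digit: bool = True
    require_symbol: bool = False


def violations(password: str, policy: PasswordPolicy) -> list[str]:
    """Return a list of rules the password breaks (empty list means it passes)."""
    problems = []
    if len(password) < policy.min_length:
        problems.append("too short")
    if policy.require_uppercase and not any(c.isupper() for c in password):
        problems.append("missing uppercase letter")
    if policy.require_digit and not any(c.isdigit() for c in password):
        problems.append("missing digit")
    if policy.require_symbol and not any(c in string.punctuation for c in password):
        problems.append("missing symbol")
    return problems
```

Password reuse and expiration need stored history and timestamps, so they are left out of this sketch.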
- MFA - Multi Factor Authentication
- Virtual MFA Device - e.g. Google Authenticator, Authy
- U2F-Universal Second Factor Security Key - e.g. Yubikey by Yubico
- Hardware Key Fob MFA device - e.g. Gemalto
- AWS Management Console - Using username and password (plus MFA)
- AWS CLI - Using access keys (Access Key)
- AWS SDK - Using access keys (Access Key)
- AWS CloudShell
- An IAM role gives AWS services that we create permission to act on resources in our AWS account
- IAM roles are primarily intended to be used by AWS services rather than by individual users
- An IAM role can be created for the following -
- AWS services
- Another AWS account
- Web Identity
- SAML 2.0 federation
- Common Roles -
- EC2 instance role
- Lambda function role
- CloudFormation role
- Credential Reports (Account Level)
- To view credentials report for all users in your account
- Access Advisor (User Level)
- Access Advisor shows the services that a user can access and when those services were last accessed.
- Review this data to remove unused permissions.
- EC2 is used to provide IaaS (Infrastructure as a Service) in AWS
- Using user data, a user can run a script when the EC2 instance is booted for the first time
- User data scripts run with superuser privileges (as the root user), so commands in them do not need sudo
- There are many instance types available for EC2
- There are 7 categories of instances
- General Purpose
- Balance between --> Compute • Memory • Networking
- Compute Optimized
- Compute bound applications that benefit from high performance processors
- Memory Optimized
- Fast performance for workloads that process large data sets in memory
- Accelerated Computing
- Accelerated computing instances use hardware accelerators, or co-processors, to perform functions, such as floating point number calculations, graphics processing
- Storage Optimized
- Storage optimized instances are designed for workloads that require high, sequential read and write access to very large data sets on local storage
- Instance Feature
- Measuring Instance Performance
- General Purpose
- For free tier, we will be using t2.micro
- t - instance class
- 2 - generation
- micro - size within the instance class
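The naming scheme above can be sketched as a small parser. This is a simplified illustration that only handles names of the form class, generation, dot, size (such as `t2.micro`); the names `InstanceType` and `parse_instance_type` are hypothetical.

```python
import re
from typing import NamedTuple


class InstanceType(NamedTuple):
    instance_class: str
    generation: int
    size: str


# Simplified pattern: letters (class), digits (generation), a dot, then the size.
_PATTERN = re.compile(r"^([a-z]+)(\d+)\.([a-z0-9]+)$")


def parse_instance_type(name: str) -> InstanceType:
    """Split a name like 't2.micro' into class, generation and size."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unsupported instance type name: {name!r}")
    instance_class, generation, size = match.groups()
    return InstanceType(instance_class, int(generation), size)


print(parse_instance_type("t2.micro"))
```

Names that carry extra attribute letters after the generation are deliberately rejected by this sketch rather than guessed at.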
- Choose AMI - Amazon Machine Image
- Choose Instance Type
- Configure Instance
- Add Storage
- Add Tags
- Configure Security Group
- Security groups provide network security in AWS (they act as a firewall)
- They control how traffic is allowed into or out of our EC2 instances
- Security groups are locked to a region
- 22 = SSH
- 21 = FTP
- 22 = SFTP
- 80 = HTTP
- 443 = HTTPS
- 3389 = RDP
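A minimal sketch of how inbound rules could be evaluated, assuming that traffic is denied unless some rule allows it. The `InboundRule` type, the `is_allowed` helper, and the example CIDR ranges are made up for illustration and are not an AWS API.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class InboundRule:
    port: int
    cidr: str  # source range allowed to reach the port


def is_allowed(rules: list[InboundRule], source_ip: str, port: int) -> bool:
    """Allow the connection only if some rule matches both port and source range."""
    addr = ipaddress.ip_address(source_ip)
    return any(
        rule.port == port and addr in ipaddress.ip_network(rule.cidr)
        for rule in rules
    )


# Hypothetical rules: SSH from one office range, HTTPS from anywhere.
rules = [InboundRule(22, "203.0.113.0/24"), InboundRule(443, "0.0.0.0/0")]
```

With these rules, SSH from outside the office range is rejected while HTTPS is open to every source address.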
- Following ways available:
- PuTTY: PPK file
- OpenSSH: PEM file
ssh -i <path-to-pem-file> ec2-user@<public-ip-of-ec2-instance>
- EC2 Instance Connect: web interface
- Session Manager
- EC2 Serial Console
- As noted earlier, IAM roles are meant to be attached to AWS services, and EC2 is an AWS service
- So we can create an IAM role with some policy attached (permissions) and attach that to our EC2 instance
- On-Demand Instances
- Reserved: (min: 1 yr, max: 3yr)
- Reserved Instances
- Convertible Reserved Instances
- Scheduled Reserved Instances
- Spot Instances
- Dedicated Hosts
- Dedicated Instances
- Image from EC2 instance
- EC2 Image Builder
- Used to automate the creation of Virtual Machines or container images
- Automates the creation, maintenance, validation and testing of EC2 AMIs
- Can be run on a schedule (weekly, whenever packages are updated, etc…)
- Free service (only pay for the underlying resources)
- An EBS volume is a network drive you can attach to your instances while they run
- It allows your instances to persist data, even after their termination
- They can only be mounted to one instance at a time (at the CCP level)
- They are bound to a specific availability zone
- Analogy: Think of them as a “network USB stick”
- Free tier: 30 GB of free EBS storage of type General Purpose (SSD) or Magnetic per month
- Make a backup (snapshot) of your EBS volume at a point in time
- Not necessary to detach volume to do snapshot, but recommended
- Can copy snapshots across AZ or Region
- EC2 Instance Store are physical drives on your EC2 instance
- If you need a high-performance hardware disk, use EC2 Instance Store
- Better I/O performance
- EC2 Instance Store lose their storage if they’re stopped (ephemeral)
- Good for buffer / cache / scratch data / temporary content
- Risk of data loss if hardware fails
- Backups and Replication are your responsibility
- Managed NFS (network file system) that can be mounted on 100s of EC2
- EFS works with Linux EC2 instances in multi-AZ
- Highly available, scalable, expensive (3x gp2), pay per use, no capacity planning
- Storage class that is cost-optimized for files not accessed every day
- Up to 92% lower cost compared to EFS Standard
- EFS will automatically move your files to EFS-IA based on the last time they were accessed
- Enable EFS-IA with a Lifecycle Policy
- E.g., move files that are not accessed for 60 days to EFS-IA
- Transparent to the applications accessing EFS
- A fully managed, highly reliable, and scalable Windows native shared file system
- Built on Windows File Server
- Supports SMB protocol & Windows NTFS
- Integrated with Microsoft Active Directory
- Can be accessed from AWS or your on-premise infrastructure
- A fully managed, high-performance, scalable file storage for High Performance Computing (HPC)
- The name Lustre is derived from “Linux” and “cluster”
- Machine Learning, Analytics, Video Processing, Financial Modeling, …
- Scales up to 100s GB/s, millions of IOPS, sub-ms latencies
- Vertical Scaling
- Upgrade instance to a higher capacity instance
- Horizontal Scaling = Elasticity
- Deploy application on multiple instances
- To achieve this in AWS:
- Load Balancer
- Auto Scaling Group
- Deploying application on multiple Availability Zones
- To achieve this in AWS:
- Load Balancer on multiple AZs
- Auto Scaling Group
- Application Load Balancer (ALB)
- For HTTP & HTTPS (Layer 7)
- Network Load Balancer (NLB)
- For TCP, UDP, TLS (Layer 4)
- Gateway Load Balancer (GLB)
- For IP packets (Layer 3)
- A load balancer distributes traffic across all EC2 instances in your target group
- But with a load balancer alone, we need to create the target group and all EC2 instances in advance
- An ASG helps by automatically creating our EC2 instances based on demand
- Manual Scaling: Update the size of an ASG manually
- Dynamic Scaling: Respond to changing demand
- Simple / Step Scaling
- When a CloudWatch alarm is triggered (example CPU > 70%), then add 2 units
- When a CloudWatch alarm is triggered (example CPU < 30%), then remove 1 unit
- Target Tracking Scaling
- Example: I want the average ASG CPU to stay at around 40%
- Scheduled Scaling
- Anticipate a scaling based on known usage patterns
- Example: increase the min. capacity to 10 at 5 pm on Fridays
- Predictive Scaling
- Uses Machine Learning to predict future traffic ahead of time
- Automatically provisions the right number of EC2 instances in advance
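The simple / step scaling example (add 2 units above 70% CPU, remove 1 below 30%) can be sketched as a plain function. This is only a model of the decision, not how an ASG is configured; `step_scale` and its `min_size` / `max_size` defaults are assumptions for the example.

```python
def step_scale(current: int, cpu_percent: float,
               min_size: int = 1, max_size: int = 10) -> int:
    """Return the new desired instance count for a given average CPU reading."""
    if cpu_percent > 70:
        desired = current + 2   # scale out by 2 units
    elif cpu_percent < 30:
        desired = current - 1   # scale in by 1 unit
    else:
        desired = current       # within the comfortable band, no change
    # Never go below the group's minimum or above its maximum size.
    return max(min_size, min(max_size, desired))
```

Clamping to the minimum and maximum size mirrors the idea that a group scales only within configured bounds.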
- S3 is often described as “infinitely scaling” storage
- Files == Objects
- Directories == Buckets
- Buckets must have a globally unique name (across all regions and all accounts)
- Buckets are defined at the region level
- S3 looks like a global service but buckets are created in a region
- Objects have a Key
- The key is the FULL path of the object within the bucket:
- s3://my-bucket/my_file.txt (key: my_file.txt)
- s3://my-bucket/my_folder1/another_folder/my_file.txt (key: my_folder1/another_folder/my_file.txt)
- The key is composed of prefix + object name
- s3://my-bucket/my_folder1/another_folder/my_file.txt (prefix: my_folder1/another_folder/, object name: my_file.txt)
- There’s no concept of “directories” within buckets
- Object values are the content of the body:
- Max Object Size is 5TB (5000GB)
- If uploading more than 5GB, must use “multi-part upload”
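A small sketch of the two rules above: splitting a key into prefix and object name, and deciding when multi-part upload is required. It assumes the key is the part of the path after the bucket name and treats the 5GB limit as 5 * 1024^3 bytes; both helper names are hypothetical.

```python
# Assumption: the "5GB" limit is interpreted in binary units for this sketch.
MULTIPART_THRESHOLD_BYTES = 5 * 1024**3


def split_key(key: str) -> tuple[str, str]:
    """Split an object key into (prefix, object name)."""
    prefix, _, name = key.rpartition("/")
    return (prefix + "/" if prefix else "", name)


def needs_multipart(size_bytes: int) -> bool:
    """True when the object is too large for a single upload."""
    return size_bytes > MULTIPART_THRESHOLD_BYTES


print(split_key("my_folder1/another_folder/my_file.txt"))
```

The slashes in a key are just characters; the "folders" only exist as a naming convention.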
- User based
- IAM policies - which API calls should be allowed for a specific user from IAM console
- Resource Based
- Bucket Policies - bucket wide rules from the S3 console - allows across account
- Object Access Control List (ACL) – finer grain
- Bucket Access Control List (ACL) – less common
- Bucket Policies - bucket wide rules from the S3 console - allows across account
- Disable block public access setting
- Add a new policy using the policy generator tool
-
Example: { "Version": "2012-10-17", "Id": "Policy1631099539396", "Statement": [ { "Sid": "Stmt1631099502552", "Effect": "Allow", "Principal": "*", "Action": "s3:GetObject", "Resource": "arn:aws:s3:::daipayan-bucket/*" } ] }
- S3 can host static websites and have them accessible on the www
- The website URL will be:
- <bucket-name>.s3-website-<AWS-region>.amazonaws.com
- <bucket-name>.s3-website.<AWS-region>.amazonaws.com
- You can version your files in Amazon S3
- It is enabled at the bucket level
- Same key overwrite will increment the “version”: 1, 2, 3….
- It is best practice to version your buckets
- Protect against unintended deletes (ability to restore a version)
- Easy roll back to previous version
- Notes:
- Any file that is not versioned prior to enabling versioning will have version “null”
- Suspending versioning does not delete the previous versions
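The versioning behaviour can be modelled with a toy in-memory bucket where writing to the same key appends a new version. This is an illustration only: the class `VersionedBucket` is hypothetical, and the integer counters follow the notes' 1, 2, 3 wording rather than the identifiers the real service returns.

```python
from collections import defaultdict
from typing import Optional


class VersionedBucket:
    """Toy model: every put to the same key keeps the older content around."""

    def __init__(self) -> None:
        self._versions: dict[str, list[str]] = defaultdict(list)

    def put(self, key: str, body: str) -> int:
        """Store a new version and return its 1-based version number."""
        self._versions[key].append(body)
        return len(self._versions[key])

    def get(self, key: str, version: Optional[int] = None) -> str:
        """Return the latest version, or a specific older one for roll back."""
        versions = self._versions[key]
        return versions[-1] if version is None else versions[version - 1]


bucket = VersionedBucket()
first = bucket.put("report.txt", "draft")
second = bucket.put("report.txt", "final")
```

Reading version 1 after the overwrite is what makes an easy roll back possible.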
- This feature allows S3 to save access logs to another S3 bucket
- CRR = Cross Region Replication
- SRR = Same Region Replication
- Must enable versioning in source and destination
- Buckets can be in different accounts
- Must give proper IAM permissions to S3
- CRR - Use cases: compliance, lower latency access, replication across accounts
- SRR – Use cases: log aggregation, live replication between production and test accounts
-
Amazon S3 Standard - General Purpose
- 99.99% Availability
- Used for frequently accessed data
- Low latency and high throughput
- Sustain 2 concurrent facility failures
- Use Cases: Big Data analytics, mobile & gaming applications, content distribution…
-
Amazon S3 Standard - Infrequent Access (IA)
- Suitable for data that is less frequently accessed, but requires rapid access when needed
- 99.9% Availability
- Lower cost compared to Amazon S3 Standard, but retrieval fee
- Sustain 2 concurrent facility failures
- Use Cases: As a data store for disaster recovery, backups…
-
Amazon S3 Intelligent Tiering
- 99.9% Availability
- Same low latency and high throughput performance of S3 Standard
- Cost-optimized by automatically moving objects between two access tiers based on changing access patterns:
- Frequent access
- Infrequent access
- Resilient against events that impact an entire Availability Zone
-
Amazon S3 One Zone - Infrequent Access (IA)
- Same as IA but data is stored in a single AZ
- 99.5% Availability
- Low latency and high throughput performance
- Lower cost compared to S3-IA (by 20%)
- Use Cases: Storing secondary backup copies of on-premise data, or storing data you can recreate
-
Amazon Glacier
- Low cost object storage (in GB/month) meant for archiving / backup
- Data is retained for the longer term (years)
- Cost & Retrieval:
- Expedited (1 to 5 minutes)
- Standard (3 to 5 hours)
- Bulk (5 to 12 hours)
-
Amazon Glacier Deep Archive
- Cost & Retrieval:
- Standard (12 hours)
- Bulk (48 hours)
-
Amazon S3 Reduced Redundancy Storage (deprecated - omitted)
- It is possible to move objects between multiple storage classes
- This movement is made automated using lifecycle configuration
-
S3 Object Lock
- Adopt a WORM (Write Once Read Many) model
- Block an object version deletion for a specified amount of time
-
Glacier Vault Lock
- Adopt a WORM (Write Once Read Many) model
- Lock the policy for future edits (can no longer be changed)
- Helpful for compliance and data retention
-
Snowcone
- 8 TBs of usable storage
- AWS DataSync pre-installed
-
Snowball Edge
-
Snowball Edge Storage Optimized
- 52 vCPUs, 208 GiB of RAM
- Optional GPU (useful for video processing or machine learning)
- 39.5 TB usable storage
-
Snowball Edge Compute Optimized
- Up to 40 vCPUs, 80 GiB of RAM
- 80 TB usable storage
- Object storage clustering available
-
-
Snowmobile
- 100 PB of capacity
- Better than Snowball if you transfer more than 10 PB
- Data migration:
- Snowcone
- Snowball Edge
- Snowmobile
- Edge computing:
- Snowcone
- Snowball Edge
AWS OpsHub (a software you install on your computer / laptop) to manage your Snow Family Device
- Bridge between on-premise data and cloud data in S3
- Hybrid storage service to allow onpremises to seamlessly use the AWS cloud
- Managed DB service for databases that use SQL as a query language
- Storage backed by EBS
- Examples:
- Postgres
- MySQL
- MariaDB
- Oracle
- Microsoft SQL Server
- Aurora (AWS proprietary database)
- Aurora is a proprietary technology from AWS (not open sourced)
- PostgreSQL and MySQL are both supported as Aurora DB
- Aurora is “AWS cloud optimized” and claims 5x performance improvement over MySQL on RDS, over 3x the performance of Postgres on RDS
- Aurora storage automatically grows in increments of 10GB, up to 64 TB
- Aurora costs more than RDS (20% more) but is more efficient
- Not in the free tier
Read Replicas
- Scale the read workload of your DB
- Can create up to 5 Read Replicas
- Data is only written to the main DB
-
Multi-AZ
- Failover in case of AZ outage (high availability)
- Data is only read/written to the main database
Can only have 1 other AZ as failover
-
Multi-Region (Read Replicas)
- Disaster recovery in case of region issue
- Local performance for global reads
- Replication cost
- ElastiCache is to get managed Redis or Memcached
- Fully managed, highly available with replication across 3 AZs
- NoSQL database
- Scales to massive workloads, distributed “serverless” database
- Millions of requests per second, trillions of rows, 100s of TB of storage
- Fast and consistent in performance
- Single-digit millisecond latency – low latency retrieval
- Integrated with IAM for security, authorization and administration
- Low cost and auto scaling capabilities
- Fully Managed in-memory cache for DynamoDB
- 10x performance improvement (from single-digit millisecond latency to microsecond latency) when accessing your DynamoDB tables
- Secure, highly scalable & highly available
- Difference with ElastiCache at the CCP level: DAX is only used for DynamoDB, while ElastiCache can be used for other databases
- Redshift is based on PostgreSQL, but it’s not used for OLTP
- It’s OLAP: online analytical processing (analytics and data warehousing)
- Load data once every hour, not every second
- 10x better performance than other data warehouses, scale to PBs of data
- Columnar storage of data (instead of row based)
- Massively Parallel Query Execution (MPP), highly available
- Pay as you go based on the instances provisioned
- Has a SQL interface for performing the queries
- BI tools such as AWS Quicksight or Tableau integrate with it
- EMR helps create Hadoop clusters (Big Data) to analyze and process vast amounts of data
- The clusters can be made of hundreds of EC2 instances
- Also supports Apache Spark, HBase, Presto, Flink…
- EMR takes care of all the provisioning and configuration
- Auto-scaling and integrated with Spot instances
- Use cases: data processing, machine learning, web indexing, big data…
- Fully serverless database with SQL capabilities
- Used to query data in S3 & output results back to S3
- Pay per query
- Secured through IAM
- Use Case: one-time SQL queries, serverless queries on S3, log analytics
- Serverless machine learning-powered business intelligence service to create interactive dashboards
- Fast, automatically scalable, embeddable, with per-session pricing
- Use cases:
- Business analytics
- Building visualizations
- Perform ad-hoc analysis
- Get business insights using data
- Integrated with RDS, Aurora, Athena, Redshift, S3…
- “AWS-implementation” of MongoDB (which is a NoSQL database)
- Fully managed, highly available with replication across 3 AZs
- DocumentDB storage automatically grows in increments of 10GB, up to 64 TB.
- Automatically scales to workloads with millions of requests per seconds
- Fully managed graph database
- A popular graph dataset would be a social network
- Highly available across 3 AZ, with up to 15 read replicas
- Build and run applications working with highly connected datasets – optimized for these complex and hard queries
- Can store up to billions of relations and query the graph with milliseconds latency
- Highly available with replications across multiple AZs
- Great for knowledge graphs (Wikipedia), fraud detection, recommendation engines, social networking
- A ledger is a book recording financial transactions
- Fully managed, serverless, highly available, replication across 3 AZs
- Used to review the history of all the changes made to your application data over time
- Immutable system: no entry can be removed or modified, cryptographically verifiable
- 2-3x better performance than common ledger blockchain frameworks, manipulate data using SQL
- Difference with Amazon Managed Blockchain: no decentralization component, in accordance with financial regulation rules
- Amazon Managed Blockchain is a managed service to:
- Join public blockchain networks
- or Create your own scalable private network
- Compatible with the frameworks Hyperledger Fabric & Ethereum
- Managed extract, transform, and load (ETL) service
- Useful to prepare and transform data for analytics
- Fully serverless service
- Glue Data Catalog: catalog of datasets
- can be used by Athena, Redshift, EMR
- Quickly and securely migrate databases to AWS, resilient, self-healing
- The source database remains available during the migration
- Supports:
- Homogeneous migrations: e.g., Oracle to Oracle
- Heterogeneous migrations: e.g., Microsoft SQL Server to Aurora
- ECS = Elastic Container Service
- Launch Docker containers on AWS
- You must provision & maintain the infrastructure (the EC2 instances)
- AWS takes care of starting / stopping containers
- Has integrations with the Application Load Balancer
- Launch Docker containers on AWS
- You do not provision the infrastructure (no EC2 instances to manage), so it is simpler!
- Serverless offering
- AWS just runs containers for you based on the CPU / RAM you need
- ECR = Elastic Container Registry
- Private Docker Registry on AWS
- This is where you store your Docker images so they can be run by ECS or Fargate
- Virtual functions: no servers to manage (serverless)!
- Limited by time - short executions
- Scaling is automated!
- Integrated with the whole AWS suite of services
- Event-driven: functions get invoked by AWS when needed
- Integrated with many programming languages
- Easy monitoring through AWS CloudWatch
- Easy to get more resources per function (up to 10GB of RAM!)
- Increasing RAM will also improve CPU and network!
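To show what "event-driven virtual function" means in practice, here is a minimal sketch of a Python handler in the usual `handler(event, context)` shape. The event field `name` and the response layout are assumptions chosen for the example, not something the notes specify.

```python
import json


def handler(event, context):
    """Entry point invoked with an event payload; no server code is needed."""
    # "name" is a hypothetical field in the incoming event.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


# Local call for illustration; in the cloud the service supplies both arguments.
response = handler({"name": "AWS"}, None)
print(response["body"])
```

The function holds no state between calls, which is what lets the platform scale it automatically.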
- Lambda Container Image
- The container image must implement the Lambda Runtime API
- ECS / Fargate is preferred for running arbitrary Docker images
- Pay per call or per duration
- Fully managed service for developers to easily create, publish, maintain, monitor, and secure APIs
- Serverless and scalable
- Supports RESTful APIs and WebSocket APIs
- Support for security, user authentication, API throttling, API keys, monitoring...
- Fully managed batch processing at any scale
- Efficiently run 100,000s of computing batch jobs on AWS
- A “batch” job is a job with a start and an end (opposed to continuous)
- Batch will dynamically launch EC2 instances or Spot Instances
- AWS Batch provisions the right amount of compute / memory
- You submit or schedule batch jobs and AWS Batch does the rest!
- Batch jobs are defined as Docker images and run on ECS
- Helpful for cost optimizations and focusing less on the infrastructure
- Virtual servers, storage, databases, and networking
- Simpler alternative to using EC2, RDS, ELB, EBS, Route 53…
- Great for people with little cloud experience!
- Can setup notifications and monitoring of your Lightsail resources
- Use cases:
- Simple web applications (has templates for LAMP, Nginx, MEAN, Node.js…)
- Websites (templates for WordPress, Magento, Plesk, Joomla)
- Dev / Test environment
- Has high availability but no auto-scaling, limited AWS integrations
- A serverless service is a service where the user doesn't have to maintain an EC2 instance
- Examples of serverless services we have seen till now:
- DynamoDB
- Athena
- Amazon QuickSight
- AWS Glue
- Fargate
- Lambda
- Amazon API Gateway
- CloudFormation is a declarative way of outlining your AWS infrastructure, for any resources (most of them are supported).
- For example, within a CloudFormation template, you say:
- I want a security group
- I want two EC2 instances using this security group
- I want an S3 bucket
- I want a load balancer (ELB) in front of these machines
- Then CloudFormation creates those for you, in the right order, with the exact configuration that you specify
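The idea of creating resources "in the right order" can be illustrated with a dependency sort. This sketch is not CloudFormation's implementation; it only shows how an order can be derived once dependencies are declared. The resource names are hypothetical labels for the example above.

```python
from graphlib import TopologicalSorter

# Each resource maps to the set of resources that must exist before it.
dependencies = {
    "SecurityGroup": set(),
    "Bucket": set(),
    "Instance1": {"SecurityGroup"},
    "Instance2": {"SecurityGroup"},
    "LoadBalancer": {"Instance1", "Instance2"},
}

# static_order() yields every resource after all of its dependencies.
creation_order = list(TopologicalSorter(dependencies).static_order())
print(creation_order)
```

Several valid orders exist (the bucket can be created at any point), which is why a declarative description leaves the ordering to the tool.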
- Infrastructure as code
- No resources are manually created, which is excellent for control
- Changes to the infrastructure are reviewed through code
- Cost
- Each resource within the stack is tagged with an identifier so you can easily see how much a stack costs you
- You can estimate the costs of your resources using the CloudFormation template
- Savings strategy: in Dev, you could automate the deletion of stacks at 5 PM and their recreation at 8 AM, safely
- CloudFormation Stack Designer is a visual representation of all resources that will be created for a CloudFormation template
- Define your cloud infrastructure using a familiar programming language:
- JavaScript/TypeScript, Python, Java, and .NET
- The code is “compiled” into a CloudFormation template (JSON/YAML)
- You can therefore deploy infrastructure and application runtime code together
- Great for Lambda functions
- Great for Docker containers in ECS / EKS
- Elastic Beanstalk is a developer-centric view of deploying an application on AWS
- It uses all the components we’ve seen before: EC2, ASG, ELB, RDS, etc…
- But it’s all in one view that’s easy to make sense of!
- We still have full control over the configuration
- Beanstalk = Platform as a Service (PaaS)
- Beanstalk is free but you pay for the underlying instances
- We want to deploy our application automatically
- Works with EC2 Instances
- Works with On-Premises Servers
- Hybrid service
- Servers / Instances must be provisioned and configured ahead of time with the CodeDeploy Agent
- AWS offering for code repository, like GitHub
- Compiles source code, runs tests, and produces packages that are ready to be deployed
- AWS CI-CD offering, like Jenkins
- Artifact Management System, like Artifactory
- Store & Retrieve code dependencies
- Unified UI to easily manage software development activities in one place
- Online IDE for coding and debugging code
- Helps you manage your EC2 and On-Premises systems at scale
- Another hybrid AWS service
- To perform server configuration using Chef & Puppet
- It’s an alternative to AWS SSM & hence a hybrid AWS service
- Global DNS
- Great to route users to the closest deployment with least latency
- Great for disaster recovery strategies
- In AWS, the most common records are:
- www.google.com => 12.34.56.78 == A record (IPv4)
- www.google.com => 2001:0db8:85a3:0000:0000:8a2e:0370:7334 == AAAA record (IPv6)
- search.google.com => www.google.com == CNAME: hostname to hostname
- example.com => AWS resource == Alias (ex: ELB, CloudFront, S3, RDS, etc…)
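As a rough way to remember the record types, the sketch below guesses which record a target value would need: an IPv4 address maps to A, an IPv6 address to AAAA, and any other hostname is treated as a CNAME. The helper `record_type` is hypothetical, and Alias records cannot be told apart from the value alone, so they are left out.

```python
import ipaddress


def record_type(value: str) -> str:
    """Guess the DNS record type from the target value."""
    try:
        addr = ipaddress.ip_address(value)
    except ValueError:
        # Not an IP address, so treat it as a hostname-to-hostname mapping.
        return "CNAME"
    return "A" if addr.version == 4 else "AAAA"


for target in ("12.34.56.78",
               "2001:0db8:85a3:0000:0000:8a2e:0370:7334",
               "www.google.com"):
    print(target, "->", record_type(target))
```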
- SIMPLE ROUTING POLICY (No Health Check)
- WEIGHTED ROUTING POLICY (Health Check)
- LATENCY ROUTING POLICY (Health Check)
- FAILOVER ROUTING POLICY (Health Check)
- Global CDN - Content Delivery Network
- Replicate part of your application to AWS Edge Locations to decrease latency
- Cache common requests for improved user experience and decreased latency
- DDoS protection (because worldwide), integration with Shield, AWS WAF (Web Application Firewall)
- CloudFront Origin: services from which CloudFront can cache:
- S3 bucket (Enhanced security with CloudFront Origin Access Identity (OAI))
- Custom Origin (HTTP):
- Application Load Balancer
- EC2 instance
- S3 website
- Any HTTP backend you want
- Accelerate global uploads & downloads into Amazon S3
- Improve global application availability and performance using the AWS global network
- 2 Anycast IPs are created for your application and traffic is sent through Edge Locations
- Hybrid Cloud: businesses that keep an on-premises infrastructure alongside a cloud infrastructure
- Therefore, two ways of dealing with IT systems:
- One for the AWS cloud (using the AWS console, CLI, and AWS APIs)
- One for their on-premises infrastructure
- AWS Outposts are “server racks” that offer the same AWS infrastructure, services, APIs & tools to build your own applications on-premises just as in the cloud
- AWS will set up and manage “Outposts Racks” within your on-premises infrastructure and you can start leveraging AWS services on-premises
- You are responsible for the Outposts Rack physical security
- Wavelength Zones are infrastructure deployments embedded within the telecommunications providers' datacenters at the edge of the 5G networks
- Brings AWS services to the edge of the 5G networks
- Example: EC2, EBS, VPC…
- Ultra-low latency applications through 5G networks
- Traffic doesn’t leave the Communication Service Provider’s (CSP) network
- High-bandwidth and secure connection to the parent AWS Region
- No additional charges or service agreements
- Use cases: Smart Cities, ML-assisted diagnostics, Connected Vehicles, Interactive Live Video Streams, AR/VR, Real-time Gaming, …
- Oldest AWS offering (over 10 years old)
- Fully managed service (~serverless), used to decouple applications
- Scales from 1 message per second to 10,000s per second
- Default retention of messages: 4 days, maximum of 14 days
- No limit to how many messages can be in the queue
- Messages are deleted after they’re read by consumers
- Low latency (<10 ms on publish and receive)
- Consumers share the work to read messages & scale horizontally
- The “event publishers” only send messages to one SNS topic
- As many “event subscribers” as we want to listen to the SNS topic notifications
- Each subscriber to the topic will get all the messages
- Up to 10,000,000 subscriptions per topic, 100,000 topics limit
- SNS Subscribers can be:
- HTTP / HTTPS (with delivery retries – how many times)
- Emails, SMS messages, Mobile Notifications
- SQS queues (fan-out pattern), Lambda Functions (write-your-own integration)
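The publish / subscribe behaviour, where every subscriber to a topic receives every message, can be modelled in a few lines. This is an in-memory toy, not the SNS API; the `Topic` class and the two example subscribers are assumptions made for illustration.

```python
from typing import Callable


class Topic:
    """Toy topic: a publisher sends once, every subscriber receives a copy."""

    def __init__(self) -> None:
        self._subscribers: list[Callable[[str], None]] = []

    def subscribe(self, callback: Callable[[str], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, message: str) -> None:
        for callback in self._subscribers:
            callback(message)


topic = Topic()
email_inbox: list[str] = []   # stands in for an email subscriber
queue: list[str] = []         # stands in for a queue subscriber (fan-out)
topic.subscribe(email_inbox.append)
topic.subscribe(queue.append)
topic.publish("order-created")
```

The publisher never needs to know who is listening, which is the decoupling the fan-out pattern relies on.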
- For the exam: Kinesis = real-time big data streaming
- Managed service to collect, process, and analyze real-time streaming data at any scale
- Too detailed for the Cloud Practitioner exam but good to know:
- Kinesis Data Streams: low latency streaming to ingest data at scale from hundreds of thousands of sources
- Kinesis Data Firehose: load streams into S3, Redshift, ElasticSearch, etc…
- Kinesis Data Analytics: perform real-time analytics on streams using SQL
- Kinesis Video Streams: monitor real-time video streams for analytics or ML
- SQS, SNS are “cloud-native” services, and they’re using proprietary protocols from AWS.
- Traditional applications running on-premises may use open protocols such as: MQTT, AMQP, STOMP, Openwire, WSS
- When migrating to the cloud, instead of re-engineering the application to use SQS and SNS, we can use Amazon MQ
- Amazon MQ = managed Apache ActiveMQ
- Amazon MQ doesn’t “scale” as much as SQS / SNS
- Amazon MQ runs on a dedicated machine (not serverless)
- Amazon MQ has both queue feature (~SQS) and topic features (~SNS)
- CloudWatch:
- Metrics: monitor the performance of AWS services and billing metrics
- Alarms: automate notification, perform EC2 action, notify to SNS based on metric
- Logs: collect log files from EC2 instances, servers, Lambda functions…
- Events (or EventBridge): react to events in AWS, or trigger a rule on a schedule
- CloudTrail: audit API calls made within your AWS account
- CloudTrail Insights: automated analysis of your CloudTrail Events
- X-Ray: trace requests made through your distributed applications
- Service Health Dashboard: status of all AWS services across all regions
- Personal Health Dashboard: AWS events that impact your infrastructure
- Amazon CodeGuru: automated code reviews and application performance recommendations
- Metric is a variable to monitor (CPUUtilization, NetworkIn…)
- Metrics have timestamps
- Can create CloudWatch dashboards of metrics
- CloudWatch Billing metric (only us-east-1 region)
- Important Metrics
- EC2 instances: CPU Utilization, Status Checks, Network (not RAM)
- Default metrics every 5 minutes
- Option for Detailed Monitoring ($$$): metrics every 1 minute
- EBS volumes: Disk Read/Writes
- S3 buckets: BucketSizeBytes, NumberOfObjects, AllRequests
- Billing: Total Estimated Charge (only in us-east-1)
- Service Limits: how much you’ve been using a service API
- Custom metrics: push your own metrics
- Alarms are used to trigger notifications for any metric
- Alarm actions:
- Auto Scaling: increase or decrease EC2 instances “desired” count
- EC2 Actions: stop, terminate, reboot or recover an EC2 instance
- SNS notifications: send a notification into an SNS topic
- CloudWatch Logs can collect logs from:
- Elastic Beanstalk: collection of logs from application
- ECS: collection from containers
- AWS Lambda: collection from function logs
- CloudTrail: based on filter
- CloudWatch log agents: on EC2 machines or on-premises servers (hybrid)
- Route53: Log DNS queries
- Enables real-time monitoring of logs
- Adjustable CloudWatch Logs retention
- Type of events:
- Schedule: Cron jobs (scheduled scripts)
- Event Pattern: Event rules to react to a service doing something
- Target: Trigger Lambda functions, send SQS/SNS messages…
- EventBridge is the next evolution of CloudWatch Events
- Default event bus: generated by AWS services (CloudWatch Events)
- Partner event bus: receive events from SaaS service or applications (Zendesk, DataDog, Segment, Auth0…)
- Custom event buses: for your own applications
- Schema Registry: model event schema
- EventBridge has a different name to mark the new capabilities
- The CloudWatch Events name will be replaced with EventBridge
- Provides governance, compliance and audit for your AWS Account
- CloudTrail is enabled by default!
- Get a history of events / API calls made within your AWS Account by:
- Console
- SDK
- CLI
- AWS Services
- Can put logs from CloudTrail into CloudWatch Logs or S3
- A trail can be applied to All Regions (default) or a single Region.
- If a resource is deleted in AWS, investigate CloudTrail first!
- Management Events:
- Operations that are performed on resources in your AWS account
- Examples:
- Configuring security (IAM AttachRolePolicy)
- Configuring rules for routing data (Amazon EC2 CreateSubnet)
- Setting up logging (AWS CloudTrail CreateTrail)
- By default, trails are configured to log management events.
- Can separate Read Events (that don’t modify resources) from Write Events (that may modify resources)
- Data Events:
- By default, data events are not logged (because high volume operations)
- Amazon S3 object-level activity (ex: GetObject, DeleteObject, PutObject): can separate Read and Write Events
- AWS Lambda function execution activity (the Invoke API)
CloudTrail Insights Events
:- Enable CloudTrail Insights to
detect unusual activity
in your account:- inaccurate resource provisioning
- hitting service limits
- Bursts of AWS IAM actions
- Gaps in periodic maintenance activity
- CloudTrail Events Retention:
  - Events are stored for 90 days in CloudTrail
  - To keep events beyond this period, log them to S3 and use Athena
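
A minimal sketch of separating management events from data events, and read events from write events, once CloudTrail logs have been exported. The records below are hypothetical and heavily trimmed; the sketch assumes each record carries the `eventCategory` and `readOnly` fields.

```python
import json

# Hypothetical, trimmed CloudTrail records. Real records carry many more
# fields; only `eventCategory` and `readOnly` are used here.
records_json = """
[
  {"eventSource": "iam.amazonaws.com", "eventName": "AttachRolePolicy",
   "eventCategory": "Management", "readOnly": false},
  {"eventSource": "ec2.amazonaws.com", "eventName": "DescribeSubnets",
   "eventCategory": "Management", "readOnly": true},
  {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
   "eventCategory": "Data", "readOnly": true},
  {"eventSource": "s3.amazonaws.com", "eventName": "DeleteObject",
   "eventCategory": "Data", "readOnly": false}
]
"""


def classify(record: dict) -> str:
    """Label a record as '<category>/<read|write>'."""
    kind = "read" if record["readOnly"] else "write"
    return f'{record["eventCategory"].lower()}/{kind}'


records = json.loads(records_json)
summary = {r["eventName"]: classify(r) for r in records}

# Write events are the first place to look when a resource has disappeared.
write_events = [r["eventName"] for r in records if not r["readOnly"]]

for name, label in summary.items():
    print(f"{name}: {label}")
```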
- Log formats differ across applications and log analysis is hard.
- Debugging: one big monolith “easy”, distributed services “hard”
- No common views of your entire architecture
- An ML-powered service for automated code reviews and application performance recommendations
- Provides two functionalities:
  - CodeGuru Reviewer: automated code reviews for static code analysis (development)
  - CodeGuru Profiler: visibility/recommendations about application performance during runtime (production)
- Shows all regions, all services health
- Shows historical information for each day
- Has an RSS feed you can subscribe to
- AWS Personal Health Dashboard provides alerts and remediation guidance when AWS is experiencing events that may impact you.
- While the Service Health Dashboard displays the general status of AWS services, Personal Health Dashboard gives you a personalized view into the performance and availability of the AWS services underlying your AWS resources.
- The dashboard displays relevant and timely information to help you manage events in progress and provides proactive notification to help you plan for scheduled activities.
- VPC == Virtual Private Cloud
- Private network to deploy your resources in a region
- Subnets allow you to partition your network inside your VPC (availability zone resource)
- A public subnet is a subnet that is accessible from the internet
- A private subnet is a subnet that is not accessible from the internet
- To define access to the internet and between subnets, we use Route Tables
- Internet Gateways help our VPC instances connect with the internet
- Public Subnets have a route to the internet gateway
- NAT Gateways (AWS-managed) & NAT Instances (self-managed) allow your instances in your Private Subnets to access the internet while remaining private
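
A minimal sketch of how a route table decides where traffic goes, assuming the most specific matching route wins. The VPC range `10.0.0.0/16` and the target labels are hypothetical, chosen only to contrast a public subnet with a private one.

```python
import ipaddress


def resolve(route_table: dict, destination: str) -> str:
    """Return the target of the most specific route containing the destination."""
    dest = ipaddress.ip_address(destination)
    candidates = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table.items()
        if dest in ipaddress.ip_network(cidr)
    ]
    if not candidates:
        raise LookupError(f"no route to {destination}")
    # Longest prefix (largest prefixlen) is the most specific route.
    _, target = max(candidates, key=lambda item: item[0].prefixlen)
    return target


# Hypothetical route tables for a VPC that uses 10.0.0.0/16.
public_routes = {"10.0.0.0/16": "local", "0.0.0.0/0": "internet-gateway"}
private_routes = {"10.0.0.0/16": "local", "0.0.0.0/0": "nat-gateway"}

print(resolve(public_routes, "10.0.1.25"))
print(resolve(public_routes, "8.8.8.8"))
print(resolve(private_routes, "8.8.8.8"))
```

Traffic inside the VPC stays local in both cases; only the default route differs between the public and the private subnet.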
- A firewall which controls traffic to and from a subnet
- Can have ALLOW and DENY rules
- Are attached at the Subnet level
- Rules only include IP addresses
- Stateless: Return traffic should have explicit permission
- A firewall that controls traffic to and from an ENI / an EC2 Instance
- Can have only ALLOW rules
- Rules include IP addresses and other security groups
- Stateful: Return traffic is allowed
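
A toy model of the stateless/stateful difference. It is not how AWS implements either firewall: rules are reduced to sets of allowed destination ports, and the class names, addresses and ports are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Packet:
    src: str
    src_port: int
    dst: str
    dst_port: int


def reply_to(packet: Packet) -> Packet:
    """Build the return packet for a request (source and destination swapped)."""
    return Packet(packet.dst, packet.dst_port, packet.src, packet.src_port)


@dataclass
class StatelessFirewall:
    """Subnet-level style: every packet is checked against the rules."""
    inbound_ports: set
    outbound_ports: set

    def inbound(self, packet: Packet) -> bool:
        return packet.dst_port in self.inbound_ports

    def outbound(self, packet: Packet) -> bool:
        return packet.dst_port in self.outbound_ports


@dataclass
class StatefulFirewall:
    """Instance-level style: replies to accepted connections are let through."""
    inbound_ports: set
    outbound_ports: set
    tracked: set = field(default_factory=set)

    def inbound(self, packet: Packet) -> bool:
        if packet.dst_port in self.inbound_ports:
            self.tracked.add(packet)  # remember the accepted connection
            return True
        return False

    def outbound(self, packet: Packet) -> bool:
        if reply_to(packet) in self.tracked:
            return True  # return traffic for a tracked connection
        return packet.dst_port in self.outbound_ports


request = Packet("203.0.113.10", 50000, "10.0.1.5", 443)
response = reply_to(request)

stateless = StatelessFirewall(inbound_ports={443}, outbound_ports=set())
stateful = StatefulFirewall(inbound_ports={443}, outbound_ports=set())

print(stateless.inbound(request), stateless.outbound(response))
print(stateful.inbound(request), stateful.outbound(response))
```

With identical rules, the stateless firewall drops the response until an outbound rule covers the client's port, while the stateful one lets it through.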
- Capture information about IP traffic going into your interfaces:
- VPC Flow Logs
- Subnet Flow Logs
- Elastic Network Interface Flow Logs
- VPC Flow logs data can go to S3 / CloudWatch Logs
- Connect two VPCs privately using AWS’ network
- Must not have overlapping CIDR (IP address range)
- VPC Peering connection is not transitive (must be established for each pair of VPCs that need to communicate with one another)
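
A minimal sketch of the two peering constraints using Python's standard `ipaddress` module. The VPC names and CIDR ranges are hypothetical.

```python
import ipaddress
from itertools import combinations


def can_peer(cidr_a: str, cidr_b: str) -> bool:
    """Two VPCs can be peered only if their CIDR ranges do not overlap."""
    return not ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


def peering_connections_needed(vpc_names) -> list:
    """Peering is not transitive, so a full mesh needs one connection per pair."""
    return list(combinations(sorted(vpc_names), 2))


# Hypothetical VPCs.
vpcs = {
    "vpc-a": "10.0.0.0/16",
    "vpc-b": "10.1.0.0/16",
    "vpc-c": "10.0.128.0/17",  # sits inside vpc-a's range
}

print(can_peer(vpcs["vpc-a"], vpcs["vpc-b"]))
print(can_peer(vpcs["vpc-a"], vpcs["vpc-c"]))
print(peering_connections_needed(["vpc-a", "vpc-b", "vpc-d"]))
```

The pair count grows quadratically with the number of VPCs, which is why a hub-and-spoke design becomes attractive at scale.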
- Endpoints allow you to connect to AWS Services using a private network instead of the public www network
- This gives you enhanced security and lower latency to access AWS services
- VPC Endpoint Gateway: S3 & DynamoDB
- VPC Endpoint Interface: the rest
- Connect an on-premises VPN to AWS
- The connection is automatically encrypted
- Goes over the public internet
- Establish a physical connection between on-premises and AWS
- The connection is private, secure and fast
- Goes over a private network
- Takes at least a month to establish
- For having transitive peering between thousands of VPCs and on-premises, hub-and-spoke (star) connection
- One single Gateway to provide this functionality
- Works with Direct Connect Gateway, VPN connections
- AWS Shield Standard: protects against DDoS attacks for your website and applications, for all customers at no additional cost
- AWS Shield Advanced: 24/7 premium DDoS protection
- AWS WAF – Web Application Firewall:
  - Filter specific requests based on rules
  - Layer 7 (HTTP) protection
  - Deploy on ALB, API Gateway, CloudFront
- CloudFront (edge location cache) and Route 53 (DNS service):
  - Availability protection using global edge network
  - Combined with AWS Shield, provides attack mitigation at the edge
- Be ready to scale – leverage AWS Auto Scaling
- Penetration Testing can be carried out for 8 services without AWS permission:
  - Amazon EC2 instances, NAT Gateways, and Elastic Load Balancers
  - Amazon RDS
  - Amazon CloudFront
  - Amazon Aurora
  - Amazon API Gateways
  - AWS Lambda and Lambda Edge functions
  - Amazon Lightsail resources
  - Amazon Elastic Beanstalk environments
- KMS = AWS manages the encryption keys for us
- Encryption Opt-in:
  - EBS volumes: encrypt volumes
  - S3 buckets: Server-side encryption of objects
  - Redshift database: encryption of data
  - RDS database: encryption of data
  - EFS drives: encryption of data
- Encryption automatically enabled:
  - CloudTrail Logs
  - S3 Glacier
  - Storage Gateway
- Dedicated hardware provided to user for encryption
- You manage your own encryption keys entirely (not AWS)
- HSM == Hardware Security Module
- Customer Managed CMK:
  - Created, managed and used by the customer, can enable or disable
- AWS managed CMK:
  - Created, managed and used on the customer’s behalf by AWS
- AWS owned CMK:
  - Collection of CMKs that an AWS service owns and manages to use in multiple accounts
- CloudHSM Keys (custom keystore):
  - Keys generated from your own CloudHSM hardware device
- Lets you easily provision, manage, and deploy SSL/TLS Certificates
- Keep passwords or any other secret
- Mostly meant for integration with Amazon RDS
- Portal that provides customers with on-demand access to AWS compliance documentation and AWS agreements
- Intelligent Threat discovery to Protect AWS Account
- Uses Machine Learning algorithms, anomaly detection, 3rd party data
- Can set up CloudWatch Event rules to be notified in case of findings
- Automated Security Assessments for EC2 instances
- AWS Inspector Agent must be installed on OS in EC2 instances
- Helps with auditing and recording compliance of your AWS resources by recording configuration changes over time
- Have to apply config rules
- Can send SNS notifications
- Per-region service but can be aggregated over multiple regions and accounts
- Amazon Macie is a fully managed data security and data privacy service that uses machine learning and pattern matching to discover and protect your sensitive data in AWS
- Central security tool to manage security across several AWS accounts and automate security checks
- Integrated dashboards showing current security and compliance status to quickly take actions
- Amazon Detective analyzes, investigates, and quickly identifies the root cause of security issues or suspicious activities (using ML and graphs)
- Report suspected AWS resources used for abusive or illegal purposes
- Actions that can be performed only by the root user:
  - Change account settings (account name, email address, root user password, root user access keys)
  - View certain tax invoices
  - Close your AWS account
  - Restore IAM user permissions
  - Change or cancel your AWS Support plan
  - Register as a seller in the Reserved Instance Marketplace
  - Configure an Amazon S3 bucket to enable MFA
  - Edit or delete an Amazon S3 bucket policy that includes an invalid VPC ID or VPC endpoint ID
  - Sign up for GovCloud
- Amazon Rekognition: Face detection, labeling, celebrity recognition
- Amazon Transcribe: audio to text
- Amazon Polly: text to audio
- Amazon Translate: translations
- Amazon Lex: build conversational bots – chatbots
- Amazon Connect: cloud contact center
- Amazon Comprehend: natural language processing
- Amazon SageMaker: Create and deploy ML models
- Amazon Forecast: build highly accurate forecasts
- Amazon Kendra: ML-powered document search engine
- Amazon Personalize: real-time personalized recommendations
- Allows you to manage multiple AWS accounts
- The main account is the master account
- Cost Benefits:
  - Consolidated Billing across all accounts - single payment method
  - Pricing benefits from aggregated usage (volume discount for EC2, S3…)
  - Pooling of Reserved EC2 instances for optimal savings
- Restrict account privileges using Service Control Policies (SCP)
- Organizational Units (OU) are groups of AWS accounts under the master account
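
A toy model of how an SCP restricts account privileges. It assumes a deny-list style SCP, where everything is allowed at the organization level except what the SCP explicitly denies. The function name and the sample policies are hypothetical, and real policy evaluation has more steps than this.

```python
from fnmatch import fnmatchcase


def is_permitted(action: str, iam_allowed: list, scp_denied: list) -> bool:
    """An action needs an IAM allow, and no SCP deny may match it.

    Simplification: patterns use shell-style wildcards and are matched
    case-sensitively. An SCP never grants a permission; it only removes some.
    """
    allowed = any(fnmatchcase(action, pattern) for pattern in iam_allowed)
    denied = any(fnmatchcase(action, pattern) for pattern in scp_denied)
    return allowed and not denied


# Hypothetical setup: the SCP attached to the account's OU denies all EC2 actions.
scp_denied = ["ec2:*"]
admin_policy = ["*"]         # IAM policy of an administrator in that account
s3_reader_policy = ["s3:Get*"]

print(is_permitted("ec2:RunInstances", admin_policy, scp_denied))
print(is_permitted("s3:GetObject", s3_reader_policy, scp_denied))
print(is_permitted("s3:PutObject", s3_reader_policy, scp_denied))
```

Even an administrator inside the account cannot use a service the SCP denies, and the absence of a deny does not by itself grant anything.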
- Easy way to set up and govern a secure and compliant multi-account AWS environment based on best practices
- AWS Control Tower runs on top of AWS Organizations
- It automatically sets up AWS Organizations to organize accounts and implement SCPs (Service Control Policies)
- Pay as you go
- Save when you reserve
- Pay less by using more
- Pay less as AWS grows
- IAM
- VPC
- Consolidated Billing
- Elastic Beanstalk
- CloudFormation
- Auto Scaling Groups
- Reduce costs and improve performance by recommending optimal AWS resources for your workloads
- Uses Machine Learning to analyze your resources' configurations and their utilization CloudWatch metrics
- TCO == Total Cost of Ownership
- Provides a detailed report for presentation comparing the benefit of using AWS cloud services over using an on-premises datacenter
- Estimate the cost for your architecture solution
- Dashboard displaying detailed bill for all your usage
- Use cost allocation tags to track your AWS costs on a detailed level
- Types:
  - AWS generated tags
  - User-defined tags
- The AWS Cost & Usage Report contains the most comprehensive set of AWS cost and usage data available, including additional metadata about AWS services, pricing, and reservations (e.g., Amazon EC2 Reserved Instances (RIs))
- Provides helpful cost and usage insights
- Forecast usage up to 12 months based on previous usage
- CloudWatch Billing Alarm is intended as a simple alarm (not as powerful as AWS Budgets)
- Create a budget and send alarms when costs exceed the budget or forecasted budget
- 3 types of budgets: Usage, Cost, Reservation
- Analyzes your AWS accounts and provides recommendations for:
  - Cost optimization
  - Performance
  - Security
  - Fault Tolerance
  - Service Limits
- Full Trusted Advisor – Available for Business & Enterprise support plans
- Basic: Free
- Developer: Business hours email access
- Business: 24x7 phone, email, and chat access
- Enterprise: Technical Account Manager, Concierge Support Team
- Enables you to create temporary, limited-privilege credentials to access your AWS resources
- Identity for your Web and Mobile applications users (potentially millions)
- Same as Microsoft Active Directory: log in to any machine using the same credentials
- To access multiple accounts and 3rd-party business applications
- Virtual Desktop Infrastructure
- Stream a desktop application to web browsers
- Allows you to configure an instance type per application type (CPU, RAM, GPU)
- Create and run virtual reality (VR), augmented reality (AR), and 3D applications
- Can be used to quickly create 3D models with animations
- AWS IoT Core allows you to easily connect IoT devices to the AWS Cloud
- Convert media files stored in S3 into media files in the formats required by consumer playback devices (phones etc.)
- Fully-managed service that tests your web and mobile apps against desktop browsers, real mobile devices, and tablets
- Fully-managed service to centrally manage and automate backups across AWS services
- Quickly and easily recover your physical, virtual, and cloud-based servers into AWS
- Continuous block-level replication for your servers
- 5 Pillars:
  - Operational Excellence
  - Security
  - Reliability
  - Performance Efficiency
  - Cost Optimization
- Free tool to review your architectures against the 5 pillars of Well-Architected Framework and adopt architectural best practices