Vivek Soni1 committed May 25, 2021 (0 parents, commit 49424a7)
Showing 15 changed files with 3,603 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,129 @@ | ||
# Setting up Primary and Secondary AWS MSK Clusters | ||
|
||
This document provides step-by-step guidance on creating and configuring AWS MSK clusters in two different regions. | ||
|
||
![arch_diag](../images/arch_diag.png) | ||
|
||
On this page | ||
|
||
<!-- @import "[TOC]" {cmd="toc" depthFrom=2 depthTo=6 orderedList=false} --> | ||
|
||
<!-- code_chunk_output --> | ||
|
||
- [Before You Begin](#before-you-begin) | ||
- [Setting up the Primary MSK Cluster](#setting-up-the-primary-msk-cluster) | ||
- [Creating the MSK Cluster](#creating-the-msk-cluster) | ||
- [Prerequisite](#prerequisite) | ||
- [Provision MSK Cluster](#provision-msk-cluster) | ||
- [Provision EC2 Client Machine](#provision-ec2-client-machine) | ||
- [Configure EC2 Client Machine](#configure-ec2-client-machine) | ||
- [Configure VPC SG Inbound Rules](#configure-vpc-sg-inbound-rules) | ||
- [Setting up SASL SCRAM Authentication](#setting-up-sasl-scram-authentication) | ||
- [Setting up the Secondary MSK Cluster](#setting-up-the-secondary-msk-cluster) | ||
- [Connecting the Primary and Secondary AWS Regions](#connecting-the-primary-and-secondary-aws-regions) | ||
- [Next](#next) | ||
- [Resources](#resources) | ||
|
||
<!-- /code_chunk_output --> | ||
## Before You Begin | ||
|
||
|
||
* Configure your aws-cli with an IAM user having *Programmatic access* enabled. This user must have appropriate rights to manage the required services. | ||
|
||
|
||
|
||
For the purpose of this POC, you can attach the *PowerUserAccess* policy to your IAM user. | ||
|
||
> **_NOTE:_** The PowerUserAccess policy allows the user to create and configure resources and services that support AWS-aware application development. | ||
* Create a new IAM role named ```MSK_EC2_ROLE``` and attach the *PowerUserAccess* policy to it. | ||
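
The role creation step can also be scripted. The following is a minimal sketch, assuming the aws-cli is already configured. The file name `ec2-trust-policy.json`, the helper name `create_msk_ec2_role`, and the reuse of the role name for the instance profile are illustrative choices, not part of the original setup. Creating IAM roles typically requires IAM permissions that *PowerUserAccess* does not include, so these commands may need to be run by an administrator.

```shell
# Trust policy that lets EC2 instances assume the role.
cat > ec2-trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

# Hypothetical helper: creates MSK_EC2_ROLE, attaches PowerUserAccess,
# and wraps the role in an instance profile so that EC2 can use it.
create_msk_ec2_role() {
  aws iam create-role --role-name MSK_EC2_ROLE \
    --assume-role-policy-document file://ec2-trust-policy.json &&
  aws iam attach-role-policy --role-name MSK_EC2_ROLE \
    --policy-arn arn:aws:iam::aws:policy/PowerUserAccess &&
  aws iam create-instance-profile --instance-profile-name MSK_EC2_ROLE &&
  aws iam add-role-to-instance-profile --instance-profile-name MSK_EC2_ROLE \
    --role-name MSK_EC2_ROLE
}
```

Run `create_msk_ec2_role` once; the AWS Web Console creates the instance profile for you when a role is created there, which is why the CLI sketch has the two extra steps.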
|
||
|
||
|
||
## Setting up the Primary MSK Cluster | ||
|
||
> **_NOTE:_** Most of these steps are possible using the AWS Web Console. It's not mandatory to use the aws-cli for provisioning these services. | ||
### Creating the MSK Cluster | ||
Refer to the getting started guide to set up an AWS MSK cluster in the selected region: | ||
https://docs.aws.amazon.com/msk/latest/developerguide/getting-started.html | ||
|
||
#### Prerequisite | ||
* Create a new VPC using the VPC Wizard, selecting the *VPC with a Single Public Subnet* option | ||
* Add subnets in other Availability Zones (AZs) to this VPC so that it spans at least 3 AZs. Refer to the getting started guide for this process. | ||
* Make a note of the VPC ID, VPC CIDR range, and the associated default Security Group (SG) ID of the VPC | ||
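
For readers who prefer the aws-cli, the VPC and subnet steps above can be sketched as follows. This is an assumption-laden outline, not the wizard's exact behavior: the helper name `create_msk_vpc` and its argument format are hypothetical, and unlike the *VPC with a Single Public Subnet* wizard option it does not create an internet gateway or public route table.

```shell
# Hypothetical helper: create_msk_vpc <vpc-cidr> <az:subnet-cidr>...
# Prints the VPC ID, one subnet ID per AZ, and the default SG ID.
create_msk_vpc() {
  vpc_cidr=$1; shift
  vpc_id=$(aws ec2 create-vpc --cidr-block "$vpc_cidr" \
    --query 'Vpc.VpcId' --output text) || return 1
  echo "VPC: $vpc_id"
  for pair in "$@"; do
    az=${pair%%:*}      # text before the first ":"
    cidr=${pair#*:}     # text after the first ":"
    aws ec2 create-subnet --vpc-id "$vpc_id" --cidr-block "$cidr" \
      --availability-zone "$az" \
      --query 'Subnet.SubnetId' --output text || return 1
  done
  # Look up the default security group that AWS created with the VPC.
  aws ec2 describe-security-groups \
    --filters "Name=vpc-id,Values=$vpc_id" "Name=group-name,Values=default" \
    --query 'SecurityGroups[0].GroupId' --output text
}
```

An example call, with illustrative AZ names and CIDR blocks, would be `create_msk_vpc 10.0.0.0/16 us-east-1a:10.0.0.0/24 us-east-1b:10.0.1.0/24 us-east-1c:10.0.2.0/24`.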
|
||
#### Provision MSK Cluster | ||
Choose the *Custom Create* option instead of *Quick Create* when creating the MSK cluster. Custom Create gives more control over the configuration. | ||
* Select the VPC, Availability Zones, and Security Group created in the prerequisite step | ||
* Under the *Access control method* section, select *SASL/SCRAM authentication* | ||
* Under the Monitoring section, in the *Amazon CloudWatch metrics for this cluster* subsection, select *Enhanced partition-level monitoring* | ||
* Under *Open monitoring with Prometheus*, check *Enable open monitoring with Prometheus* | ||
* Leave the rest of the settings on the page as is and click *Create Cluster* | ||
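
The console choices above map onto `aws kafka create-cluster` options. The sketch below is one possible equivalent, under stated assumptions: the subnet and security group IDs are placeholders, and the instance type `kafka.m5.large`, the broker count of 3, the Kafka version `2.7.1`, and the helper name `create_primary_msk_cluster` are illustrative rather than taken from the original guide.

```shell
# Placeholder IDs: replace with the subnets and default SG noted in the prerequisite step.
cat > broker-node-group-info.json <<'EOF'
{
  "InstanceType": "kafka.m5.large",
  "ClientSubnets": ["subnet-AAAA", "subnet-BBBB", "subnet-CCCC"],
  "SecurityGroups": ["sg-DEFAULT"]
}
EOF

# SASL/SCRAM as the access control method.
cat > client-authentication.json <<'EOF'
{ "Sasl": { "Scram": { "Enabled": true } } }
EOF

# Open monitoring with Prometheus (JMX and node exporters on the brokers).
cat > open-monitoring.json <<'EOF'
{
  "Prometheus": {
    "JmxExporter": { "EnabledInBroker": true },
    "NodeExporter": { "EnabledInBroker": true }
  }
}
EOF

# Hypothetical helper: create_primary_msk_cluster <cluster-name>
create_primary_msk_cluster() {
  aws kafka create-cluster \
    --cluster-name "$1" \
    --kafka-version "2.7.1" \
    --number-of-broker-nodes 3 \
    --broker-node-group-info file://broker-node-group-info.json \
    --client-authentication file://client-authentication.json \
    --enhanced-monitoring PER_TOPIC_PER_PARTITION \
    --open-monitoring file://open-monitoring.json
}
```

`PER_TOPIC_PER_PARTITION` is the API value that corresponds to *Enhanced partition-level monitoring* in the console.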
|
||
### Provision EC2 Client Machine | ||
Follow the steps listed here to provision an EC2 instance: | ||
https://docs.aws.amazon.com/efs/latest/ug/gs-step-one-create-ec2-resources.html | ||
|
||
While following the EC2 instance creation steps, make sure to apply the following three instructions: | ||
* Select the VPC and one of the AZs created in the prerequisite section | ||
* Assign the ```MSK_EC2_ROLE``` IAM role created above to the EC2 instance | ||
* Create a new security group for this VPC and name it *msk_client_sg* | ||
|
||
### Configure EC2 Client Machine | ||
Once provisioned, follow the below steps: | ||
* SSH into the instance | ||
* Install Java 8 by running | ||
|
||
```sudo yum install java-1.8.0``` | ||
* Download and extract Apache Kafka 2.7.1 binaries | ||
``` | ||
# closer.cgi serves an HTML mirror-selection page, so fetch the archive directly | ||
wget https://archive.apache.org/dist/kafka/2.7.1/kafka_2.13-2.7.1.tgz | ||
tar -xzf kafka_2.13-2.7.1.tgz | ||
rm kafka_2.13-2.7.1.tgz | ||
``` | ||
### Configure VPC SG Inbound Rules | ||
Add an inbound rule to the VPC default SG to allow *All traffic* from the *msk_client_sg* security group. | ||
Refer to this link for the steps: https://docs.aws.amazon.com/vpc/latest/userguide/VPC_SecurityGroups.html | ||
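
The same rule can be added from the aws-cli. A minimal sketch, assuming you have both security group IDs at hand; the helper name `allow_all_from_sg` is hypothetical.

```shell
# Hypothetical helper: allow_all_from_sg <target-sg-id> <source-sg-id>
# IpProtocol=-1 means all protocols and ports ("All traffic" in the console).
allow_all_from_sg() {
  aws ec2 authorize-security-group-ingress \
    --group-id "$1" \
    --ip-permissions "IpProtocol=-1,UserIdGroupPairs=[{GroupId=$2}]"
}
```

Pass the VPC default SG ID as the first argument and the ID of *msk_client_sg* as the second.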
### Setting up SASL SCRAM Authentication | ||
* Create a new symmetric KMS customer master key (CMK) following the steps documented here: | ||
https://docs.aws.amazon.com/kms/latest/developerguide/create-keys.html#create-symmetric-cmk | ||
* Give your IAM user access to manage this key | ||
* Follow this link, which describes how to configure SASL/SCRAM (Simple Authentication and Security Layer / Salted Challenge Response Authentication Mechanism) authentication on AWS MSK: | ||
https://docs.aws.amazon.com/msk/latest/developerguide/msk-password.html | ||
> **_NOTE:_** Use the KMS key you created above, not the DefaultEncryptionKey, to encrypt the `AmazonMSK_*` secret. MSK cannot use a secret that is encrypted with the default Secrets Manager key. | ||
* On the same page, the section *Connecting to your cluster with a username and password* describes how to authenticate your client and connect to the MSK cluster to perform Kafka operations | ||
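
The secret and client configuration steps above can be sketched with the aws-cli as follows. This is an illustration under assumptions, not the linked guide's exact procedure: the helper names, the default file name `client_sasl.properties`, and the truststore path `/tmp/kafka.client.truststore.jks` are choices made for this example, and the simple JSON interpolation assumes the username and password contain no quote or backslash characters.

```shell
# Hypothetical helper:
# create_scram_secret <secret-suffix> <kms-key-id> <username> <password> <cluster-arn>
# The secret name must start with "AmazonMSK_" for MSK to accept it.
create_scram_secret() {
  secret_arn=$(aws secretsmanager create-secret \
    --name "AmazonMSK_$1" \
    --kms-key-id "$2" \
    --secret-string "{\"username\":\"$3\",\"password\":\"$4\"}" \
    --query 'ARN' --output text) || return 1
  aws kafka batch-associate-scram-secret \
    --cluster-arn "$5" --secret-arn-list "$secret_arn"
}

# Hypothetical helper: write_client_properties <username> <password> [file]
# Writes a Kafka client config for SASL/SCRAM over TLS.
write_client_properties() {
  cat > "${3:-client_sasl.properties}" <<EOF
security.protocol=SASL_SSL
sasl.mechanism=SCRAM-SHA-512
ssl.truststore.location=/tmp/kafka.client.truststore.jks
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="$1" password="$2";
EOF
}
```

The generated properties file can then be passed to the Kafka command-line tools, for example through the `--command-config` option of `kafka-topics.sh`.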
## Setting up the Secondary MSK Cluster | ||
All the steps in the *Setting up the Primary MSK Cluster* section also apply when creating the secondary MSK cluster. | ||
Please review the following points before provisioning the secondary MSK cluster: | ||
* It is assumed that both clusters are in the same AWS account but in different regions | ||
* The CIDR ranges of the VPCs housing the MSK clusters in the two regions must not overlap; otherwise, the VPC peering set up in the next section will not work | ||
e.g. Primary region CIDR: 10.0.0.0/16 and Secondary region CIDR: 172.31.0.0/16 | ||
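
The non-overlap requirement can be checked before the second VPC is created. The following is a small sketch in plain shell arithmetic; the function names are hypothetical, and it handles IPv4 CIDR blocks only.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS; IFS=.
  set -- $1            # split on "." into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# First address of a CIDR block, as an integer.
cidr_start() {
  base=$(ip_to_int "${1%/*}")
  bits=${1#*/}
  mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
  echo $(( $base & $mask ))
}

# Last address of a CIDR block, as an integer.
cidr_end() {
  base=$(ip_to_int "${1%/*}")
  bits=${1#*/}
  mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
  echo $(( ($base & $mask) | (~$mask & 0xFFFFFFFF) ))
}

# Succeeds (exit status 0) when the two CIDR blocks share any address.
cidrs_overlap() {
  s1=$(cidr_start "$1"); e1=$(cidr_end "$1")
  s2=$(cidr_start "$2"); e2=$(cidr_end "$2")
  [ "$s1" -le "$e2" ] && [ "$s2" -le "$e1" ]
}

# Example with the two ranges mentioned above.
if cidrs_overlap 10.0.0.0/16 172.31.0.0/16; then
  echo "CIDR ranges overlap: pick a different range"
else
  echo "CIDR ranges do not overlap"
fi
```

Two ranges overlap exactly when each one starts no later than the other ends, which is the comparison `cidrs_overlap` performs.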
## Connecting the Primary and Secondary AWS Regions | ||
In this section we connect the primary and secondary AWS regions to enable communication between them. | ||
* Follow the instructions at the link below to set up VPC peering between the primary and secondary MSK cluster VPCs | ||
https://docs.aws.amazon.com/vpc/latest/peering/create-vpc-peering-connection.html#create-vpc-peering-connection-local | ||
* Add a new inbound rule to the primary cluster VPC's default SG to allow *All traffic* from the CIDR range of the secondary cluster's VPC | ||
* Add a new inbound rule to the secondary cluster VPC's default SG to allow *All traffic* from the CIDR range of the primary cluster's VPC | ||
Refer to this link for instructions: https://docs.aws.amazon.com/vpc/latest/userguide/VPC_SecurityGroups.html | ||
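
The peering steps above can be sketched with the aws-cli as follows. The helper names and argument order are hypothetical. The sketch also adds a route for the remote CIDR range, which a peering connection needs in each VPC's route table before traffic can flow. Acceptance can fail if it is attempted before the request reaches the pending-acceptance state, so a short wait or retry may be needed in practice.

```shell
# Hypothetical helper:
# peer_vpcs <primary-region> <primary-vpc-id> <secondary-region> <secondary-vpc-id>
# Requests the peering from the primary region and accepts it in the secondary.
peer_vpcs() {
  pcx_id=$(aws ec2 create-vpc-peering-connection \
    --region "$1" --vpc-id "$2" \
    --peer-region "$3" --peer-vpc-id "$4" \
    --query 'VpcPeeringConnection.VpcPeeringConnectionId' --output text) || return 1
  aws ec2 accept-vpc-peering-connection \
    --region "$3" --vpc-peering-connection-id "$pcx_id" >/dev/null || return 1
  echo "$pcx_id"
}

# Hypothetical helper:
# add_peering_route <region> <route-table-id> <remote-vpc-cidr> <pcx-id>
add_peering_route() {
  aws ec2 create-route --region "$1" --route-table-id "$2" \
    --destination-cidr-block "$3" --vpc-peering-connection-id "$4"
}

# Hypothetical helper: allow_all_from_cidr <region> <sg-id> <remote-vpc-cidr>
# IpProtocol=-1 means all protocols and ports ("All traffic" in the console).
allow_all_from_cidr() {
  aws ec2 authorize-security-group-ingress --region "$1" --group-id "$2" \
    --ip-permissions "IpProtocol=-1,IpRanges=[{CidrIp=$3}]"
}
```

Run `add_peering_route` and `allow_all_from_cidr` once per region, each time passing the CIDR range of the VPC in the other region.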
## Next | ||
[Setting Up MirrorMaker2](2_Setting_Up_MirrorMaker2.md) | ||
## Resources | ||
* [Useful Kafka Commands](Useful_Kafka_Commands.md) | ||
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,165 @@ | ||
# Setting up MirrorMaker2 Geo-Replication | ||
Follow these instructions to set up MM2 replication between the primary and secondary AWS MSK clusters. | ||
|
||
The instructions describe the active-passive MM2 topology, in which data from the primary cluster is replicated to the secondary cluster. | ||
|
||
![active-passive](../images/active-passive.png) | ||
|
||
However, this setup can easily be extended to an active-active MM2 replication topology. | ||
|
||
> **_NOTE:_** Apache Kafka 2.7 and above include automated consumer offset sync. With versions below 2.7, MM2 does not synchronize consumer offsets to the target cluster automatically. | ||
On this page | ||
|
||
<!-- @import "[TOC]" {cmd="toc" depthFrom=2 depthTo=6 orderedList=false} --> | ||
|
||
<!-- code_chunk_output --> | ||
|
||
- [Provision MM2 EC2 Instance](#provision-mm2-ec2-instance) | ||
- [Configure VPC SG Inbound Rules](#configure-vpc-sg-inbound-rules) | ||
- [Configure MM2 EC2 Instance](#configure-mm2-ec2-instance) | ||
- [Start MM2 Replication](#start-mm2-replication) | ||
- [Next](#next) | ||
- [Resources](#resources) | ||
|
||
<!-- /code_chunk_output --> | ||
|
||
### Provision MM2 EC2 Instance | ||
Following the principle of remote consume and local produce, we will provision an EC2 instance in the secondary AWS region. | ||
Follow the steps listed here to provision an EC2 instance: | ||
|
||
https://docs.aws.amazon.com/efs/latest/ug/gs-step-one-create-ec2-resources.html | ||
|
||
While following the EC2 instance creation steps, make sure to apply the following three instructions: | ||
* Select the secondary region's MSK VPC and one of its AZs | ||
* Assign the ```MSK_EC2_ROLE``` IAM role created earlier to the EC2 instance | ||
* Create a new security group for this VPC and name it *mm2-sg* | ||
|
||
### Configure VPC SG Inbound Rules | ||
Add an inbound rule to the secondary cluster VPC's default SG to allow *All traffic* from the *mm2-sg* security group. | ||
|
||
### Configure MM2 EC2 Instance | ||
Once provisioned, follow the below steps: | ||
* SSH into the instance | ||
* Install Java 8 by running | ||
|
||
```shell | ||
sudo yum install java-1.8.0 | ||
``` | ||
|
||
* Download and extract Apache Kafka 2.7.1 binaries | ||
```shell | ||
# closer.cgi serves an HTML mirror-selection page, so fetch the archive directly | ||
wget https://archive.apache.org/dist/kafka/2.7.1/kafka_2.13-2.7.1.tgz | ||
tar -xzf kafka_2.13-2.7.1.tgz | ||
rm kafka_2.13-2.7.1.tgz | ||
``` | ||
* Make sure the /tmp folder exists on the client machine, then run the following command, replacing JDKFolder with the name of your JDK folder. The command copies the JDK's default truststore so that MM2 can use it for TLS connections to the brokers. | ||
```shell | ||
cp /usr/lib/jvm/<JDKFolder>/jre/lib/security/cacerts /tmp/kafka.client.truststore.jks | ||
``` | ||
e.g. | ||
```shell | ||
cp /usr/java/jdk1.8.0_141/jre/lib/security/cacerts /tmp/kafka.client.truststore.jks | ||
``` | ||
* Change into the kafka_2.13-2.7.1 directory | ||
* Create a new file named ```mm2.properties``` inside the kafka_2.13-2.7.1 directory | ||
|
||
```shell | ||
vim mm2.properties | ||
``` | ||
|
||
* Set the contents of the file as shown below, replacing every placeholder marked with ```#__#``` with the appropriate value | ||
```shell | ||
# Kafka datacenters. | ||
clusters = source, target | ||
source.bootstrap.servers = #_primary MSK clusters broker addresses_# | ||
target.bootstrap.servers = #_secondary MSK clusters broker addresses_# | ||
# Source and target clusters configurations. | ||
source.config.storage.replication.factor = 3 | ||
target.config.storage.replication.factor = 3 | ||
source.offset.storage.replication.factor = 3 | ||
target.offset.storage.replication.factor = 3 | ||
source.status.storage.replication.factor = 3 | ||
target.status.storage.replication.factor = 3 | ||
# Enable the MM2 flow | ||
source->target.enabled = true | ||
# set this to true to enable bidirectional replication | ||
target->source.enabled = false | ||
# Authentication settings for source MSK | ||
source.security.protocol=SASL_SSL | ||
source.sasl.mechanism=SCRAM-SHA-512 | ||
source.ssl.truststore.location=/tmp/kafka.client.truststore.jks | ||
source.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \ | ||
username="#_USER_NAME_USED_OF_SASL_AUTH_#" \ | ||
password="#_PWD_USED_OF_SASL_AUTH_#"; | ||
# Authentication settings for target MSK | ||
target.security.protocol=SASL_SSL | ||
target.sasl.mechanism=SCRAM-SHA-512 | ||
target.ssl.truststore.location=/tmp/kafka.client.truststore.jks | ||
target.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \ | ||
username="#_USER_NAME_USED_OF_SASL_AUTH_#" \ | ||
password="#_PWD_USED_OF_SASL_AUTH_#"; | ||
# Mirror maker configurations | ||
offset-syncs.topic.replication.factor = 3 | ||
heartbeats.topic.replication.factor = 3 | ||
checkpoints.topic.replication.factor = 3 | ||
# all topics and all consumer groups | ||
topics = .* | ||
groups = .* | ||
topics.blacklist = .*[\-\.]internal, .*\.replica, __consumer_offsets | ||
groups.blacklist = console-consumer-.*, connect-.*, __.* | ||
# number of MM2 tasks to run on this instance | ||
tasks.max = 2 | ||
replication.factor = 3 | ||
refresh.topics.enabled = true | ||
sync.topic.configs.enabled = true | ||
refresh.topics.interval.seconds = 10 | ||
# enable consumer offsets synchronisation | ||
source->target.sync.group.offsets.enabled = true | ||
# target->source.sync.group.offsets.enabled = true | ||
# Enable heartbeats and checkpoints. | ||
source->target.emit.heartbeats.enabled = true | ||
source->target.emit.checkpoints.enabled = true | ||
# target->source.emit.heartbeats.enabled = true | ||
# target->source.emit.checkpoints.enabled = true | ||
``` | ||
Refer to this link for more MM2 configuration options and their descriptions: | ||
https://kafka.apache.org/documentation/#georeplication-mirrormaker | ||
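
Before starting MM2, it can help to confirm that every placeholder has been filled in and that the referenced truststore exists. The following pre-flight check is a hypothetical helper written for this guide, not part of Kafka; it assumes placeholders keep the `#_..._#` form used above and that truststore paths contain no spaces.

```shell
# Hypothetical pre-flight check: mm2_preflight [properties-file]
mm2_preflight() {
  props=${1:-mm2.properties}
  [ -f "$props" ] || { echo "missing $props" >&2; return 1; }
  # Any remaining "#_..._#" marker means a value was not filled in.
  if grep -n '#_.*_#' "$props" >&2; then
    echo "unreplaced placeholders in $props" >&2
    return 1
  fi
  # Every truststore referenced in the file must exist on disk.
  for ts in $(sed -n 's/^[a-z.]*ssl\.truststore\.location *= *//p' "$props" | sort -u); do
    [ -f "$ts" ] || { echo "truststore not found: $ts" >&2; return 1; }
  done
  echo "ok: $props"
}
```

Running `mm2_preflight mm2.properties` from the kafka_2.13-2.7.1 directory prints the offending lines on failure, which is usually quicker than reading the MM2 startup logs.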
|
||
|
||
### Start MM2 Replication | ||
Change into the kafka_2.13-2.7.1 directory, if you are not already there, and run the following command to start replication: | ||
```shell | ||
./bin/connect-mirror-maker.sh mm2.properties | ||
``` | ||
If everything is configured correctly, you should see output similar to the following on the console: | ||
![MM2_Start](../images/MM2_Start.png) | ||
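
One way to confirm that replication is running is to list the topics on the secondary cluster. With MM2's default replication policy, replicated topics are prefixed with the source cluster alias, which is `source` in the configuration above, so a topic named `orders` on the primary would appear as `source.orders` on the secondary. The helper below is a hypothetical sketch; it assumes a client properties file with the SASL/SCRAM settings for the secondary cluster.

```shell
# Hypothetical helper:
# list_replicated_topics <target-bootstrap-servers> <client-properties> [kafka-home]
# Lists topics on the target cluster that carry the "source." prefix.
list_replicated_topics() {
  "${3:-.}/bin/kafka-topics.sh" \
    --bootstrap-server "$1" \
    --command-config "$2" \
    --list | grep '^source\.'
}
```

Run it from the kafka_2.13-2.7.1 directory, or pass the Kafka installation path as the third argument. An empty result shortly after startup may only mean that MM2 has not yet created the remote topics.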
|
||
## Next | ||
[Setting up Prometheus and Grafana](3_Setting_Up_Prometheus.md) | ||
|
||
## Resources | ||
* [Useful Kafka Commands](Useful_Kafka_Commands.md) | ||
* [Kafka 2.7.0 RELEASE_NOTES](https://downloads.apache.org/kafka/2.7.0/RELEASE_NOTES.html) | ||
* [KIP-382: MirrorMaker 2.0](https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0) | ||
* [MM2 Geo-Replication](https://kafka.apache.org/documentation/#georeplication) | ||
* [Migrating Clusters Using Apache Kafka's MirrorMaker](https://docs.aws.amazon.com/msk/latest/developerguide/migration.html) | ||
* [MM2 Setup AWS Lab](https://amazonmsk-labs.workshop.aws/en/migration/overview.html) | ||
* [MM2 Topologies](https://www.instaclustr.com/apache-kafka-mirrormaker-2-practice/#) | ||
* [A look inside Kafka Mirrormaker 2](https://blog.cloudera.com/a-look-inside-kafka-mirrormaker-2/) |