tools: Adds a Sonobuoy conformance testing script #530

Merged
etungsten merged 1 commit into develop from sonobuoy-conformance-script on Dec 9, 2019

etungsten
Contributor

@etungsten etungsten commented Nov 18, 2019

Issue #, if available: N/A

Description of changes:
Adds Sonobuoy conformance testing scripts that spin up an EKS cluster,
run the Sonobuoy conformance tests, retrieve the results once the run
is done, and spin down the EKS cluster.

setup-test-cluster.sh sets up the cluster and generates the env file and user data.
run-conformance-test.sh consumes the generated env file and user data to launch Thar worker nodes and run the Sonobuoy conformance tests.
clean-up-test-cluster.sh cleans up the test cluster specified by the env file.

Borrows several helper functions from the amiize.sh script.
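
The env file handoff between the scripts described above could look roughly like the sketch below. This is only an illustration, not the code in this PR: the function names `write_env_file` and `load_env_file` and the exact variable names stored in the file are assumptions made for the example.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of an env file handoff between a setup script and a
# run script. Function and variable names are illustrative assumptions.

# Write KEY=value pairs that a later script can source.
# %q quotes each value so it survives being read back by the shell.
write_env_file() {
  local env_file="$1" cluster_name="$2" region="$3"
  {
    printf 'CLUSTER_NAME=%q\n' "${cluster_name}"
    printf 'REGION=%q\n' "${region}"
    printf 'USERDATA_FILE=%q\n' "${cluster_name}-user-data.toml"
  } > "${env_file}"
}

# Load the values back, failing with a clear message if the file is missing.
load_env_file() {
  local env_file="$1"
  if [ ! -r "${env_file}" ]; then
    echo "Cannot read env file: ${env_file}" >&2
    return 1
  fi
  # shellcheck source=/dev/null
  . "${env_file}"
}
```

A setup step would call something like `write_env_file sono-test.env sono-test us-west-2`, and the run and clean-up steps would call `load_env_file sono-test.env` before doing anything else.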

Testing:
Running in us-west-2:

Setup cluster:

$ ./setup-test-cluster.sh --region us-west-2 --cluster-name sono-test
Setting up fresh EKS cluster with eksctl
[ℹ]  eksctl version 0.9.0
[ℹ]  using region us-west-2
[ℹ]  setting availability zones to [us-west-2b us-west-2d us-west-2a]
[ℹ]  subnets for us-west-2b - public:192.168.0.0/19 private:192.168.96.0/19
[ℹ]  subnets for us-west-2d - public:192.168.32.0/19 private:192.168.128.0/19
[ℹ]  subnets for us-west-2a - public:192.168.64.0/19 private:192.168.160.0/19
[ℹ]  nodegroup "ng-fe807768" will use "ami-05d586e6f773f6abf" [AmazonLinux2/1.14]
[ℹ]  using Kubernetes version 1.14
[ℹ]  creating EKS cluster "sono-test" in "us-west-2" region
[ℹ]  will create 2 separate CloudFormation stacks for cluster itself and the initial nodegroup
[ℹ]  if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=us-west-2 --cluster=sono-test'
[ℹ]  CloudWatch logging will not be enabled for cluster "sono-test" in "us-west-2"
[ℹ]  you can enable it with 'eksctl utils update-cluster-logging --region=us-west-2 --cluster=sono-test'
[ℹ]  Kubernetes API endpoint access will use default of {publicAccess=true, privateAccess=false} for cluster "sono-test" in "us-west-2"
[ℹ]  2 sequential tasks: { create cluster control plane "sono-test", create nodegroup "ng-fe807768" }
[ℹ]  building cluster stack "eksctl-sono-test-cluster"
[ℹ]  deploying stack "eksctl-sono-test-cluster"
[ℹ]  building nodegroup stack "eksctl-sono-test-nodegroup-ng-fe807768"
[ℹ]  --nodes-min=0 was set automatically for nodegroup ng-fe807768
[ℹ]  --nodes-max=0 was set automatically for nodegroup ng-fe807768
[ℹ]  deploying stack "eksctl-sono-test-nodegroup-ng-fe807768"
[✔]  all EKS cluster resources for "sono-test" have been created
[✔]  saved kubeconfig as "/home/ANT.AMAZON.COM/etung/.kube/config"
[ℹ]  adding identity "arn:aws:iam::722737851570:role/eksctl-sono-test-nodegroup-ng-fe8-NodeInstanceRole-1R4WLQLMF3FMT" to auth ConfigMap
[ℹ]  kubectl command should work with "/home/ANT.AMAZON.COM/etung/.kube/config", try 'kubectl get nodes'
[✔]  EKS cluster "sono-test" in "us-west-2" region is ready
Writing kubeconfig for sono-test to sono-test-config
[ℹ]  eksctl version 0.9.0
[ℹ]  using region us-west-2
[✔]  saved kubeconfig as "sono-test-config"
Updating the CNI plugin version
daemonset.extensions/aws-node patched
Generating userdata file for launching Thar worker nodes
Getting instance profile for use in instance launches
Setting up security groups
Generating env file for launching Thar worker nodes
Finished setting up test EKS cluster.
Userdata file: sono-test-user-data.toml
Env file: sono-test.env

Run conformance tests:

$ ./run-conformance-test.sh --cluster-env-file sono-test.env --node-ami ami-07245e9300b9290c1 --instance-type m5.large
Launching 3 Thar worker nodes
Waiting for all Thar worker nodes to become 'Ready' in sono-test cluster
ready: 1
ready: 1
ready: 2
ready: 3
Starting Sonobuoy Kubernetes conformance test! Test may take up to 60 minutes to finish
INFO[0000] created object                                name=sonobuoy namespace= resource=namespaces
INFO[0000] created object                                name=sonobuoy-serviceaccount namespace=sonobuoy resource=serviceaccounts
INFO[0000] created object                                name=sonobuoy-serviceaccount-sonobuoy namespace= resource=clusterrolebindings
INFO[0000] created object                                name=sonobuoy-serviceaccount namespace= resource=clusterroles
INFO[0000] created object                                name=sonobuoy-config-cm namespace=sonobuoy resource=configmaps
INFO[0000] created object                                name=sonobuoy-plugins-cm namespace=sonobuoy resource=configmaps
INFO[0000] created object                                name=sonobuoy namespace=sonobuoy resource=pods
INFO[0000] created object                                name=sonobuoy-master namespace=sonobuoy resource=services
         PLUGIN     STATUS   RESULT   COUNT
            e2e   complete   passed       1
   systemd-logs   complete   passed       1
   systemd-logs   complete                2

Sonobuoy has completed. Use `sonobuoy retrieve` to get results.
Plugin: e2e
Status: passed
Total: 3586
Passed: 204
Failed: 0
Skipped: 3382

Plugin: systemd-logs
Status: passed
Total: 3
Passed: 3
Failed: 0
Skipped: 0
Sonobuoy test results available at sono-test-conformance-test-results/201911292327_sonobuoy_89e4a4ff-1e63-4ec4-9500-307a9c73cdbb.tar.gz
Cleaning up Sonobuoy namespace, may take up to 20 minutes
INFO[0000] deleted                                       kind=namespace namespace=sonobuoy
INFO[0000] deleted                                       kind=clusterrolebindings
INFO[0000] deleted                                       kind=clusterroles
Cleaning up Thar worker node instances
TERMINATINGINSTANCES	i-0270f18beddc66746
CURRENTSTATE	32	shutting-down
PREVIOUSSTATE	16	running
TERMINATINGINSTANCES	i-0d4343f1475f3c379
CURRENTSTATE	32	shutting-down
PREVIOUSSTATE	16	running
TERMINATINGINSTANCES	i-0ba340884d9bd4f3b
CURRENTSTATE	32	shutting-down
PREVIOUSSTATE	16	running
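
The "ready: N" lines in the run above suggest a polling loop that counts Ready nodes until the expected number is reached. A minimal sketch of such a loop follows; it is an assumption about the approach, not the script's actual code, and the function names, the attempt limit, and the sleep interval are all hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: poll until the expected number of nodes report Ready.

# Count nodes whose STATUS column (second field) is exactly "Ready".
count_ready_nodes() {
  local kubeconfig="$1"
  kubectl --kubeconfig "${kubeconfig}" get nodes --no-headers 2>/dev/null \
    | awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Wait for `expected` Ready nodes, checking every `interval` seconds and
# giving up after `max_attempts` checks.
wait_for_ready_nodes() {
  local kubeconfig="$1" expected="$2" max_attempts="${3:-60}" interval="${4:-10}"
  local attempt=0 ready=0
  while [ "${attempt}" -lt "${max_attempts}" ]; do
    ready="$(count_ready_nodes "${kubeconfig}")"
    echo "ready: ${ready}"
    if [ "${ready}" -ge "${expected}" ]; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "${interval}"
  done
  echo "Timed out waiting for ${expected} Ready nodes" >&2
  return 1
}
```

For the run shown here the call would be along the lines of `wait_for_ready_nodes sono-test-config 3`. Bounding the number of attempts keeps a failed node launch from hanging the script forever.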

Clean up cluster:

$ ./clean-up-test-cluster.sh --cluster-env-file sono-test.env
Removing security group dependencies.
Deleting the test cluster.
[ℹ]  eksctl version 0.9.0
[ℹ]  using region us-west-2
[ℹ]  deleting EKS cluster "sono-test"
[✔]  kubeconfig has been updated
[ℹ]  cleaning up LoadBalancer services
[ℹ]  2 sequential tasks: { 2 parallel sub-tasks: { cleanup for nodegroup "ng-f98bc379", delete nodegroup "ng-f98bc379" }, delete cluster control plane "sono-test" [async] }
[ℹ]  trying to cleanup dangling network interfaces
[ℹ]  will delete stack "eksctl-sono-test-nodegroup-ng-f98bc379"
[ℹ]  waiting for stack "eksctl-sono-test-nodegroup-ng-f98bc379" to get deleted
[ℹ]  will delete stack "eksctl-sono-test-cluster"
[✔]  all cluster resources were deleted
Deleting env file, userdata file, and kubeconfig file
Clean up done.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@tjkirch
Contributor

tjkirch commented Nov 19, 2019

Talked with @etungsten a bit outside of the PR. I think we should aim for separating the setup and run portions. That way developers can leave the infrastructure running and just do a quick test run when desired, and CI can spin up a fresh cluster as needed.

@etungsten
Contributor Author

etungsten commented Nov 26, 2019

Split the monolithic script into three parts: setup-test-cluster.sh, run-conformance-test.sh and clean-up-test-cluster.sh.
Adds a README to help clarify usage.

@tjkirch
Contributor

tjkirch commented Nov 26, 2019

Would you please update the testing for the new script separation? (I don't think we need to see two regions; your assurance that it works the same is enough there :))

@etungsten
Contributor Author

etungsten commented Nov 26, 2019

Updated testing description.
Minor fixes to clean-up-test-cluster.sh


echo "Setting up fresh EKS cluster with eksctl"
eks_cluster_creation_attempted=true
eksctl create cluster -r "${REGION}" --zones "${ZONES}" -n "${CLUSTER_NAME}" --nodes 0
Member

Does --nodes 0 here cause eksctl to set up the nodegroup resources without starting nodes? Does it create an ASG that can later be scaled?

Contributor Author

@etungsten etungsten Nov 26, 2019

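Yes to both! --nodes 0 sets up the nodegroup resources without starting any nodes, and the ASG it creates can be scaled later.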
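
For reference, scaling such a zero-node nodegroup later could be sketched as below. This is not part of the PR; the wrapper name `scale_nodegroup` is hypothetical, and depending on the eksctl version the nodegroup's maximum size may also need to be raised before the desired count takes effect.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: scale an eksctl-created nodegroup that was created
# with --nodes 0. The wrapper name is an illustrative assumption.
scale_nodegroup() {
  local region="$1" cluster="$2" nodegroup="$3" nodes="$4"
  eksctl scale nodegroup \
    --region "${region}" \
    --cluster "${cluster}" \
    --name "${nodegroup}" \
    --nodes "${nodes}"
}
```

With the names from the setup log above, the call would be `scale_nodegroup us-west-2 sono-test ng-fe807768 3`.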

Member

Chatted with @etungsten offline: the ASG aspect won't be useful to us until we have an eksctl in place that has the desired CNI plugin configured (for aws-node) and the TOML userdata generation.

In talking with @tjkirch about how #363 could help reduce scope, I think we came to an understanding that it would be reasonable to rely on a patched version if the upstream gave positive indications that they'd accept thar node-ami-family contributions. Please correct me if I misunderstood @tjkirch 👍

Contributor

> In talking with @tjkirch about how #363 could help reduce scope, I think we came to an understanding that it would be reasonable to rely on a patched version if the upstream gave positive indications that they'd accept thar node-ami-family contributions. Please correct me if I misunderstood @tjkirch

I think that's a minimum requirement, but we should still discuss with the team to make sure we'd want to suggest that (1) users need a patched tool, or (2) we have branching instructions.

@etungsten
Contributor Author

Removes setting up a completely new instance profile.
Patching the ConfigMap is no longer necessary as a result.
Launch Thar nodes using the nodegroup instance profile set up by eksctl.

Contributor

@tjkirch tjkirch left a comment

Partial review - I think GitHub is preventing me from responding to other comments because I have these pending..?

Member

@jahkeup jahkeup left a comment

Looking good! There are a few scripty things I had questions on, as well as Sonobuoy result outputs that I want to understand a bit more.

@etungsten
Contributor Author

etungsten commented Nov 26, 2019

Addresses @tjkirch 's and @jahkeup 's comments.
Testing changes

Force push below fixes missing newline at the end of file.

@etungsten etungsten force-pushed the sonobuoy-conformance-script branch 2 times, most recently from 6318744 to 9b98362 Compare November 26, 2019 22:26
@etungsten
Contributor Author

etungsten commented Nov 26, 2019

Redirect hash output to /dev/null in favour of more explicit error messages.
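
A dependency check in this style might look like the sketch below. It is an assumption about the pattern, not the PR's code; the function name `check_dependencies` and the message wording are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: check for required tools, discarding the output of
# `hash` itself and printing an explicit message for each missing tool.
check_dependencies() {
  local missing=0 tool
  for tool in "$@"; do
    if ! hash "${tool}" >/dev/null 2>&1; then
      echo "Error: '${tool}' is required but was not found in PATH" >&2
      missing=1
    fi
  done
  return "${missing}"
}
```

A script would typically call this once near the top, for example `check_dependencies aws eksctl kubectl sonobuoy || exit 1`, so that every missing tool is reported before the script stops.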

@etungsten etungsten force-pushed the sonobuoy-conformance-script branch 2 times, most recently from 58b9c46 to dc41822 Compare November 27, 2019 00:28
@etungsten etungsten force-pushed the sonobuoy-conformance-script branch 3 times, most recently from 05db246 to d418c97 Compare November 28, 2019 00:13
@tjkirch
Contributor

tjkirch commented Dec 2, 2019

GitHub merged force-pushes; here's the full diff from the changes etungsten made 2019-11-27:

https://github.com/amazonlinux/PRIVATE-thar/compare/231ef76..d418c97

@etungsten
Contributor Author

etungsten commented Dec 2, 2019

Addresses @tjkirch 's comments:

  • Removed mapfile
  • Moved deleting the env files to after cluster deletion completion
  • Capture kubectl --kubeconfig ... and sonobuoy --kubeconfig ... in variables
  • Specify source group ids for the ingress and egress security group rules
  • Added set -o pipefail to where things are being piped
  • Got rid of ec2 wait
  • Moved sonobuoy delete to after instance clean up
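
Three of the patterns in the list above (pipefail, reading lines without mapfile, and keeping the --kubeconfig prefix in a variable) can be sketched as follows. These are illustrations under assumptions, not the PR's code: all function and array names are hypothetical, and the guess that mapfile was dropped for compatibility with older bash versions is not confirmed by the discussion.

```bash
#!/usr/bin/env bash
# Hypothetical sketches of three patterns; all names are illustrative.

# 1. pipefail: without it a pipeline reports only the last command's status,
#    so a failure earlier in the pipe goes unnoticed.
#    $1 is "on" or "off"; a subshell keeps the caller's options untouched.
pipeline_status() {
  (
    if [ "$1" = "on" ]; then set -o pipefail; else set +o pipefail; fi
    status=0
    false | cat || status=$?
    echo "${status}"
  )
}

# 2. Reading one item per line into an array without mapfile (a bash 4+
#    builtin). Fills the global array LINES_READ from standard input.
#    Call as `read_lines < file` or `read_lines < <(command)`; a plain
#    pipe would run the function in a subshell and lose the array.
read_lines() {
  LINES_READ=()
  local line
  while IFS= read -r line || [ -n "${line}" ]; do
    LINES_READ+=("${line}")
  done
}

# 3. Keeping the repeated --kubeconfig prefix in one place. Arrays keep a
#    path containing spaces intact when the command is expanded later.
set_tool_commands() {
  local kubeconfig="$1"
  KUBECTL=(kubectl --kubeconfig "${kubeconfig}")
  SONOBUOY=(sonobuoy --kubeconfig "${kubeconfig}")
}
```

After `set_tool_commands sono-test-config`, later calls read as `"${KUBECTL[@]}" get nodes` or `"${SONOBUOY[@]}" status`, which avoids retyping the kubeconfig path at every call site.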

Testing again to make sure everything still works.

@etungsten
Contributor Author

Addresses @tjkirch 's comments.

@etungsten etungsten force-pushed the sonobuoy-conformance-script branch 2 times, most recently from 7fb4f86 to 91190b8 Compare December 3, 2019 18:08
@etungsten
Contributor Author

etungsten commented Dec 3, 2019

Removes the added security group egress and ingress rules during clean up.
They created an interdependency between the cluster security groups, which caused CloudFormation deletion failures.
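
Revoking rules in which one security group references another could be sketched as below. This is not the PR's code: the function names are hypothetical, and the all-protocol rule shape (IpProtocol=-1) is an assumption, since a revoke call has to match whatever rule was originally authorized.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: revoke rules in which one security group references
# another, so that neither group blocks deletion of the other.
# The all-protocol permission below is an assumed rule shape.

# Revoke an ingress rule on `group_id` whose source is `peer_group_id`.
revoke_ingress_from_group() {
  local region="$1" group_id="$2" peer_group_id="$3"
  aws ec2 revoke-security-group-ingress \
    --region "${region}" \
    --group-id "${group_id}" \
    --ip-permissions "IpProtocol=-1,UserIdGroupPairs=[{GroupId=${peer_group_id}}]"
}

# Revoke an egress rule on `group_id` whose destination is `peer_group_id`.
revoke_egress_to_group() {
  local region="$1" group_id="$2" peer_group_id="$3"
  aws ec2 revoke-security-group-egress \
    --region "${region}" \
    --group-id "${group_id}" \
    --ip-permissions "IpProtocol=-1,UserIdGroupPairs=[{GroupId=${peer_group_id}}]"
}
```

Running both calls before deleting the cluster removes the cross-references, so CloudFormation can delete each security group independently.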

@etungsten
Contributor Author

Addresses @tjkirch's comment

Adds Sonobuoy conformance testing scripts that spin up an EKS cluster
and several Thar worker nodes, run the Sonobuoy conformance tests,
retrieve the results once the run is done, and spin down the EKS cluster.
@tjkirch tjkirch requested review from jahkeup and removed request for jahkeup December 6, 2019 18:20
Member

@jahkeup jahkeup left a comment

Looks good! Let's get going with this and revisit if needed. 👍

@etungsten etungsten merged commit b2f1355 into develop Dec 9, 2019
@etungsten etungsten deleted the sonobuoy-conformance-script branch December 9, 2019 19:27