For convenience, in addition to the manual test instructions we provide Makefile-based test automation. It automates all the manual steps referred to in Testing Operator Deployment on Kubernetes. In addition, the scorecard test from the Operator-SDK will be executed.
This is currently tested with Kubernetes in Docker (KIND) but should work on other Kubernetes distributions as well.
You need the following installed on your local machine:

- Linux or macOS host
- Docker
- make
- KIND (if no existing Kubernetes cluster is available via `KUBECONFIG` or in `~/.kube/config`)
Important: Note that this script uses a container to execute the test. Your `KUBECONFIG` will be bind-mounted into the container. Therefore no config helpers or references to files on your host machine are allowed. This is usually the case for minikube or GKE clusters.
All further dependencies are encapsulated in a container image that this `Makefile` will execute as a test driver.
The `Makefile` supports two test modes. Both support these options:
- `OP_PATH` - relative path to your operator (required)
- `OP_VER` - version of the operator (if not provided, the latest will be determined from your `package.yaml`)
- `OP_CHANNEL` - channel of the operator; if not provided, it will be parsed from the operator's package YAML or the default channel will be used
- `VERBOSE` - enable verbose output of executed subcommands
Using `operator-courier`, this test verifies your CSV and the package definition (more detail in the docs). As part of this test nothing will be changed on your system.
Example, run from the top-level directory of this repository:

```
make operator.verify OP_PATH=upstream-community-operators/cockroachdb VERBOSE=1
Pulling docker image                                       [  Processing  ]
Using default tag: latest
latest: Pulling from dmesser/operator-testing
Digest: sha256:457953575cd7bd2af60e55fb95f0413195e526c3bbe74b6de30faaf2f10a0585
Status: Image is up to date for quay.io/dmesser/operator-testing:latest
Pulling docker image                                       [  OK  ]
Lint Operator metadata                                     [  Processing  ]
WARNING: csv metadata.annotations.certified not defined. [2.0.9/cockroachdb.v2.0.9.clusterserviceversion.yaml]
WARNING: csv metadata.annotations.certified not defined. [2.1.1/cockroachdb.v2.1.1.clusterserviceversion.yaml]
Lint Operator metadata                                     [  OK  ]
```
Using the Operator Lifecycle Manager (OLM), your Operator will be packaged into a temporary catalog containing all currently published community operators plus yours. OLM will be installed for you if not present.
Using the current community catalog as a base allows you to test with dependencies on Operators currently published in this catalog. If you have dependencies outside of this catalog, you need to prepare your own cluster, install OLM and ship a catalog with these dependencies present; otherwise installation will fail.
You can either provide a Kubernetes cluster as a testbed via `KUBECONFIG` or `~/.kube/config`. If you have multiple cluster contexts configured in your `KUBECONFIG` you will be able to select one. If you have no cluster configured or reachable, the `Makefile` will install a `kind` cluster named `operator-test` for you.
For this type of test, the following additional options exist:

- `NO_KIND` - if set to `1`, no attempt to bring up a kind cluster will be made. In this case you need to specify `CATALOG_IMAGE`.
- `CATALOG_IMAGE` - when `NO_KIND` is set to `1`, you need to specify a container registry image location you have push privileges for and from which the image can be pulled again later by OLM without authentication. This parameter is ignored when `NO_KIND` is absent or set to `0`, since the catalog image can be loaded directly into a KIND cluster.
- `CLEAN_MODE` - any of `NORMAL`, `NONE` and `FORCE`. As the test installs OLM components in your Kubernetes cluster, this controls their clean-up. In `NORMAL` mode, clean-up happens if no errors occurred. When set to `NONE`, clean-up is omitted; when set to `FORCE`, clean-up is always done. Default is `NORMAL`.
- `INSTALL_MODE` - any of `OwnNamespace`, `SingleNamespace`, `AllNamespaces`. This controls the installation mode of the Operator and should be set according to what your Operator states as supported in the `installModes` section of the CSV. Default is `SingleNamespace`.
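For reference, the `installModes` section of a CSV looks roughly like the sketch below; which modes are marked `supported` depends entirely on your Operator, so the values here are illustrative only:

```yaml
# Excerpt from a ClusterServiceVersion (illustrative values)
spec:
  installModes:
  - type: OwnNamespace
    supported: true
  - type: SingleNamespace
    supported: true
  - type: MultiNamespace
    supported: false
  - type: AllNamespaces
    supported: false
```

Pick an `INSTALL_MODE` value that your CSV marks as `supported: true`, otherwise OLM will refuse the installation.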
You can start by just deploying your Operator:

```
make operator.install OP_PATH=upstream-community-operators/cockroachdb
Pulling docker image                                       [  Processing  ]
Pulling docker image                                       [  OK  ]
Find kube config                                           [  /home/dmesser/.kube/config  ]
Find kube cluster                                          [  Not found  ]
Start KIND                                                 [  Processing  ]
Start KIND                                                 [  OK  ]
Building catalog image                                     [  Processing  ]
Building catalog image                                     [  OK  ]
Operator version detected                                  [  1.7.2  ]
Creating namespace                                         [  Processing  ]
Creating namespace                                         [  OK  ]
Verify operator                                            [  Processing  ]
Verify operator                                            [  OK  ]
Install OLM                                                [  Processing  ]
Install OLM                                                [  OK  ]
Building manifests                                         [  Processing  ]
Building manifests                                         [  OK  ]
Operator Deployment                                        [  Processing  ]
Applying object to cluster                                 [  Processing  ]
Applying object to cluster                                 [  OK  ]
Checking if subscriptions passes                           [  Processing  ]
Checking if subscriptions passes                           [  OK  ]
Checking if CSV passes                                     [  Processing  ]
Checking if CSV passes                                     [  OK  ]
Operator Deployment                                        [  OK  ]
```
This way you can test if your Operator is packaged correctly.
You can also run a test that deploys your Operator and checks if it behaves correctly according to `scorecard` (which is part of the Operator-SDK). `scorecard` will use the example CRs defined in `metadata.annotations.alm-examples` in the CSV to exercise your Operator and observe its behavior.
Example, run from the top-level directory of this repository:
[...]
make operator.test OP_PATH=upstream-community-operators/cockroachdb
[...]
Instrumenting Operator for test [ Processing ]
creating CR files [ Processing ]
creating CR files [ OK ]
injecting scorecard proxy [ Processing ]
injecting scorecard proxy [ OK ]
Instrumenting Operator for test [ OK ]
Running scorecard trough all supplied CRs [ Processing ]
Running required tests [ Processing ]
Running required tests [ OK ]
Running recommended tests [ Processing ]
Running recommended tests [ OK ]
Running required tests [ Processing ]
Running required tests [ OK ]
Running recommended tests [ Processing ]
Running recommended tests [ OK ]
Running scorecard trough all supplied CRs [ OK ]
Cleaning up Operator resources [ Processing ]
Cleaning up Operator resources [ OK ]
Cleaning up Operator definition [ Processing ]
Cleaning up Operator definition [ OK ]
Cleaning up namespace [ Processing ]
Cleaning up namespace [ OK ]
Here are some common scenarios why your test can fail:
```
ERROR: metadata.annotations.alm-examples contains invalid json string [1.4.4/my-operator.v1.4.4.clusterserviceversion.yaml]
```
Explanation: The linter checks for valid JSON in `metadata.annotations.alm-examples`. The rest of the CSV is supposed to be YAML.
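For illustration, a minimal valid annotation might look like the sketch below; the API group, kind and field names are hypothetical placeholders, only the shape matters. The annotation's value is a YAML string, but that string itself must parse as JSON:

```yaml
# CSV excerpt: the value of alm-examples must be a valid JSON array
metadata:
  annotations:
    alm-examples: |-
      [
        {
          "apiVersion": "charts.example.com/v1alpha1",
          "kind": "MyApp",
          "metadata": { "name": "example-myapp" },
          "spec": { "size": 1 }
        }
      ]
```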
```
my-operator.v2.1.11 specifies replacement that couldn't be found
```
Explanation: This happens because the catalog cannot load your Operator, since it points to a non-existing previous version of your Operator via `spec.replaces`. For updates, it is important that this property points to another, older version of your Operator that is already in the catalog.
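A sketch of a correct replacement chain, with hypothetical version names — the value of `spec.replaces` must match the `metadata.name` of a CSV already in the catalog:

```yaml
# my-operator.v2.1.11.clusterserviceversion.yaml (excerpt, names illustrative)
metadata:
  name: my-operator.v2.1.11
spec:
  version: 2.1.11
  replaces: my-operator.v2.1.1   # an older CSV that already exists in the catalog
```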
```
error adding operator bundle : error decoding CRD: no kind \"CustomResourceDefinition\" is registered for version \"apiextensions.k8s.io/v1\" in scheme \"pkg/registry/bundle.go
```
Explanation: Currently OLM does not yet support CRDs using `apiextensions.k8s.io/v1`. This will improve soon. Until then you need to fall back to `apiextensions.k8s.io/v1beta1`.
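A minimal sketch of a CRD declared with the older API group version that OLM currently accepts; the group, kind and plural names below are hypothetical:

```yaml
# CRD using apiextensions.k8s.io/v1beta1 (illustrative names)
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: myapps.charts.example.com
spec:
  group: charts.example.com
  names:
    kind: MyApp
    listKind: MyAppList
    plural: myapps
    singular: myapp
  scope: Namespaced
  version: v1alpha1
```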
```
error loading manifests from directory: error checking provided apis in bundle : couldn't find charts.someapi.k8s.io/v1alpha1/myapi (my-custom-resource) in bundle. found: map[]
```
Explanation: Your Operator claims ownership of a CRD that it does not ship. Check the spelling of Group/Version/Kind in `spec.customresourcedefinitions.owned` in the CSV.
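A sketch of an `owned` entry, with hypothetical names — each entry must match a CRD that ships in the same bundle, with `name` being the CRD's `<plural>.<group>`:

```yaml
# CSV excerpt: owned entries must match a shipped CRD exactly
spec:
  customresourcedefinitions:
    owned:
    - name: myapps.charts.example.com   # <plural>.<group> of the shipped CRD
      kind: MyApp
      version: v1alpha1
      displayName: MyApp
      description: An example application managed by this Operator
```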
```
error loading package into db: [FOREIGN KEY constraint failed, no default channel specified for my-operator]
```
Explanation: This happens either when your Operator package defines more than one channel in `package.yaml` but does not define `defaultChannel`, or when the package defines just a single channel (in which case you can omit `defaultChannel`) but the catalog couldn't load the CSV that this channel points to via `currentCSV`. The latter can happen when the name specified in the CSV's `metadata.name` differs from what `currentCSV` points to.
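A sketch of a multi-channel `package.yaml` with hypothetical names — with more than one channel, `defaultChannel` is required, and each `currentCSV` must match the `metadata.name` of a CSV in the package:

```yaml
# package.yaml (illustrative): two channels, so defaultChannel is mandatory
packageName: my-operator
defaultChannel: stable
channels:
- name: stable
  currentCSV: my-operator.v2.1.11
- name: beta
  currentCSV: my-operator.v2.2.0-beta1
```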
`Check if subscription passes` times out
Explanation: In this case the `Subscription` object created by the test suite did not transition to the state `AtLatestKnown` before hitting a timeout. There are various reasons for this, ranging from the catalog pod crashing to problems with the `catalog-operator` pod of OLM itself. In any case, the logs of either pod will likely help with troubleshooting and finding the root cause.
`Check if CSV passes` times out
Explanation: OLM could not install the Operator's `Deployment` from its CSV before hitting a timeout. This is usually due to the `Deployment` not reaching its expected replica count, likely because the pod is crash-looping.
```
failed to get proxyPod: timed out waiting for the condition
```
Explanation: This likely happened because the Operator pod crashed in the middle of the scorecard test suite, for example when it failed to parse one of the Custom Resources that scorecard feeds it from the list in `metadata.annotations.alm-examples`. OLM will wait for the `Deployment` of the Operator to recover before re-installing the Operator. Re-installation changes the Operator pod's name, and hence scorecard fails to reach the logs of the scorecard proxy using its old name.
```
failed to create cr resource: object is being deleted: someapi.k8s.io "myCRD" already exists
```
Explanation: This can happen when your Operator automatically creates a CR on startup with the same name as an example for that CR provided in the CSV's `metadata.annotations.alm-examples` section. Simply use a different name in the example. Otherwise, your Operator could be slow to delete a CR due to a finalizer.
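A sketch of the renaming fix, with hypothetical names: give the example CR in `alm-examples` a name that your Operator will never create on its own:

```yaml
# CSV excerpt: "example-mycrd" is a name chosen to avoid colliding with any
# CR the Operator itself creates on startup
metadata:
  annotations:
    alm-examples: |-
      [
        {
          "apiVersion": "someapi.k8s.io/v1alpha1",
          "kind": "MyCRD",
          "metadata": { "name": "example-mycrd" }
        }
      ]
```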
As explained above (`CLEAN_MODE`), by default, if all tests run correctly, anything that got installed on your cluster as part of the test will be deleted. If something fails, the deployed resources will not be deleted, in order to give you a chance to debug.
After you have finished debugging, you can use the following command to clean up any residual resources from a test of a particular Operator:
```
make operator.cleanup OP_PATH=upstream-community-operators/cockroachdb
```
Install a `kind` cluster as a testbed for the Operator deployment:

```
$ kind create cluster --name operator-test
```

This command will create a Kubernetes in Docker cluster:

```
$ kind get clusters
operator-test

$ kind get nodes --name operator-test
operator-test-control-plane
```
Install OLM to an existing cluster (determined via `KUBECONFIG` or `~/.kube/config`):

```
make olm.install
```