CodeJitsu42

This PR introduces support for multiple platforms.

It includes a Dockerfile and instructions detailing how users can utilize docker build and docker buildx to create multi-architecture images.
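
For readers unfamiliar with buildx, here is a minimal sketch of what a multi-architecture build might look like. The image tag and platform list are assumptions chosen for illustration, and `build_multiarch` is a hypothetical helper, not something shipped in this PR. Passing `echo` as the first argument gives a dry run that only prints the command.

```shell
#!/usr/bin/env bash
# Illustrative defaults; neither value is prescribed by this PR.
IMAGE="${IMAGE:-xdcr-differ:1.0.0}"
PLATFORMS="${PLATFORMS:-linux/amd64,linux/arm64}"

# Hypothetical helper: builds the image for every platform in PLATFORMS.
# Pass "echo" as the first argument to print the command instead of running it.
build_multiarch() {
  $1 docker buildx build --platform "$PLATFORMS" -t "$IMAGE" .
}
```

Usage: `build_multiarch echo` prints the command, and `build_multiarch` runs it. Depending on the builder in use, a multi-platform result may need `--push` to a registry, or a single `--platform` with `--load`, before the image is available locally.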

I suggest the possibility of automating the image-building process by integrating it into a job triggered whenever there are code changes in this GitHub repository.

Multi-architecture Dockerfile.
Building with docker and docker buildx for multiple platform support.
@nelio2k
Member

nelio2k commented May 21, 2024

I tried building the image and running the container locally. However, I think more work is needed... unless I'm doing something wrong?

docker run xdcr-differ:1.0.0 -u Administrator -p password -h 127.0.0.1:9000 -s B1 -t B2 -r C2
Exporting http://Administrator:password@127.0.0.1:9000
/runDiffer.sh: line 45: ps: command not found

Perhaps runDiffer.sh needs to be modified to check for .dockerenv and use docker top instead in various places?
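
A rough sketch of what such a check could look like, assuming the script only needs to know whether it is inside a Docker container and whether `ps` is available. The function names `in_docker` and `have_cmd` are made up for this example and do not exist in runDiffer.sh. Note that `/.dockerenv` is created by Docker, but other container runtimes may not create it, and that `docker top` is a host-side command, so it would not normally be callable from inside the container.

```shell
#!/usr/bin/env bash
# Returns 0 when the marker file exists.
# The path is a parameter (defaulting to /.dockerenv) so the check can be tested.
in_docker() {
  [ -f "${1:-/.dockerenv}" ]
}

# Returns 0 when the named command can be found on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}
```

Usage: `if in_docker && ! have_cmd ps; then echo "ps is missing in this container" >&2; fi`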

@nelio2k
Member

nelio2k commented May 21, 2024

Another question I have: if a Docker container runs the differ and writes files to, say, mutationDiffDetails, how is the user supposed to view them once the container exits and is cleaned up? Is there a way to automate "copying out" the diff details?

How does it work right now in a production environment?

@Aditya-Sood
Contributor

Perhaps runDiffer.sh needs to be modified to check for .dockerenv and use docker top instead in various places?

I think this should be handled in the Dockerfile itself, by using an appropriate base image and adding steps to install the necessary packages, so that the script can run unchanged.

@CodeJitsu42 could you elaborate on the tests/runs you've performed with this Dockerfile? (and also whether they cover Neil's query on viewing the actual diff result)

@Aditya-Sood
Contributor

@CodeJitsu42 I'm also hitting the same "command not found" error when trying to run the image, so you need to either add a step that installs the package using yum, or choose a different final base image than redhat/ubi9-minimal:9.4
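
As a sketch of the install step, assuming a RHEL-family base image where `ps` is shipped in the `procps-ng` package: the helper below picks whichever package manager is present (minimal UBI images ship `microdnf` rather than `yum`). The function names are hypothetical, and in practice this would likely be a single `RUN` line in the Dockerfile.

```shell
#!/usr/bin/env bash
# Print the first package manager found on PATH; return 1 if none is present.
pick_pkg_manager() {
  for pm in microdnf dnf yum; do
    if command -v "$pm" >/dev/null 2>&1; then
      echo "$pm"
      return 0
    fi
  done
  return 1
}

# Install procps-ng only when ps is missing from the image.
ensure_ps() {
  command -v ps >/dev/null 2>&1 && return 0
  pm=$(pick_pkg_manager) || { echo "no supported package manager found" >&2; return 1; }
  "$pm" install -y procps-ng
}
```

Usage: call `ensure_ps` during the image build, so that runDiffer.sh can keep using `ps` unchanged.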

Additionally, you can add a .dockerignore file with an entry for go.mod, since it's created during the local setup of the differ.

Adding a dockerignore file
Using ubi9 instead of ubi9-minimal
@CodeJitsu42
Author

Hello @Aditya-Sood / @nelio2k

I added a .dockerignore file and changed the final image to redhat/ubi9:9.4.

To see the reports, for now we would suggest adding a "--stdout" parameter to runDiffer.sh that cats the reports to stdout after a successful run.
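
A minimal sketch of what the "--stdout" behaviour could do, assuming the reports are plain files under a single results directory. `print_reports` is a hypothetical helper, not an existing function in runDiffer.sh, and the directory to pass in is left to the caller.

```shell
#!/usr/bin/env bash
# Print every regular file under a results directory, each preceded by a header line.
# Returns 1 if the directory does not exist.
print_reports() {
  dir=$1
  [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
  # Sort so the output order is stable between runs.
  find "$dir" -type f | sort | while IFS= read -r f; do
    echo "==== $f ===="
    cat "$f"
  done
}
```

Usage: `print_reports mutationDiffDetails` at the end of a successful run, so the output ends up in `docker logs` even after the container exits.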

In a real scenario, once we have the image, we will use it in a k8s Helm chart and have volume mounts for the reports and other things.
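
Outside of k8s, the same idea can be tried locally with a bind mount. This is only a sketch: `CONTAINER_OUT` is a hypothetical path, since the directory the differ writes to inside the image depends on the Dockerfile's working directory, and `run_with_mount` is a made-up helper. Setting `DRY=echo` prints the command instead of running it.

```shell
#!/usr/bin/env bash
IMAGE="${IMAGE:-xdcr-differ:1.0.0}"
# Hypothetical in-container output path; adjust to wherever the differ writes its results.
CONTAINER_OUT="${CONTAINER_OUT:-/differ/mutationDiffDetails}"

# Run the differ with a host directory mounted over the output directory,
# so the reports survive after the container is removed.
# First argument: host directory; remaining arguments are passed to the differ.
run_with_mount() {
  host_dir=$1
  shift
  $DRY docker run --rm -v "${host_dir}:${CONTAINER_OUT}" "$IMAGE" "$@"
}
```

Usage: `DRY=echo run_with_mount "$PWD/out" -s B1 -t B2 -r C2` shows the command; unset `DRY` to actually run it.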

@CodeJitsu42
Author

@Aditya-Sood @nelio2k
Hello, have you had time to review and merge the above?
Let me know if you require additional information.

@nelio2k
Member

nelio2k commented Mar 7, 2025

We have been preoccupied with other work, but can and will re-review soon. Thanks.

@nelio2k
Member

nelio2k commented Mar 7, 2025

Thanks for the Dockerfile setup and modification, @CodeJitsu42 .

I had to modify the Dockerfile and runDiffer.sh slightly, both to make the build work (the latest crypto library requires golang 1.23) and because the ps command isn't portable enough to be executed within a docker container.

The diff is shown here:

diff --git a/Dockerfile b/Dockerfile
index fe55d7f..f118a89 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -1,4 +1,4 @@
-ARG BUILDER_IMAGE=docker.io/golang:1.22
+ARG BUILDER_IMAGE=docker.io/golang:1.23
 ARG FINAL_IMAGE=docker.io/redhat/ubi9:9.4
 ARG http_proxy
 ARG https_proxy
diff --git a/runDiffer.sh b/runDiffer.sh
index 452d37c..edf7409 100755
--- a/runDiffer.sh
+++ b/runDiffer.sh
@@ -41,14 +41,18 @@ EOF
 }

 function waitForBgJobs {
-       local mainPid=$1
-       local mainPidCnt=$(ps -ef | grep -v grep | grep -c $mainPid)
-       local jobsCnt=$(jobs -l | grep -c "Running")
-       while (((($jobsCnt > 0)) && (($mainPidCnt > 0)))); do
-               sleep 1
-               jobsCnt=$(jobs -l | grep -c "Running")
-               mainPidCnt=$(ps -ef | grep -v grep | grep -c $mainPid)
-       done
+       local pid=$1
+    # Check if the PID is provided
+    if [[ -z "$pid" ]]; then
+        echo "Usage: wait_for_pid <pid>"
+        return 1
+    fi
+
+    # Loop until the process no longer exists
+    while kill -0 "$pid" 2>/dev/null; do
+        echo "Waiting for PID $pid to terminate..."
+        sleep 20
+    done
 }

 function killBgTail {
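
To illustrate the `kill -0` approach on its own: `kill -0` sends no signal and only reports whether the PID can be signalled, so it works without `ps`. The sketch below is a standalone variant of the loop in the diff, with an upper bound on the number of polls added as an assumption of mine so the wait cannot spin forever; the extra parameters and return codes are not part of the change above.

```shell
#!/usr/bin/env bash
# Poll until the given PID no longer exists, or until max_tries polls have elapsed.
# Returns 0 when the process is gone, 1 on bad usage, 2 on timeout.
wait_for_pid() {
  pid=$1
  interval=${2:-1}
  max_tries=${3:-60}
  if [ -z "$pid" ]; then
    echo "Usage: wait_for_pid <pid> [interval_seconds] [max_tries]" >&2
    return 1
  fi
  tries=0
  # kill -0 delivers no signal; it only checks that the PID exists and is signallable.
  while kill -0 "$pid" 2>/dev/null; do
    tries=$((tries + 1))
    [ "$tries" -ge "$max_tries" ] && return 2
    sleep "$interval"
  done
  return 0
}
```

Usage: `wait_for_pid "$mainPid" 20 180` would poll every 20 seconds for up to an hour. One caveat: `kill -0` also succeeds for a PID that has been reused by an unrelated process, so this is best suited to short-lived waits on a known child.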

The run command also needs to be executed with --network host, I believe...

docker run --network host xdcr-differ:1.0.0 -u Administrator -p wewewe -h 192.168.56.2:8091 -s B1 -t B2 -r C2

However, I'm seeing network refused error... whereas if I run from the host natively, it is fine:
r$ ./runDiffer.sh -u Administrator -p wewewe -h 192.168.56.2:8091 -s B1 -t B2 -r C2 <- this is fine

I know that the Docker container is reaching out to the Couchbase Server, because if I specify an external IP address, it fails:

docker run --network host xdcr-differ:1.0.0 -u Administrator -p wewewe -h 127.0.0.1:15000 -s B1 -t B2 -r C2
...
2025-03-07T22:35:16.393Z WARN GOXDCR.MetadataSvc: metakv.ListAllChildren failed. path=/remoteCluster/, err=Get "http://127.0.0.1:15000/_metakv/remoteCluster/": Unable to find given hostport in cbauth database: `127.0.0.1:15000'

^ The above message shows that the Docker container should use the host network and is able to reach a Couchbase Server, but it fails when binding.

I suspect this has to do with the limitation shown on the docker host network guide:
https://docs.docker.com/engine/network/drivers/host/

Processes inside the container cannot bind to the IP addresses of the host because the container has no direct access to the interfaces of the host.

But I'm not 100% sure.

In any case, I think this POC shows that dockerization is a good idea and we're on our way there, but it requires some more thought and investigation. This PR is a good start, but it's probably more nuanced than just a Dockerfile.

I have filed https://jira.issues.couchbase.com/browse/MB-65665 to track this whole dockerization effort. More investigation effort will be detailed there as we march towards adding this functionality.

@CodeJitsu42
Author

Hi Nelio,

Thanks for finding the time to update on this PR.

I have created a new branch from main and opened a new PR to main, which can be found here:
#118

In this PR:

  • The Dockerfile was enhanced.
  • A new Pod CR was created.
  • The script was updated.
  • The entire setup was tested locally with success.

Please find attached the logs from xdcr-differ.
xdcr.output.log

Let me know if you have any questions.

CodeJ
