structure testing docker volumes are convoluted and potentially problematic #147
Description
I think we might need a bit of a redesign here, couldn't solve it quickly, so I'll document the problems I see.
Currently in our build we have the following participants:
- the docker-enabled host (in case of GCCB it's a GCE instance, in case of container-builder-local it's my workstation) starts the build steps
- mvn step is the first (and only) step of our build currently, that spins up the "step-container"
- that kicks off the maven process, which has an execution phase that runs the
structure_test.sh
script structure_test.sh
downloadsext_run.sh
- which was seemingly designed to run from a docker-enabled host directly, not a container - and runs itext_run.sh
spins up a "sibling-container" (as it shares now the daemon with the "step-container") from the built openjdk image
Volumes and directories:
GCCB:
- host:
/workspace
contains the project directory (passed in) - step-container:
/workspace
is a volume mounted from the/workspace
directory from host (NOTE: in the future this might change to a separate, isolated volume, that is the clone of the /workspace directory so that the steps don't mutate the host filesystem!!) - sibling-container:
/cfg
- that is mounted with-v /workspace/openjdk8/target/classes/.cfg:/cfg
in the case of openjdk8/workspace
- that is coming from the--workspace
argument ofext_run.sh
:--workspace /workspace/openjdk8/target/test-classes/openjdk-test-common-out/workspace
for example
The problematic part is that both sibling-container volumes work currently because of the coincidental design overlap - host \workspace
is directly mounted onto \workspace
in the step-container. As soon as this symmetry breaks we have issues.
The reason is that the "sibling-container" is talking to the docker daemon on the host and as such resolves directory based docker volumes from the host.
To see where this reliance already causes problem, let's have a look at the same directories in the container-builder-local case:
- host:
/home/balopat/dev/proj/openjdk-runtime
contains the project directory (passed in) - step-container:
/workspace
is a cloned volume mounted from the/home/balopat/dev/proj/openjdk-runtime
directory (NOTE: this is intentionally designed like this and it might be the future on GCE too!) - sibling-container:
/cfg
- that is mounted with-v /workspace/openjdk8/target/classes/.cfg:/cfg
in the case of openjdk8 ==> ERROR, the directory is empty/workspace
- that is coming from the--workspace
argument ofext_run.sh
:--workspace /workspace/openjdk8/target/test-classes/openjdk-test-common-out/workspace
for example ==> ERROR, the directory is empty
Structure tests fail with this setup.
The problem:
- our current setup doesn't work directly with container-builder-local
- in the future, if/when GCCB changes the way /workspace is mounted on steps (e.g. an isolated, cloned volume), then we'll fail on GCCB as well.
Not exactly sure about the best way to go. Couple of options:
- we can modify
ext_run.sh
to prepare it for running from within a container and pass in explicitly the right volumes - we can potentially completely rewire the execution flow and rely more on GCCB/GCCB-local for each step instead of wrapping the whole flow into maven - this way we could use the runtimes-common structure tests as they were designed to be used - as a step in GCCB