ci3 #10775

Open
4 of 32 tasks
ludamad opened this issue Dec 16, 2024 · 0 comments

ludamad commented Dec 16, 2024

Full CI3:

  • Regenerate inputs in real time for the verify_honk_proof acir tests.
  • Make ci image actually just be the dev container image, with all dev tools e.g. zsh, htop etc.
  • Rebuild and publish build-image on change. Must require the version number to have changed (i.e. protect against overwriting the existing image).
  • Figure out how to rebuild and change the build-instance AMI in AWS to ensure we don't always need to pull the image (maybe too much effort, but need to consider the data transfer costs. ECR?).
  • Bake CRS into AMI and mount it into container so we don't have to download about 2GB...
  • Docs build.
  • BB gcc build? Or on masterly? Can we use -fsyntax-only?
  • Improve our rebuild patterns. Support e.g. inversions !some/script/that/should/not/trigger
  • Enable "merge queues" and have it follow a different "masterly" workflow.
  • Create the final slim release container, from the s3 cache artefacts, on masterly.
  • Run benchmarks on masterly.
  • Deploy if the merge was a release.
  • Rebuild sysbox to ensure it reflects toolchain changes (e.g. move to global yarn 4).
  • Make obscure errors more explicit (e.g. jq error on vk key gen)
  • Delete every Dockerfile, Earthfile, bash script, and Yaml file, that isn't needed, or can be replaced with something simpler.
  • Make a fast failure with no local changes print a large announcement to alert the CI team.
  • Contract tests and aztec-nr tests (txe means yarn-project circular).
  • Fix commented noir js tests (bb circular).
  • Make bb/.js crs download only download what's needed to extend an existing CRS. This makes the process idempotent, so we don't need to worry about process races.
  • test-all: Skipping tests matching test_caches_open|requests in noir tests.
  • test-all: kv-store excluded cos mocha.
  • test-all: noir js packages tests?
  • test-all: Standardise how tests are skipped so we can see in one place.
  • Maintain a fleet of spots that will live for up to e.g. 1 hour idle. Any spots request will come from the fleet if available. Will give faster startup times.
  • Put time -v / ulimit on tests and alert/kill on mem usage.
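
The "inversions" item above could be sketched roughly as follows. This is only an illustration of the idea, not the existing ci3 implementation: the function name `triggers_rebuild`, the use of extended regular expressions, and the example paths are all assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decide whether a changed file should trigger a rebuild.
# Usage: triggers_rebuild <changed-file> <pattern>...
# Patterns are extended regular expressions; a leading '!' marks an inversion.
triggers_rebuild() {
  local file=$1
  shift
  local pattern matched=1
  for pattern in "$@"; do
    if [[ $pattern == '!'* ]]; then
      # An inverted pattern vetoes the file outright.
      if [[ $file =~ ${pattern:1} ]]; then
        return 1
      fi
    elif [[ $file =~ $pattern ]]; then
      matched=0
    fi
  done
  return "$matched"
}

# Example with made-up paths: sources trigger a rebuild, docs do not.
patterns=('^some/project/' '!^some/project/docs/')
triggers_rebuild some/project/src/main.cpp "${patterns[@]}" && echo "rebuild"
triggers_rebuild some/project/docs/intro.md "${patterns[@]}" || echo "skip"
```

In this sketch an inverted pattern always wins, regardless of where it appears in the list. Whether order should matter (as in .gitignore) is a design choice left open here.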

Disabled flakes:

charlielye added a commit that referenced this issue Dec 17, 2024
CI3 is a conceptual goal for uniting the CI flow and the dev flow as
much as possible, adding more depth to the bootstrap and build scripts
to be able to handle our needs.

This PR introduces all the work on CI3 so far, but still has an earthly
caller shell to make sure we can minimize the number of variables that
have changed at once.

There are a lot of changes in this PR.
See https://github.com/AztecProtocol/aztec-packages/pull/10711/files for
a subset of the changes without yarn.lock etc noise.

The big picture:
- The CI build has been made much less stateful. ci.yml now uses the ci3
bootstrap pattern, without fully moving off the earthly targets just
yet.
- The S3 cache mechanism is now the main cache mechanism. Note there is
no persistent disk now supporting the build.
There is a global cache on S3, readable without auth, that keeps build
artifacts for 10 days. We no longer think of the build in terms of
docker/buildkit layers but instead as chunks that have different rebuild
patterns that match files in the monorepo.
- Moving to yarn 4.5.2. 
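
To illustrate the "chunks with rebuild patterns" idea from the S3 cache bullet above: a cache key can be derived from the contents of exactly those files that match a chunk's patterns, so unrelated changes leave the key untouched. The sketch below is an assumption about how such a key could be computed, not the actual ci3 script; `content_hash` is a hypothetical name, and it relies on `sha256sum` from GNU coreutils.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive a cache key from the contents of every file under
# <root> whose relative path matches at least one of the given patterns.
# Usage: content_hash <root> <pattern>...   (at least one pattern is required)
content_hash() {
  local root=$1
  shift
  local pattern args=()
  for pattern in "$@"; do
    args+=(-e "$pattern")
  done
  (
    cd "$root" || exit 1
    # List files, keep matching paths, sort for a stable order, hash each file,
    # then hash the resulting list of "<hash>  <path>" lines into one key.
    find . -type f | sed 's|^\./||' | grep -E "${args[@]}" | LC_ALL=C sort |
      while IFS= read -r path; do sha256sum -- "$path"; done |
      sha256sum | cut -d' ' -f1
  )
}
```

A build step could then look up an artifact named after `$(content_hash . '^some/project/')` in the cache and only rebuild on a miss; the pattern here is made up for the example.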

Niceties:
- faster builds due to script improvements and distributed cache
uploading by default
- work is more properly isolated in chunks from the above effort
- spot recovery is implemented, retrying with on-demand
- we no longer use github runners, side-stepping lots of edge-cases, and
instead rely on our builder realizing there is no work to do / hitting a
timeout via shutdown -P
- Docker images are no longer copied from the builder, meaning a large
class of flake is gone.
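
The spot recovery bullet above amounts to a try-then-fall-back pattern. A minimal sketch, assuming two placeholder launchers `run_on_spot` and `run_on_demand` (both hypothetical names, stubbed out here so the example runs on its own):

```shell
#!/usr/bin/env bash
# Placeholder launchers, invented for this sketch. A real setup would start
# the job on a spot or an on-demand instance respectively.
run_on_spot() { return 1; }               # pretend the spot instance was reclaimed
run_on_demand() { echo "on-demand: $*"; }

# Hypothetical wrapper: try the job on a spot instance first and retry it on
# an on-demand instance if the spot attempt fails.
run_with_fallback() {
  if run_on_spot "$@"; then
    return 0
  fi
  echo "spot attempt failed, retrying on-demand" >&2
  run_on_demand "$@"
}

run_with_fallback ./bootstrap.sh ci
```

This sketch retries on any failure. A real implementation would likely want to tell a reclaimed spot instance apart from a genuine build failure so that broken builds are not run twice.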

Non-niceties:
- The earthly setup is much less granular. There are two stages that have
their own one-layer builds. The earthly cache is fairly redundant, since
the S3 cache does most of the meaningful caching. (earthly will not be
used in ci.yml in the future)
- Some CI files are now duplicated, we will do a follow-on pass to get
rid of earthly helpers, build-system, etc
- CI currently also downloads the CI image fresh each time, will change
- we are currently pushing images to dockerhub with no expiration,
should move to ECR
- noir-projects currently retries once in the Earthfile as a last minute
issue was hit, will be fixed in a follow-up
- Issues: #10775

---- 
WORKFLOW AFTER THIS PR:
- Run ./bootstrap.sh in root to bootstrap with cache, ./bootstrap.sh
full otherwise
- Run earthly +ci in root to 
- Put ci3 in your cache and note the ways to interact with ci in that
folder
- Note the new commands in ./bootstrap.sh like test-kind-network
- In yarn-project to run a single e2e test now use `test:e2e`. `test`
just runs the unit tests as per other projects.

---------

Co-authored-by: MirandaWood <miranda@aztecprotocol.com>
Co-authored-by: Charlie Lye <karl.lye@gmail.com>
Co-authored-by: Tom French <15848336+TomAFrench@users.noreply.github.com>
ludamad added a commit that referenced this issue Dec 19, 2024
CI3 is a conceptual goal for uniting the CI flow and the dev flow as
much as possible, adding more depth to the bootstrap and build scripts
to be able to handle our needs.

This PR introduces all the work on CI3 so far, but still has an earthly
caller shell to make sure we can minimize the number of variables that
have changed at once.

There are a lot of changes in this PR.
See https://github.com/AztecProtocol/aztec-packages/pull/10711/files for
a subset of the changes without yarn.lock etc noise.

The big picture:
- The CI build has been made much less stateful. ci.yml now uses the ci3
bootstrap pattern, without fully moving off the earthly targets just
yet.
- The S3 cache mechanism is now the main cache mechanism. Note there is
no persistent disk now supporting the build.
There is a global cache on S3, readable without auth, that keeps build
artifacts for 10 days. We no longer think of the build in terms of
docker/buildkit layers but instead as chunks that have different rebuild
patterns that match files in the monorepo.
- Moving to yarn 4.5.2. 

Niceties:
- faster builds due to script improvements and distributed cache
uploading by default
- work is more properly isolated in chunks from the above effort
- spot recovery is implemented, retrying with on-demand
- we no longer use github runners, side-stepping lots of edge-cases, and
instead rely on our builder realizing there is no work to do / hitting a
timeout via shutdown -P
- Docker images are no longer copied from the builder, meaning a large
class of flake is gone.

Non-niceties:
- The earthly setup is much less granular. There are two stages that have
their own one-layer builds. The earthly cache is fairly redundant, since
the S3 cache does most of the meaningful caching. (earthly will not be
used in ci.yml in the future)
- Some CI files are now duplicated, we will do a follow-on pass to get
rid of earthly helpers, build-system, etc
- CI currently also downloads the CI image fresh each time, will change
- we are currently pushing images to dockerhub with no expiration,
should move to ECR
- noir-projects currently retries once in the Earthfile as a last minute
issue was hit, will be fixed in a follow-up
- Issues: #10775

---- 
WORKFLOW AFTER THIS PR:
- Run ./bootstrap.sh in root to bootstrap using the (publicly available)
S3 cache, or ./bootstrap.sh full to force a full build
- Run ./bootstrap.sh ci to test the in-progress 'full CI3' locally
- Run ./ci.sh ec2 to test the in-progress 'full CI3' on an isolated
runner
- For ci2.5, use earthly +ci in root to simulate ci.yml. This shares the
S3 cache. Make sure to alias earthly to scripts/earthly_local.
- Put ci3 in your cache and note the ways to interact with ci in that
folder
- Recommended workflow, as commits are now needed to run earthly or
bootstrap_ec2:
```
# in repo root
./ci.sh draft && git commit -am "blobs work" && git push && earthly +ci
```
Another useful ci.sh command is gha-url, which shows the last github job
associated with your branch.
- Notable commands in ./bootstrap.sh are test-kind-network, test-e2e,
images-e2e
- In yarn-project to run a single e2e test now use `test:e2e`. `test`
just runs the unit tests as per other projects.

---------

Co-authored-by: ludamad <adam.domurad@gmail.com>
Co-authored-by: MirandaWood <miranda@aztecprotocol.com>
Co-authored-by: Tom French <15848336+TomAFrench@users.noreply.github.com>
Co-authored-by: ludamad <domuradical@gmail.com>
AztecBot pushed a commit to AztecProtocol/barretenberg that referenced this issue Dec 20, 2024