NDK versions are being removed without grace period #10599

Closed
2 of 13 tasks
thomaseizinger opened this issue Sep 11, 2024 · 12 comments

Comments

@thomaseizinger

Description

We build an Android app against the NDK version installed on the GitHub runners. During updates to the runner images, the NDK appears to get bumped and the previous default is no longer available, causing builds to fail.

We cannot control which image version gets used, so our CI is effectively blocked.

Platforms affected

  • Azure DevOps
  • GitHub Actions - Standard Runners
  • GitHub Actions - Larger Runners

Runner images affected

  • Ubuntu 20.04
  • Ubuntu 22.04
  • Ubuntu 24.04
  • macOS 12
  • macOS 13
  • macOS 13 Arm64
  • macOS 14
  • macOS 14 Arm64
  • Windows Server 2019
  • Windows Server 2022

Image version and build link

https://github.com/firezone/firezone/actions/runs/10804517190/job/29970095787?pr=6564

Is it regression?

Yes, the latest non-prerelease doesn't have the issue.

Expected behavior

The previous default NDK version should not be removed without a grace period.

Actual behavior

The default NDK version changes and the old one is no longer available.

Repro steps

  1. Create an Android app and build against the current NDK version.
  2. GitHub decides to randomly(?) use "pre-release" images for some CI runs, failing the pipeline.

@hemanthmanga
Contributor

Hi @thomaseizinger, thank you for bringing this issue to us. We are looking into it and will update you after investigating.

@kishorekumar-anchala
Contributor

Hi @thomaseizinger ,

We posted an announcement last month about the removal of old NDK versions; please see that announcement. Thank you!

@thomaseizinger
Author

thomaseizinger commented Sep 12, 2024

That is not the issue. We were already building against NDK 27.0.12077973.

The problem is that we can only ever build against one NDK version and the latest image update removed NDK version 27.0.12077973 and instead installed 27.1.12297006.

This update appears to have been rolled out incrementally because we saw some CI builds failing and some passing.

This is the fix we had to make: firezone/firezone#6662. But initially, this PR also didn't pass CI reliably because the update was not yet rolled out to all runners.

We don't have any ability to specify which runner image version we get, which essentially leaves us in a broken state: we can only rerun failed CI builds in the hope that we get an old image version with the previous NDK, and eventually merge the PR where we use the newer NDK.

Instead of removing the old NDK, can you first add the new NDK to an image release? That would give us time to migrate to the new version. Subsequently, you can then remove the previous version without breaking CI.

@kishorekumar-anchala
Contributor

Hi @thomaseizinger,

> That is not the issue. We were already building against NDK 27.0.12077973.

> The problem is that we can only ever build against one NDK version and the latest image update removed NDK version 27.0.12077973 and instead installed 27.1.12297006.

Yes, with every rollout, the new version is fetched automatically if one exists.

> Instead of removing the old NDK, can you first add the new NDK to an image release? That would give us time to migrate to the new version. Subsequently, you can then remove the previous version without breaking CI.

Yes, before deleting any major versions, we will post an announcement, as mentioned above.

For minor and hotfix versions, the script automatically fetches the available versions; if a new version is available, it ships with the next rollout.

Thank you! We hope your build succeeds with the latest release; please confirm.

@thomaseizinger
Author

> > Instead of removing the old NDK, can you first add the new NDK to an image release? That would give us time to migrate to the new version. Subsequently, you can then remove the previous version without breaking CI.

> Yes, before deleting any major versions, we will post an announcement, as mentioned above.

It is nice that you announce that CI will be broken for a couple of days. It would be better if you didn't break CI for a couple of days.

The issue is that you can only build an Android app against a single, specific NDK version. Your rollout seems to be incremental, which makes sense. But it means that during the rollout, it is a lottery whether we get an image with the new or the old NDK version, so we end up in the following scenario:

  1. main references version 27.0.12077973. Some CI runs will pass because they run on machines that haven't upgraded yet.
  2. As the rollout progresses, more and more CI runs will fail because the NDK version doesn't exist.
  3. At some point in the rollout, we have to force-merge a PR that changes the NDK version to 27.1.12297006.
  4. Now, CI runs that are given the new NDK version will pass.
  5. Still, there will be CI runs on runners that are still using the old version and those will fail.
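
The scenario above can be sketched as a small model. It assumes, purely for illustration, that a job lands on an upgraded runner with probability equal to the rollout fraction and that a job passes only when the runner has the NDK the branch references. The function name `pass_probability` and the fractions are hypothetical, not measurements.

```python
def pass_probability(rollout_fraction: float, branch_uses_new_ndk: bool) -> float:
    """Chance that one CI job lands on a runner with the NDK the branch expects."""
    if not 0.0 <= rollout_fraction <= 1.0:
        raise ValueError("rollout_fraction must be between 0 and 1")
    # Assumption: upgraded runners only have the new NDK, the rest only the old one.
    return rollout_fraction if branch_uses_new_ndk else 1.0 - rollout_fraction


if __name__ == "__main__":
    for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
        old_pin = pass_probability(fraction, branch_uses_new_ndk=False)
        new_pin = pass_probability(fraction, branch_uses_new_ndk=True)
        print(f"rollout {fraction:.0%}: old pin passes {old_pin:.0%}, new pin passes {new_pin:.0%}")
```

Under these assumptions, neither pin passes reliably while the rollout fraction is strictly between 0 and 1, which is the window described in steps 2 to 5.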

This is a huge service disruption because we cannot merge PRs reliably during this period, see:

[image]

> For minor and hotfix versions, the script automatically fetches the available versions; if a new version is available, it ships with the next rollout.

Instead of replacing the version, can you fetch two versions? The one previously installed and whatever is the latest at the time? That way, the rollout doesn't break CI and we can update to the new NDK version after the rollout is completed.
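
A minimal sketch of the "fetch two versions" idea, assuming the image build knows which version was previously installed and which versions are currently available. The helper names `parse_version` and `versions_to_install` are hypothetical and are not part of the runner-images scripts.

```python
from __future__ import annotations


def parse_version(version: str) -> tuple[int, ...]:
    """Turn '27.1.12297006' into (27, 1, 12297006) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def versions_to_install(previously_installed: str, available: list[str]) -> list[str]:
    """Keep the previous version next to the latest one, in ascending order."""
    if not available:
        raise ValueError("no NDK versions available")
    latest = max(available, key=parse_version)
    # The set removes the duplicate when the previous version is already the latest.
    return sorted({previously_installed, latest}, key=parse_version)


if __name__ == "__main__":
    print(versions_to_install("27.0.12077973", ["27.0.12077973", "27.1.12297006"]))
```

With both versions on the image, a branch pinned to either one would find its NDK regardless of which runner it lands on during the rollout.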

@kishorekumar-anchala
Contributor

> The issue is that you can only build an Android app against a single, specific NDK version. Your rollout seems to be incremental, which makes sense. But it means that during the rollout, it is a lottery whether we get an image with the new or the old NDK version, so we end up in the following scenario:

During the rollout, the existing version will not change. The new NDK version becomes available once the image rollout has completed on both GitHub runners and hosted agents.

> Instead of replacing the version, can you fetch two versions? The one previously installed and whatever is the latest at the time? That way, the rollout doesn't break CI and we can update to the new NDK version after the rollout is completed.

Currently, we do not have plans to fetch both the previously installed and the latest NDK versions. We confirm that the Ubuntu images have the latest NDK version. I hope your CI runs succeeded.

@thomaseizinger
Author

> > The issue is that you can only build an Android app against a single, specific NDK version. Your rollout seems to be incremental, which makes sense. But it means that during the rollout, it is a lottery whether we get an image with the new or the old NDK version, so we end up in the following scenario:

> During the rollout, the existing version will not change. The new NDK version becomes available once the image rollout has completed on both GitHub runners and hosted agents.

That is not true in our experience. We have seen several CI builds that use a "pre-release" image, which has a newer NDK version, and thus fail.

> > Instead of replacing the version, can you fetch two versions? The one previously installed and whatever is the latest at the time? That way, the rollout doesn't break CI and we can update to the new NDK version after the rollout is completed.

> Currently, we do not have plans to fetch both the previously installed and the latest NDK versions.

What is your suggestion then to implement reliable CI for Android apps that use the NDK?

@kishorekumar-anchala
Contributor

Hi @thomaseizinger ,

Pin your NDK version in your CI configuration. This ensures that the build environment remains consistent and reduces the risk of unexpected breaks due to NDK updates.
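
One way such a pin could look is sketched below, as a CI helper that checks for the pinned NDK and, if it is missing, produces the `sdkmanager` command to install it. This assumes the usual side-by-side layout of `<sdk>/ndk/<version>` and that `ANDROID_HOME` points at the SDK root; the function names are hypothetical, and the version is the one mentioned in this thread.

```python
from __future__ import annotations

import os
import shlex
from pathlib import Path

PINNED_NDK = "27.0.12077973"  # example pin, taken from this thread


def ndk_dir(sdk_root: Path, version: str) -> Path:
    """Assumed side-by-side NDK location: <sdk>/ndk/<version>."""
    return sdk_root / "ndk" / version


def ensure_command(sdk_root: Path, version: str) -> list[str] | None:
    """Return the sdkmanager command to run if the pinned NDK is missing, else None."""
    if ndk_dir(sdk_root, version).is_dir():
        return None
    # sdkmanager addresses NDK packages as "ndk;<version>".
    return ["sdkmanager", "--install", f"ndk;{version}"]


if __name__ == "__main__":
    sdk_root = os.environ.get("ANDROID_HOME")
    if sdk_root is None:
        print("ANDROID_HOME is not set; cannot check for the NDK")
    else:
        command = ensure_command(Path(sdk_root), PINNED_NDK)
        print("pinned NDK already present" if command is None else shlex.join(command))
```

The helper only prints the command so that the CI job decides whether to run it; installing the NDK on every run adds download time, which is the trade-off against relying on the pre-installed version.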

@thomaseizinger
Author

I can do that, yes. But what is the point of the pre-installed NDK, then, if your advice is to install a separate version?

@kishorekumar-anchala
Contributor

Hi @thomaseizinger ,

  1. We will announce major version removals in advance.
  2. As per the image policy, the system automatically fetches the default version, i.e. 27.1.12297006 (the latest NDK).
  3. Further pinning of a particular version is not possible.

Thank you!

@thomaseizinger
Author

> We will announce major version removals in advance.

Android only allows building against one specific NDK version, so even a change to the patch version of the NDK is breaking, not just a change to the major version.

I'll spell it out again because my point doesn't seem to be coming across: removing tools without a grace period for transitioning leads to broken CI, even if you announce it ahead of time.

What is the point of pre-installed software if I can't use it because you are swapping it out from underneath me? GitHub Actions doesn't expose a way for me to select which image version of the runner I will get. You need to either expose a mechanism for that or install multiple versions of certain tools side by side. Without either of these things, every image update will break CI for many developers.
