[RW-12298] upgrade AoU docker images #4490

Closed
wants to merge 15 commits into from

yonghaoy
Contributor

Jira ticket: https://broadworkbench.atlassian.net/browse/[ticket_number]

Summary of changes

What

Why

Testing these changes

What to test

Who tested and where

  • This change is covered by automated tests
    • NB: Rerun automation tests on this PR by commenting jenkins retest or jenkins multi-test.
  • I validated this change
  • Primary reviewer validated this change
  • I validated this change in the dev environment

welder_server="us.gcr.io/broad-dsp-gcr-public/welder-server:8667bfe"
openidc_proxy="broadinstitute/openidc-proxy:2.3.1_2"
anvil_rstudio_bioconductor="us.gcr.io/broad-dsp-gcr-public/anvil-rstudio-bioconductor:3.18.0"

# Note that this is the version used currently by AOU in production, the one above can be staged for testing
-terra_jupyter_aou_old="us.gcr.io/broad-dsp-gcr-public/terra-jupyter-aou:2.2.7"
+terra_jupyter_aou_old="us.gcr.io/broad-dsp-gcr-public/terra-jupyter-aou:2.2.8"
Collaborator

Which version do we have in prod today? This needs to be what's in prod.

Contributor Author

Good point. It is 2.2.7 now.
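
The check the reviewer asks for here, that `terra_jupyter_aou_old` stays pinned to the tag running in production, could be sketched roughly as follows. This is an illustration only and not part of the PR: the helper names `parse_image_pins` and `check_pin` are hypothetical, and the production tag (2.2.7) is simply the one stated in the reply above.

```python
def parse_image_pins(text):
    """Return {variable_name: (image, tag)} for lines shaped like name="image:tag"."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        # The tag is whatever follows the last colon of the quoted value.
        image, sep, tag = value.strip().strip('"').rpartition(":")
        if sep:  # ignore values that carry no tag at all
            pins[name.strip()] = (image, tag)
    return pins


def check_pin(pins, name, expected_tag):
    """True when `name` is present and pinned to `expected_tag`."""
    return name in pins and pins[name][1] == expected_tag


# Sample input copied from the lines under review (hypothetical usage).
CONFIG = '''
openidc_proxy="broadinstitute/openidc-proxy:2.3.1_2"
# Note that this is the version used currently by AOU in production
terra_jupyter_aou_old="us.gcr.io/broad-dsp-gcr-public/terra-jupyter-aou:2.2.7"
'''

pins = parse_image_pins(CONFIG)
print(check_pin(pins, "terra_jupyter_aou_old", "2.2.7"))
```

A check like this would fail if the `_old` variable were bumped to 2.2.8 while production still runs 2.2.7.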


codecov bot commented Apr 30, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 73.64%. Comparing base (2f03d6c) to head (a1c4e02).
Report is 2 commits behind head on develop.

@@           Coverage Diff            @@
##           develop    #4490   +/-   ##
========================================
  Coverage    73.64%   73.64%           
========================================
  Files          158      158           
  Lines        14694    14694           
  Branches      1162     1162           
========================================
  Hits         10822    10822           
  Misses        3872     3872           

Collaborator

@Qi77Qi Qi77Qi left a comment

@yonghaoy
Contributor Author

@yonghaoy yonghaoy requested a review from cpate4 May 1, 2024 23:38
@yonghaoy yonghaoy changed the title from "[RW-12298]upgrade AoU docker images" to "[RW-12298] upgrade AoU docker images" May 1, 2024
@Qi77Qi
Collaborator

Qi77Qi commented May 2, 2024

java.lang.RuntimeException: google project could not be claimed for test, java.util.concurrent.TimeoutException: Future timed out after [5 minutes]

just triggered rerun of all tests

@yonghaoy
Contributor Author

yonghaoy commented May 2, 2024

> java.lang.RuntimeException: google project could not be claimed for test, java.util.concurrent.TimeoutException: Future timed out after [5 minutes]
>
> just triggered rerun of all tests

java.lang.RuntimeException: google project could not be claimed for test, java.util.concurrent.TimeoutException: Future timed out after [5 minutes]
I don't think that is related

@yonghaoy
Contributor Author

yonghaoy commented May 2, 2024

> java.lang.RuntimeException: google project could not be claimed for test, java.util.concurrent.TimeoutException: Future timed out after [5 minutes]
>
> just triggered rerun of all tests
>
> java.lang.RuntimeException: google project could not be claimed for test, java.util.concurrent.TimeoutException: Future timed out after [5 minutes]
> I don't think that is related

DataBiosphere/terra-resource-buffer#294

@Qi77Qi
Collaborator

Qi77Qi commented May 3, 2024

I have a feeling we might run out of disk...but can verify Mon on a BEE

@yonghaoy
Contributor Author

yonghaoy commented May 3, 2024

> I have a feeling we might run out of disk...but can verify Mon on a BEE

I'm having issues with my BEE; I've forgotten how to use it.

As for image size, the increase does not look big:

  • GCE: 42 GB to 44 GB
  • Dataproc: 47 GB to 47 GB (unchanged)
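
To put those numbers in relative terms, here is a quick calculation. It is a sketch only: the sizes are the ones quoted above, "datapro" is read as Dataproc, and `percent_increase` is a helper name introduced here for illustration.

```python
def percent_increase(old_gb, new_gb):
    """Relative growth from old_gb to new_gb, expressed as a percentage."""
    return (new_gb - old_gb) / old_gb * 100


# Image sizes in GB as reported in the comment: (before, after).
sizes_gb = {"gce": (42, 44), "dataproc": (47, 47)}

for name, (old, new) in sizes_gb.items():
    growth = percent_increase(old, new)
    print(f"{name}: {old} GB -> {new} GB ({growth:.1f}% increase)")
```

The GCE image grows by a little under 5%, and the Dataproc image does not grow at all, which fits the reading that image size alone is unlikely to exhaust the disk.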

@yonghaoy yonghaoy closed this May 3, 2024
@yonghaoy yonghaoy reopened this May 3, 2024
@LizBaldo
Collaborator

FYI @yonghaoy and @Qi77Qi: I am taking another stab at updating the Docker images with a new Bioconductor release, and I will try to debug this while I am at it. I will keep you posted: DataBiosphere/terra-docker#488

@Qi77Qi
Collaborator

Qi77Qi commented May 16, 2024

Thanks @LizBaldo!
I haven't looked recently, but last time I looked, something was preventing the runtime from restarting.

@LizBaldo
Collaborator

Got it. The only thing that I remember doing back in March is fixing the AOU image this way. I do not think that would cause issues with restarting, but I thought I would still share it: DataBiosphere/terra-docker#476

@LizBaldo
Collaborator

I will just point my BEE to this branch and see if I can create an AOU runtime and ssh into it

@LizBaldo
Collaborator

We should probably follow up via slack, but here is what I found out on my BEE:

1- As Qi mentioned above, the issue occurs not when creating the runtime but when resuming it.
2- After SSHing into the VM, I can confirm that the jupyter-server container is running, but the /home/jupyter directory is empty and no Jupyter server is running, which is why the resume fails. I am not sure yet what could have caused this.
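
The observations above amount to a small health check: the container is up, but its home directory is empty and nothing Jupyter-related is running inside it. A minimal sketch of such a check is below, assuming Docker is available on the VM and that the container is named `jupyter-server` as in the comment. The helper names, and the use of `ps` inside the container, are assumptions made for illustration; this is not how Leonardo itself decides whether a resume succeeded.

```python
import subprocess


def run_cmd(cmd):
    """Run a command and return its stdout as text (empty string on failure)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout if result.returncode == 0 else ""


def diagnose(container="jupyter-server", home="/home/jupyter", runner=run_cmd):
    """Collect the three observations described in the comment above.

    Note: a failed command yields empty output, so home_empty is also True
    when the directory cannot be listed at all.
    """
    names = runner(["docker", "ps", "--format", "{{.Names}}"]).split()
    listing = runner(["docker", "exec", container, "ls", "-A", home]).split()
    # Assumes a `ps` binary exists in the image; that may not hold everywhere.
    procs = runner(["docker", "exec", container, "ps", "-eo", "args"]).splitlines()
    return {
        "container_running": container in names,
        "home_empty": not listing,
        "jupyter_process": any("jupyter" in p for p in procs),
    }


# Stubbed runner that mimics the state reported above, so the sketch can be
# tried without a VM: the container is listed, everything else is empty.
def stub(cmd):
    return "jupyter-server\n" if cmd[:2] == ["docker", "ps"] else ""


print(diagnose(runner=stub))
```

On the VM itself, calling `diagnose()` with the default runner would issue the real `docker` commands.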

I might need to decouple fixing this from releasing the new bioconductor and gatk images on Terra.
[Four screenshots attached, taken 2024-05-16 between 4:10 PM and 4:11 PM]

@LizBaldo
Collaborator

@yonghaoy I think you can close this PR now, #4566 will get you to the latest aou image available

@yonghaoy
Contributor Author

> @yonghaoy I think you can close this PR now, #4566 will get you to the latest aou image available

Ha! I might just delete the branch after closing.

@yonghaoy yonghaoy closed this May 20, 2024
@yonghaoy yonghaoy deleted the yyu-RW-12298 branch May 20, 2024 13:31
3 participants