AJ-1550: bump listener image and chart #4154

Merged: davidangb merged 1 commit into develop from da_AJ-1550_listenerImageBump on Jan 30, 2024

@davidangb
Contributor

Jira ticket: https://broadworkbench.atlassian.net/browse/AJ-1550

Summary of changes

What

The TARL (terra-azure-relay-listeners) image and chart bumps allow k8s to restart TARL when it fails to start or to attach to Relay, for example when it encounters a DNS timeout.

The PRs included in this bump are:

  1. https://github.com/broadinstitute/terra-helmfile/pull/4992
  2. AJ-1550: logging and health check for HybridConnectionListener terra-azure-relay-listeners#56
  3. AJ-1550: handle errors during startup terra-azure-relay-listeners#57

as well as two PRs which should have no impact on runtime:

  1. docs: add CONTRIBUTING.md info for leonardo update terra-azure-relay-listeners#53
  2. build(IA-4442): enforce spotless check in CI terra-azure-relay-listeners#54

Why

In practice, we've seen recurring e2e test failures which, on the surface, look like WDS failed to start within the test's allowed 10 minutes. Upon digging in, we found that in every failure (15+), WDS started fine but the listener did not, and thus nothing could actually communicate with WDS. This problem likely occurs in production, too.

Allowing k8s to restart the listener pod if it hits an error will hopefully reduce the occurrence of this problem.
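
For readers unfamiliar with the mechanism, here is a minimal sketch of the idea, not the actual TARL code (TARL lives in terra-azure-relay-listeners, and the real changes are in the PRs listed above). It assumes the listener exposes a health endpoint that a k8s liveness probe polls. An HTTP liveness probe treats any status from 200 through 399 as healthy, and k8s restarts the container after repeated failures. All names below (`ListenerHealth`, `start_listener`, `liveness_status`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ListenerHealth:
    """Tracks whether a (hypothetical) relay listener managed to connect."""
    connected: bool = False
    last_error: Optional[str] = None


def start_listener(connect: Callable[[], None], health: ListenerHealth) -> None:
    """Try to connect and record the outcome instead of failing silently."""
    try:
        connect()
        health.connected = True
        health.last_error = None
    except Exception as exc:  # e.g. a DNS timeout while attaching to Relay
        health.connected = False
        health.last_error = str(exc)


def liveness_status(health: ListenerHealth) -> int:
    """HTTP status a health endpoint could return to a k8s liveness probe."""
    # 503 is outside the 200-399 range, so the probe counts it as a failure.
    return 200 if health.connected else 503
```

With something like this in place, a listener that fails during startup reports itself as unhealthy rather than sitting idle, which gives k8s a signal to restart the pod.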

Testing these changes

What to test

  • Will we see a reduction in "WDS failed to start" errors in e2e tests over time?

Who tested and where

  • This change is covered by automated tests
    • NB: Rerun automation tests on this PR by commenting jenkins retest or jenkins multi-test.
  • I validated this change
  • Primary reviewer validated this change
  • I validated this change in the dev environment

@davidangb
Contributor Author

fiab-start failure. Jenkins retest.

@codecov

codecov bot commented Jan 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (493abbc) 71.32% compared to head (07e1c9b) 71.32%.

Additional details and impacted files

@@           Coverage Diff            @@
##           develop    #4154   +/-   ##
========================================
  Coverage    71.32%   71.32%           
========================================
  Files          147      147           
  Lines        13913    13913           
  Branches      1110     1110           
========================================
  Hits          9923     9923           
  Misses        3990     3990           
Flag Coverage Δ
pact 48.00% <ø> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@davidangb davidangb merged commit 2874e4d into develop Jan 30, 2024
@davidangb davidangb deleted the da_AJ-1550_listenerImageBump branch January 30, 2024 14:13