Tweak extract values based on Echo Runs [VS-1432] #8979
Conversation
@@ -172,6 +172,7 @@ You can take advantage of our existing sub-cohort WDL, `GvsExtractCohortFromSamp
- Specify the same `call_set_identifier`, `dataset_name`, `project_id`, `extract_table_prefix`, and `interval_list` that were used in the `GvsPrepareRangesCallset` run documented above.
- Specify the `interval_weights_bed` appropriate for the PGEN extraction run you are performing. `gs://gvs_quickstart_storage/weights/gvs_full_vet_weights_1kb_padded_orig.bed` is the interval weights BED used for Quickstart.
- Select the workflow option "Retry with more memory" and choose a "Memory retry factor" of 1.5
- Set the `extract_maxretries_override` input to 5, `split_intervals_disk_size_override` to 1000, `scatter_count` to 25000, and `y_bed_weight_scaling` to 8 to start; you will likely have to adjust one or more of these values in subsequent attempts.
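As a rough illustration of how those starting values might be supplied together, here is a minimal Python sketch that writes a Cromwell/Terra-style inputs JSON. The workflow name prefix `GvsExtractCallsetPgen` and the output filename are assumptions made for the example, not something taken from this PR; substitute the name of the PGEN extract workflow you are actually submitting.

```python
import json

# Hypothetical workflow name prefix; replace with the actual PGEN extract workflow name.
WORKFLOW = "GvsExtractCallsetPgen"

# Starting-point overrides suggested above; expect to tune these on later attempts.
overrides = {
    f"{WORKFLOW}.extract_maxretries_override": 5,
    f"{WORKFLOW}.split_intervals_disk_size_override": 1000,
    f"{WORKFLOW}.scatter_count": 25000,
    f"{WORKFLOW}.y_bed_weight_scaling": 8,
    # Quickstart interval weights BED referenced above.
    f"{WORKFLOW}.interval_weights_bed": (
        "gs://gvs_quickstart_storage/weights/gvs_full_vet_weights_1kb_padded_orig.bed"
    ),
}

with open("pgen_extract_overrides.inputs.json", "w") as fh:
    json.dump(overrides, fh, indent=2)
```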
This all looks good, especially since you're drawing on real-world experience from Echo.
Do we have any record of the thought process behind these values? I know at one point there was a doc?
Anyway, LGTM, but why these particular numbers are ideal still eludes me.
Off the top of my head:
- `split_intervals_disk_size_override` to 1000 because otherwise we run out of disk space 😭
- `scatter_count` - this one I think may vary depending on the interval list? Pretty sure the default 34K is almost guaranteed to give us Cromwell problems (and maybe that should be changed)
- `y_bed_weight_scaling` - we were doing both X and Y at a scale factor of 4 and Y shards were still the laggards. We might even go higher than 8, but we didn't actually try that during Echo.
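To make the `y_bed_weight_scaling` point concrete, here is a hypothetical sketch of the effect being described: multiplying the per-interval weights for chrY in the weights BED so the scatter logic cuts chrY into more, smaller shards. The four-column BED layout and the helper function are illustrative assumptions, not the actual GVS implementation.

```python
# Illustrative only: assumes the weights BED has columns chrom, start, end, weight,
# and that y_bed_weight_scaling simply multiplies the weight of chrY intervals so the
# interval scheduler splits chrY into more, smaller shards.
def scale_y_weights(bed_lines, scaling=8):
    scaled = []
    for line in bed_lines:
        chrom, start, end, weight = line.rstrip("\n").split("\t")
        if chrom in ("chrY", "Y"):
            weight = str(int(float(weight) * scaling))
        scaled.append("\t".join((chrom, start, end, weight)))
    return scaled

# Example: an 8x scaling turns a chrY interval weighted 1200 into 9600.
print(scale_y_weights(["chrY\t0\t1000\t1200"]))  # ['chrY\t0\t1000\t9600']
```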
LGTM. Seems superficially odd that we ultimately want to reduce the disk size for pgen on larger callsets, but I trust the results of your analysis
Disk size is less of a concern because (at least with GCP, historically) going too low on disk size risks much slower I/O without commensurate cost savings.
Yes, that's what I thought as well. That's why it seemed a little odd that this PR includes a change that lowers the default disk size on the VMs, when not otherwise specified, from 500 to 200. I always thought going too low was the worry, and that disk size in general isn't much of a concern, but this PR appears to lower the default disk size by 60%, unless I'm reading it completely wrong.
The logs showed a max of 7% disk usage, so it seemed fair to reduce it somewhat. But your comment reminded me that I also meant to adjust the
No VCF extract changes?
It looked like running it with the 1.5 "Retry with more memory" factor did the trick, unless you have other info?