Add a few additional failures to our notes doc #8980
base: ah_var_store
Some specific changes below. I also think it would be useful to have a consistent way of attaching each failure to a specific workflow, sub-workflow, and task, for easier use.
at some point in our notes we added
Not for this PR, but it's a little concerning there are so many failure modes that require manual intervention
I'm sorry this took me SO long to review. Comments throughout.
1. Ingest failure with error message: `A USER ERROR has occurred: Cannot be missing required value for `___
1. (e.g. alternate_bases.AS_RAW_MQ, RAW_MQandDP or RAW_MQ)
1. This means that there is at least one incorrectly formatted sample in your data model. Confirm your GVCFs are reblocked. If the incorrectly formatted samples are a small portion of your callset and you wish to just ignore them, simply delete them from the data model and restart the workflow without them. There should be no issue with starting from here as none of these samples were loaded.
You could add a link here. The full list of annotations a GVCF needs in order to work in GVS is at:
https://github.com/broadinstitute/gatk/blob/ah_var_store/scripts/variantstore/beta_docs/run-your-own-samples.md#gvcf-annotations
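As a rough illustration of the kind of pre-flight check a user could run before ingest, here is a minimal sketch that reports which annotation IDs are not declared in a GVCF header. It is not part of GVS: the function names and the `required` list are hypothetical, the authoritative list is the one in the linked doc, and a header declaration does not guarantee that every record carries the value.

```python
import re

# Matches header declarations such as ##INFO=<ID=RAW_MQandDP,...> or ##FORMAT=<ID=GQ,...>
_HEADER_ID = re.compile(r"^##(?:INFO|FORMAT)=<ID=([^,>]+)")


def declared_annotations(header_lines):
    """Return the set of INFO/FORMAT IDs declared in the VCF meta lines."""
    ids = set()
    for line in header_lines:
        if not line.startswith("##"):
            break  # meta lines end at the #CHROM line
        match = _HEADER_ID.match(line)
        if match:
            ids.add(match.group(1))
    return ids


def missing_annotations(header_lines, required):
    """Return the required IDs (caller-supplied) that the header does not declare."""
    return sorted(set(required) - declared_annotations(header_lines))


# Hypothetical header and hypothetical requirement list, for illustration only.
example_header = [
    "##fileformat=VCFv4.2",
    '##INFO=<ID=RAW_MQandDP,Number=2,Type=Integer,Description="example">',
    '##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="example">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1",
]
print(missing_annotations(example_header, ["RAW_MQandDP", "AS_RAW_MQ"]))
```

With the example header above, only `AS_RAW_MQ` is reported as undeclared. Reading the header lines from a real (possibly compressed) GVCF is left out of the sketch.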
1. Extract failure with OSError: Is a directory. If you point your extract to a directory that doesn’t already exist, it will not be happy about this. Simply make the directory and run the workflow again.
Suggested change:
1. Extract failure with OSError: Is a directory.
1. If you point your extract to a directory that doesn’t already exist, the workflow fails. Make the directory and run the workflow again.
I'm surprised this is an error the workflow can run into, because our documentation suggests people get the workspace bucket and then add a subdirectory to that based on where they want the callset written. Usually that bucket doesn't actually already exist. Am I misunderstanding your description of this error?
Generally, could you add sections for failures in the other high-level steps of GVS? Or, if GVS fails in any step besides ingest, does the user need to start over from scratch? It'd be great to indicate that, for example: "If your workflow fails in any step after ingest, unfortunately you will need to delete your BigQuery dataset and start from the beginning."
1. Ingest failure: There is already a list of sample names. This may need manual cleanup. Exiting.
1. Clean up the BQ dataset manually by deleting it and recreating it fresh
1. Make sure to keep the call caching on and run it again
Looks like call caching specifically needs to be off for this to be successful, based on recent user error. I am going to bet that's true for both this error and the max id is 0 error.
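For the "delete it and recreate it fresh" step, a minimal sketch of what the manual cleanup could look like with the `google-cloud-bigquery` Python client is below. The `reset_dataset` helper and the dataset ID are hypothetical, and the recreated dataset gets default settings, so any location, labels, or access rules on the old dataset would need to be reapplied. It does not touch the call caching setting, which is chosen when the workflow is launched.

```python
def reset_dataset(client, dataset_id):
    """Delete dataset_id together with all of its tables, then create it again empty.

    `client` is expected to behave like google.cloud.bigquery.Client;
    only delete_dataset and create_dataset are used.
    """
    # delete_contents=True removes the tables too; not_found_ok=True makes
    # the call a no-op when the dataset is already gone.
    client.delete_dataset(dataset_id, delete_contents=True, not_found_ok=True)
    return client.create_dataset(dataset_id)


# Usage, assuming credentials are configured and the ID below is replaced
# with your own (it is a placeholder, not a real dataset):
#
#   from google.cloud import bigquery
#   reset_dataset(bigquery.Client(), "my-project.my_gvs_dataset")
```

Passing the client in, instead of constructing it inside the helper, keeps the destructive call easy to review and lets it be exercised against a stand-in object before pointing it at a real project.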
no automated testing needed--just documentation edits