Not all of the samples requested have provided input #40
hmm... yeah, I can't really seem to find anything wrong with what you have, either
Also, it might help debug things if you comment out the
Also, regarding the output directory being ignored: It's because I've set up the
Thanks for the quick reply.
I'm not very familiar with snakemake, so trying to execute the Snakefile line by line to debug provided limited insight:
FWIW, the pipeline ran once I moved all required files into the same varCA subdirectory, "data" ("DATA" prompted a similar error message), made the sample names non-numerical, and removed every "." that did not precede a required extension (i.e. name.extension). Unfortunately, the run then failed, but I'm guessing this isn't related?
ah, I just thought of something! When you write 2294 in a YAML file, it automatically gets parsed as an integer. But 2294 is getting parsed as a string when it's being read from the samples file. So you just needed to put quotes around the sample names in the config file:
instead of
It probably would have also worked if you had commented out the
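The type mismatch described above can be reproduced without running the pipeline. This is a minimal sketch, not varCA's actual code: the list `[2294]` stands in for what a YAML parser yields for an unquoted `2294`, and the samples-file text and the helper `missing_samples` are hypothetical.

```python
import csv
import io

def missing_samples(requested, samples_tsv_text):
    """Return the requested sample names that have no row in the samples file."""
    reader = csv.reader(io.StringIO(samples_tsv_text), delimiter="\t")
    # Every field read from a text file is a str, e.g. "2294".
    available = {row[0] for row in reader if row}
    return [name for name in requested if name not in available]

# Hypothetical one-line samples file: sample name, then a path (tab-separated).
samples_tsv = "2294\tDATA/2294.dup.fix.bam\n"

# Unquoted 2294 in YAML is loaded as the integer 2294; quoted "2294" stays a string.
print(missing_samples([2294], samples_tsv))    # the int never equals the str, so it is reported missing
print(missing_samples(["2294"], samples_tsv))  # the quoted name matches the file
```

Converting the names with `str()` before comparing them would also make the two sides agree, regardless of how the config was written.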
I'm not sure why it helped that you moved all of the files into the same subdirectory. The paths in the samples file are supposed to be interpreted relative to the directory that you execute the pipeline from.
I'm also not sure why it's having trouble with the multiple dots "." in your file names. I've looked through the code again, and there's nothing I can think of that would affect that. In fact, if anything, I think it should fail if you're missing the dot before the extension: Line 92 in fcf5909
If you have DATA/2294.dup.fixbam instead of DATA/2294.dup.fix.bam in your samples file, it will treat the file as a FASTQ file instead of a BAM file. I'll keep thinking about it and let you know if I come up with something.
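To illustrate why the dot before the extension matters, here is a minimal sketch of a suffix-based check. It assumes the input type is decided by whether the path ends in `.bam`; the function `input_type` is hypothetical and is not the code at the line referenced above.

```python
def input_type(path):
    """Classify an input path by its suffix (hypothetical helper)."""
    # Only a literal ".bam" suffix counts as BAM; anything else falls through to FASTQ.
    return "bam" if path.endswith(".bam") else "fastq"

print(input_type("DATA/2294.dup.fix.bam"))  # bam
print(input_type("DATA/2294.dup.fixbam"))   # fastq
```

Under this assumption, extra dots earlier in the file name make no difference, since only the end of the path is inspected.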
The config dictionary should be defined as early as this line: Line 7 in fcf5909
Without more info, I can't be sure what the cause is. Are there any error messages in the
Regarding the formatting issues, I will try to re-run using your suggestions after troubleshooting the issues below. For now, the below samples worked well:
Regarding the run failure, I don't see any obvious errors in qlog. I ran twice and the same step (51) failed both times.
hmm... yeah, I can't tell which job is the job that is failing from the log outputs that you included. Here are some debugging strategies:
One of the requirements for the BAM files is that they have read-group information. Do yours?
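One way to check is to look for `@RG` lines in the BAM header. The sketch below assumes the header text has already been dumped (for example with `samtools view -H`); the function name `read_groups` and the sample header are made up for illustration.

```python
def read_groups(header_text):
    """Return the ID of each @RG line in a SAM/BAM header."""
    ids = []
    for line in header_text.splitlines():
        if not line.startswith("@RG"):
            continue
        # Header fields are tab-separated TAG:VALUE pairs after the record type.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        ids.append(fields.get("ID"))
    return ids

# Illustrative header with one read group.
header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chr1\tLN:248956422\n@RG\tID:2294\tSM:2294\n"
print(read_groups(header))  # ['2294']
```

An empty list would mean the file has no read groups and needs them added before it is used as input.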
Hi @aryarm, just getting back to this. After adding read groups, the pipeline runs past the previously mentioned errors. Thanks for the tips! Unfortunately, I'm confronted with a more cryptic error later on. Log here. It looks like bcftools may not be installed as a dependency? Total runtime is a few hours per sample, so I wanted to check in to see if there was anything else I'm missing in the log. Thanks!
After preparing the required input, the pipeline can't seem to find the specified files or output directory. I don't see in the log files whether or not my sample file is recognized. I am hoping that there is an obvious issue with my file paths or the config, but I'm just not seeing it. Any help would be much appreciated.
Also, the sample data ran to completion.
Log file:
(snakemake) root@c844f1072fc5:/varCA# cat out/*
My config (note that the output directory I specify is ignored):
(snakemake) root@c844f1072fc5:/varCA# grep -vF '#' configs/config.yaml | sed '/^$/d'
Sample file:
(snakemake) root@c844f1072fc5:/varCA# cat DATA/data/samples.tsv
BAM and bed files referenced in samples.tsv are present:
(snakemake) root@c844f1072fc5:/varCA# ls DATA/229* | xargs -n 1 basename
As are indexes:
root@c844f1072fc5:/varCA# ls DATA/bwa | xargs -n 1 basename
And models:
(snakemake) root@c844f1072fc5:/varCA# ls DATA/data | xargs -n 1 basename