umi qiaseq #324
Comments
This depends entirely on the UMI sequences you're using. The reads are deduplicated before the UMIs are removed, and the UMIs are then extracted and added to the FASTQ header. This is what I used with some QIAGEN miRNA UMI data:
nextflow run nf-core/smrnaseq \
--input Samplesheet.csv \
--outdir 01_smrnaseq \
--genome GRCh38 -profile yourprofile \
--mirtrace_species "hsa" \
--skip_mirdeep \
--protocol "qiaseq" \
--umitools_extract_method regex \
--umitools_bc_pattern '.+(?P<discard_1>AACTGTAGGCACCATCAAT){s<=2}(?P<umi_1>.{12})(?P<discard_2>.*)' \
--save_umi_intermeds \
--with_umi \
-c extra_resources.config \
-resume
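To see what a pattern like this pulls out of a read, here is a minimal Python sketch. It makes some assumptions: it uses the standard-library `re` module, so the fuzzy-match suffix `{s<=2}` (which needs the third-party `regex` module) is left out, and the read sequence is made up for illustration. It also assumes that the part of the read outside the named groups is what gets kept.

```python
import re

# Simplified version of the pattern from the command above. The fuzzy-match
# suffix {s<=2} is omitted because the standard-library re module does not
# support it (assumption: exact adapter match is enough for this illustration).
PATTERN = re.compile(
    r".+(?P<discard_1>AACTGTAGGCACCATCAAT)(?P<umi_1>.{12})(?P<discard_2>.*)"
)

# Hypothetical read: small RNA insert + adapter + 12-nt UMI + trailing bases.
insert = "TGAGGTAGTAGGTTGTATAGTT"
adapter = "AACTGTAGGCACCATCAAT"
umi = "ACGTACGTACGT"
tail = "AGATCGGAAG"
read = insert + adapter + umi + tail

match = PATTERN.match(read)

# The umi_1 group is the 12 bases right after the adapter.
umi_found = match.group("umi_1")

# Everything before the first discard group is the sequence that remains.
kept = read[: match.start("discard_1")]

print("UMI: ", umi_found)
print("kept:", kept)
```

If the UMI in your library is a different length or sits in a different position, the `.{12}` and the adapter sequence would need to change accordingly.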
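This is weird, and the reason is that the regex umi_tools received was just "=". You should supply only the pattern, without an additional "=" in between.
--umitools_bc_pattern = '.+(?P<discard_1>AACTGTAGGCACCATCAAT){s<=....
should be:
--umitools_bc_pattern '.+(?P<discard_1>AACTGTAGGCACCATCAAT){s<=....
I'll close this now, as this should be fine.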
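A rough way to see why the stray "=" matters is to tokenize both forms the way a POSIX shell would. The sketch below uses Python's `shlex` for that and a small hypothetical helper, `umi_groups`, to list the `umi_` groups in whatever ends up as the option's value. The pattern is shortened (no `{s<=2}`) so that the standard-library `re` module can compile it, and it is an assumption that the pipeline takes the token right after the option name as its value.

```python
import re
import shlex

# Shortened pattern for illustration only (fuzzy-match suffix removed).
PATTERN = ".+(?P<discard_1>AACTGTAGGCACCATCAAT)(?P<umi_1>.{12}).+"

broken = f"--umitools_bc_pattern = '{PATTERN}'"
fixed = f"--umitools_bc_pattern '{PATTERN}'"

# shlex.split follows POSIX shell quoting rules, so "=" surrounded by
# spaces becomes a separate argument.
broken_tokens = shlex.split(broken)
fixed_tokens = shlex.split(fixed)


def umi_groups(pattern: str) -> list[str]:
    """Hypothetical helper: names of regex groups that start with 'umi_'."""
    return [name for name in re.compile(pattern).groupindex if name.startswith("umi_")]


# The token following the option name is what would be read as its value.
print(broken_tokens[1], umi_groups(broken_tokens[1]))
print(fixed_tokens[1], umi_groups(fixed_tokens[1]))
```

In the broken form the value is "=", which contains no `umi_` group, consistent with the ValueError reported in the issue.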
Thanks, this does seem to work, though now I'm getting the mirtrace error that others have (see #262). Also, the umitools_bc_pattern code I used came from the intro page of smrnaseq, so that page should probably be edited to reflect the suggestion above.
Description of the bug
I'm getting an error using the umi tools for a 2x100 qiaseq library.
The error is this:
ValueError: barcode regex(es) do not include any umi groups (starting with 'umi_') regex.Regx('=', flags=regex.v0), None
I suspect this is because the qiaseq library was not set up on the sequencer with any base masking, so the UMIs are not added to the header, which, as I understand it, is where the umi_ group looks for them.
My current command is this:
--with_umi --umitools_extract_method regex --umitools_bc_pattern = ".+(?P<discard_1>AACTGTAGGCACCATCAAT){s<=2}(?P<umi_1>.{12}).+"
which is the command I found in #49. I'm looking for help either writing the regex differently as a workaround, or working out whether there is something wrong with the UMI handling. Thanks!
Command used and terminal output
Relevant files
/*
*/
params {
    config_profile_name        = 'name'
    config_profile_description = 'nf-core smRNAseq profile'
    max_memory                 = '120GB'
    max_cpus                   = 30
    max_time                   = '24.h'
    cleanup                    = true
}
System information
No response