Hi,
to process our D. melanogaster iCLIP library, I used snakemake to chain the iCount steps together and added benchmarking, specifically for iCount xlsites with quantification based on cDNA and reads.
I am observing runtimes of roughly 1 to 4 days on our cluster system for iCount xlsites. The number of reads per multiplexing barcode is quite variable, and it correlates with runtime.
In terms of parameters, I use
--group_by start
--mapq_th 3
using the output GTF from iCount segment
I wonder what, besides the total number of mapped reads, determines the runtime of iCount xlsites, and whether there are useful pre-filtering strategies for the BAM files that speed up the process without losing (too much) sensitivity.
Cheers
Are you using --segmentation input? If you are, this is the main reason that iCount xlsites is taking so long. Please run it without segmentation (AFAIK, this is the way most users do it). We should speed up the algorithm for the case where segmentation is given, but we never found the time to do it properly.
Regarding other factors that could affect runtime:
group_by should have zero effect on runtime
a higher mapq_th will take fewer (poorly mapped) reads into account, so this should speed things up a bit. But if the mapping quality is sufficient, the effect should not be very significant
if you have really high coverage (>10k, 100k), lowering the max_barcodes parameter can speed things up significantly, but this should be used only in such cases
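To get a feel for how much a given mapq_th would actually reduce the workload before committing to a multi-day run, one option is to count how many mapped alignments survive the threshold. The following is only a rough sketch, not part of iCount: the function name count_reads_by_mapq and the toy records are hypothetical, and it assumes the input is SAM text (for example what samtools view prints) and that alignments with MAPQ below the threshold are the ones dropped.

```python
def count_reads_by_mapq(sam_lines, mapq_th):
    """Return (kept, total) for mapped alignments in SAM text.

    kept counts alignments with MAPQ >= mapq_th; total counts all
    mapped alignments. Hypothetical helper, not an iCount function.
    """
    kept = 0
    total = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        if flag & 0x4:
            continue  # unmapped read, never contributes to xlsites
        total += 1
        if int(fields[4]) >= mapq_th:  # column 5 of SAM is MAPQ
            kept += 1
    return kept, total


# Toy SAM records (made up for illustration): MAPQ 0, 3, 50, and one unmapped read.
EXAMPLE_SAM = "\n".join([
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t0\t2L\t100\t0\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "r2\t0\t2L\t150\t3\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "r3\t16\t2L\t200\t50\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
])

kept, total = count_reads_by_mapq(EXAMPLE_SAM.splitlines(), mapq_th=3)
print(f"{kept} of {total} mapped alignments pass the threshold")
```

If nearly all alignments pass, raising the threshold is unlikely to change the runtime much, which matches the expectation above that the effect is small when mapping quality is already good.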