multithreading not working #15

steffenheyne · 2018-06-19T08:51:00Z

Hi!

I use, e.g.,
cond1_fit = enrichR(treatment = cond1_bam$name, control = cond1_bam_input$name, genome = genome, countConfig = countConfigPE, procs = 10, verbose = TRUE)

but I always get only two threads running at 100%, no matter what I specify for procs.

your-highness · 2018-06-19T08:56:02Z

Dear @steffenheyne ,

During quantification of the signals (read counts) in the BAM files, the method uses at most 2 threads, i.e. one each for the treatment and the control BAM file. The rationale is the I/O bottleneck.

Only the downstream fitting and enrichment quantification procedures use the specified procs argument.

Best,

steffenheyne · 2018-06-19T09:29:44Z

OK, I see.

... but I think using more threads would still be useful, as bamsignals really profits from it, and with at least 10 threads we don't have any I/O issues on our cluster ... so it would be nice to leave this decision to the user.

your-highness · 2018-06-19T10:10:41Z

Dear @steffenheyne ,

You are right! On clusters, I/O usually scales well with multithreading when the data is cached on a local disk. Unfortunately, bamsignals does not come with multithreading support :(

In fact, normR uses parallel::mcmapply() to run bamsignals::bamProfile() on the treatment and control files simultaneously, which explains the two working threads you observed:

normR/R/methods.R

Lines 131 to 143 in c5f8d1b

    
    counts <- parallel::mcmapply(
      bamsignals::bamProfile, bampath=c(treatment, control),
      MoreArgs=list(gr=gr, binsize=countConfig@binsize,
                    mapq=countConfig@mapq,
                    shift=countConfig@shift,
                    paired.end=getFilter(countConfig),
                    tlenFilter=countConfig@tlenFilter,
                    filteredFlag=countConfig@filteredFlag,
                    verbose=FALSE),
      mc.cores=procs, SIMPLIFY=FALSE
    )
    counts[[1]] <- unlist(as.list(counts[[1]]))
    counts[[2]] <- unlist(as.list(counts[[2]]))
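The two-thread ceiling can be reproduced without any BAM files. As far as I can tell, mcmapply() forks at most one worker per input element, so two inputs never occupy more than two processes, whatever mc.cores says. A minimal sketch, where worker_pid is a hypothetical stand-in for the counting call and the file names are placeholders that are never opened:

```r
library(parallel)

# Hypothetical stand-in for a per-file counting call:
# it only reports the ID of the process that handled the input.
worker_pid <- function(path) Sys.getpid()

# Placeholder names (assumption); the files are never read.
inputs <- c("treatment.bam", "control.bam")

# Forking is unavailable on Windows, so fall back to one core there.
cores <- if (.Platform$OS.type == "windows") 1L else 10L

pids <- mcmapply(worker_pid, inputs, mc.cores = cores)

# Number of distinct processes that did the work.
n_workers <- length(unique(pids))
print(n_workers)
```

Even with ten cores requested, the number of distinct worker processes cannot exceed the number of inputs, which matches the two busy threads reported above.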

Alternatively, I routinely use a wrapper around bamsignals::bamCount that parallelizes over chromosomes, and then fit enrichR directly with the obtained counts:

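# Count reads per region for each BAM file, parallelized over chromosomes.
# Assumes GenomicRanges is attached (gr is a GRanges) and bamsignals is installed.
# paired.end is passed on to bamsignals::bamCount ("ignore", "filter" or
# "midpoint"); the default "ignore" is an assumption, adjust it for your data.
processByChromosome <- function(bam.files, gr, mapqual, procs,
                                paired.end = "ignore") {
  x <- parallel::mclapply(
    X = as.character(unique(seqnames(gr))),
    FUN = function(chunk) {
      gr.sub <- gr[as.character(seqnames(gr)) == chunk]
      lapply(bam.files, bamsignals::bamCount, gr = gr.sub, mapq = mapqual,
             paired.end = paired.end, verbose = FALSE)
    }, mc.cores = procs)
  # Counts are concatenated chromosome by chromosome, so they only line up
  # with gr if gr is already grouped by chromosome.
  invisible(
    list("treatment" = unlist(lapply(x, "[[", 1)),
         "control"   = unlist(lapply(x, "[[", 2)))
  )
}
counts <- processByChromosome(bam.files = c(treatment.bampath, control.bampath), gr = gr, mapqual = mapqual, procs = procs)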

Note that this code does not cover all of the bamCountConfig parameters. I will see what I can do to add this feature to normR in the next few days.
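One thing to watch with per-chromosome chunking is that the concatenated results follow chromosome order, which may differ from the order of the input regions. A minimal sketch of restoring the original order with base R's split() and unsplit(), using made-up chromosome labels and a hypothetical count_bins function in place of the real counting call:

```r
library(parallel)

# Toy chromosome label per region (made-up values for illustration only).
chrom <- c("chr1", "chr2", "chr1", "chr3", "chr2", "chr1")

# Hypothetical stand-in for the per-chromosome counting call:
# it returns one integer "count" per region index it receives.
count_bins <- function(idx) idx * 10L

# Group region indices by chromosome; each group is one parallel task.
idx_by_chrom <- split(seq_along(chrom), chrom)

# Forking is unavailable on Windows, so fall back to one core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
per_chrom <- mclapply(idx_by_chrom, count_bins, mc.cores = cores)

# unsplit() reverses split(), putting each count back at its region's position.
counts <- unsplit(per_chrom, chrom)
print(counts)
```

The resulting vector has one entry per input region, in input order, so it can be paired with the regions directly even when they are not sorted by chromosome.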

Best,

steffenheyne · 2018-06-19T11:20:31Z

Yeah, I see, thanks!
... I somehow remembered that epicseg is quite fast at counting thanks to multithreading, but it does exactly what you suggest and uses mclapply().

your-highness · 2021-09-24T06:43:44Z

Working on this currently... Will be in normR v1.19 bioc release

your-highness self-assigned this Jun 19, 2018

your-highness added the enhancement label Jun 19, 2018

your-highness added this to the 1.0 milestone Jun 19, 2018
