Fix diversity subsampling comments and messages
ahmig committed Jan 4, 2024
1 parent 8243da5 commit 3fe3731
Showing 3 changed files with 6 additions and 6 deletions.
2 changes: 1 addition & 1 deletion template.qmd
@@ -140,7 +140,7 @@ samples is squared in red.](`r params$tree_ml`){#fig-tree_ml}
 ### Nucleotide diversity comparison

 Nucleotide diversity (π) has been calculated for $`r div_values[["boot.reps"]]`$ random
-sample subsets of size $`r div_values[["sample.size"]]`$, extracted with replacement
+sample subsets of size $`r div_values[["sample.size"]]`$, extracted
 from the context dataset. The distribution of the nuclotide diversity is assumed to
 `r div_values[["norm.text"]]` be normal after performing a Shapiro-Wilk test
 (p-value of $`r div_values[["normal.pvalue"]]`$).
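The report sentence in this hunk depends on a Shapiro-Wilk p-value and a text fragment (`norm.text`). How the repository derives that fragment is not shown in this commit; the following is only a minimal sketch of one way it could be done. The 0.05 threshold, the placeholder data and the `"not"` wording are all assumptions.

```r
set.seed(42)

# Placeholder diversity values standing in for the subsample results (hypothetical data)
divs <- rnorm(200, mean = 1e-3, sd = 1e-4)

# Shapiro-Wilk normality test from base R's stats package
test <- shapiro.test(divs)

# Assumed significance threshold; the actual pipeline may use a different one
alpha <- 0.05

# Text fragment inserted before "be normal" in the report sentence (assumed wording)
norm.text <- if (test$p.value < alpha) "not" else ""

# Collapse the double space left behind when norm.text is empty
sentence <- gsub(
  "  +", " ",
  paste("The distribution is assumed to", norm.text, "be normal")
)
print(sentence)
```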
8 changes: 4 additions & 4 deletions workflow/scripts/download_context.R
@@ -94,7 +94,7 @@ dataframes <- lapply(
 # Join results
 metadata <- bind_rows(dataframes)

-log_info("Removeing overlapping sequences")
+log_info("Removing overlapping sequences")
 # Checkpoint: remove samples that overlap with target samples according to GISAID ID
 samples.accids <- sample.metadata %>%
 pull(snakemake@params[["samples_gisaid_accession_column"]])
@@ -103,10 +103,10 @@ metadata <- metadata %>% filter(!accession_id %in% samples.accids)
 print(glue("{nrow(metadata)} accession_ids remaining after GISAID ID filter"))

 # Checkpoint: enforce a minimum number of samples to have at least
-# as many possible combinations as bootstrap replicates.
+# as many possible combinations as random subsample replicates.
 # This is done by calculating the root of a function based on the
-# formula for calculating combinations with replacement
-# for n ≥ r ≥ 0: combinations with replacement = n! / (r! (n-r)!)
+# formula for calculating combinations for n ≥ r ≥ 0:
+# combinations = n! / (r! (n-r)!)
 r <- nrow(sample.metadata)
 min.comb <- snakemake@params[["min_theoretical_combinations"]]
 solution <- uniroot(
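The `uniroot(` call is cut off by the hunk boundary, so the function it solves is not visible here. As an illustration of the idea in the corrected comment, the sketch below finds the smallest context size n for which n! / (r! (n-r)!) reaches a required number of combinations. The values of `r` and `min.comb`, the use of `lchoose`, and the search interval are assumptions and not taken from the script.

```r
# Hypothetical inputs (in the script these come from sample.metadata and snakemake params)
r <- 10          # subsample size, i.e. number of target samples
min.comb <- 1e6  # required number of distinct subsamples

# log(choose(n, r)) - log(min.comb) is increasing in n for n >= r,
# so its root marks where the number of combinations reaches min.comb.
# Working on the log scale avoids overflow for large n.
f <- function(n) lchoose(n, r) - log(min.comb)

# Assumed search interval: from n = r (a single combination) up to one million
solution <- uniroot(f, lower = r, upper = 1e6)

# Round up to the next whole number of sequences
min.samples <- ceiling(solution$root)
print(min.samples)
```

With these illustrative values the result is 23, since choose(22, 10) = 646646 falls short of one million while choose(23, 10) = 1144066 exceeds it.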
2 changes: 1 addition & 1 deletion workflow/scripts/report/diversity_plot.R
@@ -60,7 +60,7 @@ diversity <- nuc.div(study_aln)


 # Perform bootstrap
-log_info("Performing bootstraped calculation for nucleotide diversity in oontext samples")
+log_info("Performing calculation for nucleotide diversity in context samples")
 plan(multisession, workers = snakemake@threads)
 divs <- boot.nd.parallel(gene_ex, length(study_aln), snakemake@params[["bootstrap_reps"]])
 plan(sequential)
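The body of `boot.nd.parallel` is not part of this commit. The sketch below only illustrates the procedure the corrected messages describe: repeated random subsamples drawn without replacement, with a diversity value computed for each. It uses base R and a toy alignment. `pi_simple` is a simplified stand-in for `nuc.div`, and every name and value in it is hypothetical.

```r
set.seed(1)

# Toy alignment: 30 hypothetical context sequences of 50 sites (rows = sequences)
context <- matrix(
  sample(c("a", "c", "g", "t"), 30 * 50, replace = TRUE),
  nrow = 30
)

# Mean proportion of differing sites over all sequence pairs
# (a simplified stand-in for a nucleotide diversity estimator)
pi_simple <- function(aln) {
  pairs <- combn(nrow(aln), 2)
  mean(apply(pairs, 2, function(p) mean(aln[p[1], ] != aln[p[2], ])))
}

# Draw `reps` subsamples of `size` sequences; each subsample is drawn
# WITHOUT replacement, so no sequence appears twice within a subsample
subsample_pi <- function(aln, size, reps) {
  replicate(reps, {
    idx <- sample(nrow(aln), size, replace = FALSE)
    pi_simple(aln[idx, , drop = FALSE])
  })
}

divs <- subsample_pi(context, size = 8, reps = 100)
summary(divs)
```

Because each subsample contains distinct sequences, the number of possible subsamples is a plain binomial coefficient. This is consistent with the "combinations" wording adopted in `download_context.R`.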