vignettes/rCNV.Rmd (+6 -6)
@@ -112,7 +112,7 @@ head(rels)
From the relatedness plots, we can see that relatedness among the Parrotfish samples is low. We recommend removing one sample from each pair with an outlier relatedness value above 0.9.
Figure 1.3. a. Relatedness among samples, b. Heterozygosity in Parrotfish populations
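
The filtering recommendation above can be sketched in base R as follows. This is only an illustration: the table `rels_demo` and its column names (`indv1`, `indv2`, `relatedness`) are hypothetical and may not match the structure of the `rels` object produced earlier in the vignette, and dropping the second member of each pair is an arbitrary choice.

```r
# Hypothetical relatedness table: one row per sample pair.
# Column names are illustrative, not necessarily those returned by rCNV.
rels_demo <- data.frame(
  indv1 = c("s1", "s1", "s2", "s3"),
  indv2 = c("s2", "s3", "s3", "s4"),
  relatedness = c(0.95, 0.10, 0.05, 0.20),
  stringsAsFactors = FALSE
)

# Pairs whose relatedness exceeds the 0.9 cutoff mentioned in the text
outliers <- rels_demo[rels_demo$relatedness > 0.9, ]

# Drop one member of each outlier pair (here, arbitrarily, the second one)
to_remove <- unique(outliers$indv2)
all_samples <- unique(c(rels_demo$indv1, rels_demo$indv2))
kept <- setdiff(all_samples, to_remove)
```

In practice one might instead drop the member of each pair with more missing data or lower depth; the cutoff and the choice of which sample to drop are left to the user.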
@@ -178,18 +178,18 @@ a. Allele ratios (across all samples), b. Proportions of homo-/hetero-zygotes pe
The alternative allele ratio is calculated across all samples from the depth values; the proportions of homozygotes and heterozygotes are likewise calculated across all samples per SNP; the depth ratio is calculated per individual per SNP by dividing the alternative allele depth by the total depth of both alleles, for both heterozygotes and homozygotes; the Z-score per SNP is calculated according to the following equation:

-[equation]
+[equation]

-[equation]
+[equation]

Where *N<sub>i</sub>* is the total depth for heterozygote *i* at SNP *x*, *N<sub>Ai</sub>* is the alternative allele read depth for heterozygote *i* at SNP *x*,
*p* is the probability of sampling allele *A* at SNP *x*; for unbiased sequencing, this is *0.5*. The `allele.info()` function calculates the Z-score both for *p* = *0.5* and for a biased probability estimated from the ratio between reference and alternative alleles.
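
As a rough illustration of the quantities defined above, the sketch below computes a binomial Z-score per heterozygote using the standard normal approximation, $z_i = (N_{Ai} - N_i p)/\sqrt{N_i p (1-p)}$. It is not the code used by `allele.info()`: the function name `het_z`, the example depths, the way per-sample scores are combined into a per-SNP score (sum divided by $\sqrt{n}$), and the pooled estimate of the biased *p* are all assumptions made for this example.

```r
# Binomial z-score for one heterozygote: standardized deviation of the
# alternative-allele depth (n_alt) from its expectation n_tot * p.
het_z <- function(n_alt, n_tot, p = 0.5) {
  (n_alt - n_tot * p) / sqrt(n_tot * p * (1 - p))
}

# Hypothetical read depths for three heterozygotes at one SNP
n_tot <- c(20, 40, 100)  # total depth N_i
n_alt <- c(10, 30, 50)   # alternative-allele depth N_Ai

# Per-heterozygote z-scores under unbiased sequencing (p = 0.5)
z_i <- het_z(n_alt, n_tot)

# ASSUMPTION: combine per-sample scores into a single per-SNP score
# by summing and dividing by sqrt(n)
z_x <- sum(z_i) / sqrt(length(z_i))

# ASSUMPTION: a "biased" p estimated from pooled depths at this SNP
p_biased <- sum(n_alt) / sum(n_tot)
z_i_biased <- het_z(n_alt, n_tot, p = p_biased)
```

A heterozygote with balanced depths (for example 10 of 20 reads) gets a score of zero under *p* = 0.5, while an excess of alternative reads gives a positive score.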
The Chi-squared values per SNP per sample are calculated using the following equation:

-[equation]
+[equation]

-[equation]
+[equation]

Where *N<sub>i</sub>* is the total depth for heterozygote *i* at SNP *x*, *N<sub>Ai</sub>* is the alternative allele read-depth for heterozygote *i* at SNP *x*, *p* is the probability of sampling allele *A* at SNP *x* in heterozygotes - in unbiased sequencing, this is *0.5*, *n* is the number of heterozygotes at SNP *x*.
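
For illustration, a two-category chi-squared goodness-of-fit statistic can be computed from the same quantities as sketched below. This is a textbook form written under stated assumptions, not necessarily the exact expression used in the package: the function name `het_chisq`, the example depths, and the summation over heterozygotes to obtain a per-SNP value are hypothetical choices.

```r
# Chi-squared goodness-of-fit for one heterozygote: observed alternative and
# reference depths compared with their expectations under probability p.
het_chisq <- function(n_alt, n_tot, p = 0.5) {
  exp_alt <- n_tot * p
  exp_ref <- n_tot * (1 - p)
  (n_alt - exp_alt)^2 / exp_alt + ((n_tot - n_alt) - exp_ref)^2 / exp_ref
}

# Hypothetical read depths for three heterozygotes at one SNP
n_tot <- c(20, 40, 100)  # total depth N_i
n_alt <- c(10, 30, 50)   # alternative-allele depth N_Ai

# Per-heterozygote statistics under p = 0.5
chi_i <- het_chisq(n_alt, n_tot)

# ASSUMPTION: per-SNP statistic taken as the sum over the n heterozygotes
chi_x <- sum(chi_i)

# A sum of n independent 1-df chi-squared values has n degrees of freedom
p_val <- pchisq(chi_x, df = length(chi_i), lower.tail = FALSE)
```

With only two categories, each per-heterozygote statistic equals the square of the corresponding binomial Z-score, so the two tests use the same per-sample information.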
@@ -273,7 +273,7 @@ table(CV$dup.stat)
We test the validity of detected duplicates in two ways: (1) direct detection through a sliding window and (2) the variant fixation index.
* The sliding window method assesses duplication along chromosomes/scaffolds using a sliding window of a given size (e.g., 10,000 bp). This step assumes that if a locus is truly located in a multi-copy region, nearby SNPs should also be classified as deviants, whereas deviation caused by sequencing error should be more scattered along the genome. Putative duplicates that do not follow this assumption are flagged as low-confidence and can be removed if desired. The function `dup.validate` in the rCNV package is dedicated to detecting regions enriched for deviant SNPs within a sliding window along a chromosome, scaffold, or a sequence of any given length.
-**Check the function example in the package for how to use it. Note that this method is still being tested**
+**Check the function example in the package for how to use it. Note that this method is still being tested**

* The variant fixation index ($V_{ST}$; Redon et al., 2006) is analogous to the population fixation index ($F_{ST}$) and is calculated as $V_{ST} = (V_T - V_S)/V_T$, where $V_T$ is the variance of normalized read depths among all individuals from the two populations and $V_S$ is the average of the within-population variances, weighted by population size. It is used to identify distinct CNV groups between populations (see Dennis et al., 2017; Weir & Cockerham, 1984).
The function `vst` can be used to calculate the $V_{ST}$ values per population and to plot genetic distance as a network with qgraph.
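
The two validation ideas above can be sketched in base R as follows. Neither function is part of rCNV or reproduces `dup.validate` or `vst`; the names `neighbour_deviants` and `vst_sketch`, the example data, and the rule "flag a deviant with no deviant neighbours" are hypothetical. The $V_{ST}$ sketch uses the form $(V_T - V_S)/V_T$ given by Redon et al. (2006).

```r
# Sliding-window idea: for each SNP, count the OTHER deviant SNPs that lie
# within a window of size `win` centred on it.
neighbour_deviants <- function(pos, deviant, win = 10000) {
  sapply(seq_along(pos), function(i) {
    near <- abs(pos - pos[i]) <= win / 2
    near[i] <- FALSE               # exclude the focal SNP itself
    sum(deviant[near])
  })
}

# Hypothetical SNP positions (bp) on one scaffold and their deviant status
pos <- c(100, 2000, 4000, 50000)
deviant <- c(TRUE, TRUE, FALSE, TRUE)
n_near <- neighbour_deviants(pos, deviant)

# ASSUMPTION: a deviant SNP with no deviant neighbours is low-confidence
low_conf <- deviant & n_near == 0

# V_ST idea: share of the total depth variance not explained by the
# size-weighted average of within-population variances.
vst_sketch <- function(depth, pop) {
  v_t <- var(depth)                          # variance across all individuals
  pop_var <- tapply(depth, pop, var)         # variance within each population
  pop_n <- tapply(depth, pop, length)        # population sizes
  v_s <- sum(pop_var * pop_n) / sum(pop_n)   # size-weighted average
  (v_t - v_s) / v_t
}

# Hypothetical normalized depths for two populations of three individuals
depth <- c(1.0, 1.1, 0.9, 2.0, 2.1, 1.9)
pop <- c("A", "A", "A", "B", "B", "B")
vst_val <- vst_sketch(depth, pop)
```

In this toy data the isolated deviant at 50,000 bp is the only one flagged as low-confidence, and the two populations have clearly different depths, so `vst_val` is close to 1.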