v0.6.0 - Sublineages and Immunity
Notes
This is a major release that includes the following changes:
- Detection of all recombinants in Nextclade dataset 2022-10-27: `XA` to `XBE`.
- Implementation of recombinant sublineages (ex. `XBB.1`).
- Implementation of immune-related statistics (`rbd_level`, `immune_escape`, `ace2_binding`) from `nextclade`, the `Nextstrain` team, and Jesse Bloom's group:
  - https://github.com/nextstrain/ncov/blob/master/defaults/rbd_levels.yaml
  - https://jbloomlab.github.io/SARS-CoV-2-RBD_DMS_Omicron/epistatic-shifts/
  - https://jbloomlab.github.io/SARS2_RBD_Ab_escape_maps/escape-calc/
  - https://doi.org/10.1093/ve/veac021
  - https://doi.org/10.1101/2022.09.15.507787
  - https://doi.org/10.1101/2022.09.20.508745
Dataset
- Issue #168: Support for NULL collection dates and NULL country is implemented.
- `controls` was updated to include 1 strain from `XBB` for a total of 22 positive controls. The 28 negative controls were unchanged from `v0.5.1`.
- `controls-gisaid` strain list was updated to include `XA` through `XBE` for a total of 528 positive controls. This includes sublineages such as `XBB.1` and `XBB.1.2`, which synchronizes with Nextclade Dataset 2022-10-19. The 187 negative controls were unchanged from `v0.5.1`.
Nextclade
- Issue #176: Upgrade Nextclade dataset to tag `2022-10-27` and upgrade Nextclade to `v2.8.0`.
- Issue #193: Use the nextclade dataset `sars-cov-2-21L` to calculate `immune_escape` and `ace2_binding`.
RBD Levels
- Issue #193: Create new rule `rbd_levels` to calculate the number of key receptor binding domain (RBD) mutations.
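As a rough illustration of what counting key RBD mutations involves, the sketch below tallies spike substitutions that fall on a set of key positions. The position set, the function name `count_rbd_level`, and the `S:R346T`-style input format are assumptions made for this example. The real definitions live in the Nextstrain `rbd_levels.yaml` linked in the Notes and are not reproduced here.

```python
import re

# Hypothetical key positions, for illustration only.
# The real list is defined in Nextstrain's rbd_levels.yaml.
KEY_RBD_POSITIONS = {346, 452, 486}


def count_rbd_level(aa_substitutions, key_positions=KEY_RBD_POSITIONS):
    """Count distinct key RBD positions hit by spike (S) substitutions."""
    hits = set()
    for sub in aa_substitutions:
        # Expect strings such as "S:R346T" (gene:ref, position, alt).
        match = re.fullmatch(r"S:[A-Z](\d+)[A-Z*]", sub)
        if match and int(match.group(1)) in key_positions:
            hits.add(int(match.group(1)))
    return len(hits)


example = ["S:R346T", "S:L452R", "S:D614G", "N:P13L"]
print(count_rbd_level(example))  # 2
```

Only the two spike substitutions at positions listed in `KEY_RBD_POSITIONS` are counted. The substitution at position 614 and the non-spike substitution are ignored.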
Lineage Tree
- Issue #185: Use nextclade dataset Auspice tree for lineage hierarchy. Previously, the phylogeny of lineages was constructed from the cov-lineages website YAML. Instead, we now use the tree provided with nextclade datasets, to better synchronize the lineage model with the output.
- Rather than creating the output tree in `resources/lineages.nwk`, the lineage tree is now written to `data/sars-cov-2_<DATE>/tree.nwk`. This is because different builds might use different nextclade datasets, so the tree is a dataset-specific output.
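To give a sense of how an Auspice tree can be turned into a Newick file, here is a minimal sketch. It assumes the Auspice JSON layout in which a top-level `tree` entry holds nested nodes with `name` and `children` keys. The function name `to_newick` and the toy node names are made up for the example, and how the pipeline derives lineage labels from the real tree is not shown.

```python
def to_newick(node):
    """Recursively serialize an Auspice-style node into a Newick substring."""
    children = node.get("children", [])
    if not children:
        return node["name"]
    inner = ",".join(to_newick(child) for child in children)
    return f"({inner}){node['name']}"


# Toy stand-in for the "tree" entry of an Auspice JSON (names are illustrative).
auspice = {
    "tree": {
        "name": "root",
        "children": [
            {"name": "BA.2", "children": [{"name": "BA.2.75"}, {"name": "BA.2.10"}]},
            {"name": "BA.5"},
        ],
    }
}

newick = to_newick(auspice["tree"]) + ";"
print(newick)  # ((BA.2.75,BA.2.10)BA.2,BA.5)root;
```

A recursive walk keeps the sketch short. For a very deep tree, an iterative traversal would avoid Python's recursion limit.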
sc2rf
- Issue #179: Fix bug where `sc2rf/recombinants.ansi.txt` is truncated.
- Issue #180: Fix recombinant sublineages (ex. XAY.1) missing their derived mutations in the `cov-spectrum_query`. Previously, the `cov-spectrum_query` mutations were only based on the parental alleles (before recombination). This led to sublineages (ex. `XAY.1`, `XAY.2`) all having the exact same query. Now, the `cov-spectrum_query` will include all substitutions shared between all sequences in the `cluster_id`.
- Issue #187: Document bug that occurs if duplicate sequences are present, and the initial validation was skipped by not running `scripts/create_profile.sh`.
- Issue #191 and Issue #192: Reduce false positives by ensuring that each mode of sc2rf has at least one additional parental population that serves as the alternative hypothesis.
- Issue #195: Implement a filter on the ratio of intermissions to alleles. Sequences will be marked as false positives if the number of intermissions (i.e. alleles that conflict with the identified parental region) is greater than or equal to the number of alleles contributed by the minor parent. This ratio indicates that there is more evidence that conflicts with recombination than there is allele evidence that supports a recombinant origin.
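The intermission filter from Issue #195 can be sketched as a single comparison. This is an illustration of the rule as described above, not the sc2rf code itself: the function name, the input dictionary, and the choice to treat the parent with the fewest alleles as the minor parent are assumptions.

```python
def is_false_positive(parent_allele_counts, num_intermissions):
    """Flag a sequence when intermissions match or exceed minor-parent alleles.

    parent_allele_counts: hypothetical mapping of parent name -> number of
        alleles that parent contributes to the sequence.
    num_intermissions: number of alleles that conflict with the identified
        parental region.
    """
    # Assumption: the minor parent is the one contributing the fewest alleles.
    minor_parent_alleles = min(parent_allele_counts.values())
    return num_intermissions >= minor_parent_alleles


# 3 conflicting alleles vs. 3 supporting alleles from the minor parent.
print(is_false_positive({"BA.2": 40, "BA.5": 3}, 3))  # True
# 1 conflicting allele vs. 6 supporting alleles from the minor parent.
print(is_false_positive({"BA.2": 40, "BA.5": 6}, 1))  # False
```

The comparison uses "greater than or equal to", so a tie between conflicting and supporting evidence is also treated as a false positive.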
Linelist
- Issue #183: Recombinant sublineages. When nextclade calls a lineage (ex. `XAY.1`) which is a sublineage of a sc2rf lineage (`XAY`), we prioritize the nextclade assignment.
- Issue #193: Add immune-related statistics: `rbd_levels`, `rbd_substitutions`, `immune_escape`, and `ace2_binding`.
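The sublineage priority rule from Issue #183 can be illustrated with a small sketch. The function names are hypothetical, and the name-prefix test is a simplification: it ignores Pango aliases, which a lookup against the lineage tree would handle. Falling back to the sc2rf lineage in all other cases is an assumption for the example, since only the sublineage case is described above.

```python
def is_sublineage(child, parent):
    """Return True if `child` is a dotted descendant of `parent` by name."""
    return child.startswith(parent + ".")


def resolve_lineage(nextclade_lineage, sc2rf_lineage):
    """Prefer the nextclade call when it refines the sc2rf lineage."""
    if is_sublineage(nextclade_lineage, sc2rf_lineage):
        return nextclade_lineage
    # Assumption: otherwise keep the sc2rf lineage.
    return sc2rf_lineage


print(resolve_lineage("XAY.1", "XAY"))  # XAY.1
print(resolve_lineage("XBB.1", "XAY"))  # XAY
```

Appending the dot before the prefix check keeps a name such as `XAYZ` from being mistaken for a sublineage of `XAY`.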
Plot
- Issue #57: Include substitutions within breakpoint intervals for breakpoint plots. This is a product of Issue #180 which provides access to all substitutions.
- Issue #112: Fix bug where breakpoints plot image was out of bounds.
- Issue #188: Remove the breakpoints distribution axis (ex. `breakpoints_clade.png`) in favor of putting the legend at the top. This significantly reduces plotting issues (ex. Issue #112).
- Issue #193: Create new plot `rbd_level`.
Validate
Designated Lineages
- Issue #85: `XAY`, updated controls
- Issue #178: `XAY.1`
- Issue #172: `XBB.1`
- Issue #175: `XBB.1.1`
- Issue #184: `XBB.1.2`
- Issue #173: `XBB.2`
- Issue #174: `XBB.3`
- Issue #181: `XBC.1`
- Issue #182: `XBC.2`
- Issue #171: `XBD`
- Issue #177: `XBE`
Proposed Lineages
- Issue #198: `proposed1229`
- Issue #199: `proposed1268`
- Issue #197: `proposed1296`
Commits
- `2506e907` docs: update changelog and add v0.6.0 testing summary package
- `0cc421e0` docs: update all contributors
- `cd9b6cbb` resources: update issues
- `0fa2e3c1` docs: update readme
- `375c3a76` resources: add proposed lineages for #197 #198 #199
- `dad989e7` param: remove BQ.1 from sc2rf mode VOC as its too close to BA.5.3
- `d7cb005f` docs: update issue template lineage-validation
- `1beac97e` resources: add XBF to curated breakpoints for #196
- `fae7bfdb` script: sc2rf implement intermission allele ratio for #195
- `89a41265` script: additional manual curation of lineage_tree
- `ebd3ce1f` resources: update validation strains for controls-gisaid
- `d8bff572` script: add RBD Level slide to report
- `c1879c1d` script: catch errors in rbd_level plotting with no recombinants
- `63545a08` script: fix bug in linelist with cluster_privates
- `c24a7179` resources: update issues
- `d32d557f` docs: update development notes
- `7f825a41` script: manual fix for CK in lineage_tree
- `fdd6f66d` workflow: implement rbd levels for #193
- `0058dd6e` param: upgrade nextclade dataset to 2022-10-27 and reduce breakpoints of XA mode
- `fb062c32` env: upgrade nextclade to v2.8.0
- See CHANGELOG.md for additional commits.