Merge dev to master #194

Merged: 171 commits, Dec 16, 2024
Showing changes from 1 commit

Commits
c37cd21
make JSON file required in vcf2gvf.py
miseminger Apr 29, 2024
9a4f86d
add new attribute, 'transcript_id', using locus_tag from JSON
miseminger Apr 29, 2024
407381c
add transcript_id in parentheses for HGVS nt names
miseminger Apr 29, 2024
ecd19c4
get rid of UserWarning for match groups
miseminger Apr 29, 2024
1a57d85
remove unused imports
miseminger Apr 29, 2024
047f085
include alias_protein_id in HGVS alias names, and make a new attribut…
miseminger Apr 29, 2024
e18dfa9
update nt_delins_regex
miseminger Apr 29, 2024
82f2959
add new attribute, 'gene_symbol', from 'gene' in the JSON file
miseminger Apr 30, 2024
418c65d
add 'gene_symbol' and 'protein_symbol' columns to mutation index
miseminger Apr 30, 2024
efcce0c
rename 'gene' column in mutation index to 'gene_name'
miseminger Apr 30, 2024
0b6f5e5
change 'protein symbol' column name to 'gene name' to match GVF and m…
miseminger Apr 30, 2024
c93085c
change 'protein symbol' column in df made from Pokay repo to 'gene na…
miseminger Apr 30, 2024
257d230
change 'gene' attribute to 'gene_name' in VCF
miseminger Jul 30, 2024
f998ed3
change 'gene' to 'gene_name' and add 'gene_symbol'
miseminger Jul 30, 2024
d3174fe
make all args required for log and index creation
miseminger Jul 30, 2024
0ff0245
update column names to match DH template, make mutation index optional
miseminger Jul 30, 2024
ed67400
fix formatting for non-index file
miseminger Jul 30, 2024
28ad84d
merge dfs on 'protein symbol' instead
miseminger Jul 30, 2024
50689b0
sort by nucleotide position
miseminger Jul 30, 2024
baf3cb0
update for new functional annotation format
miseminger Jul 30, 2024
ec976f8
update to match new functional annotation format
miseminger Jul 30, 2024
bfd06d3
rename 'alias_protein' to 'mat_pep'
miseminger Jul 30, 2024
51b3f94
add Pokay names as 'pokay_id'
miseminger Aug 8, 2024
2d72b02
add missing 'pokay_id' for ORF8
miseminger Aug 8, 2024
da12255
Add script
miseminger Sep 15, 2024
8551053
Add SARS-CoV-2 ontology terms JSONs
miseminger Sep 15, 2024
21d5b6a
Delete unwanted files
miseminger Sep 15, 2024
e6592d2
upload script
miseminger Sep 16, 2024
54bb4ac
update comment
miseminger Sep 16, 2024
8955e30
Add ontology terms to JSON, and remove pokay_id
miseminger Sep 17, 2024
4c66d82
add aliases from Sept 9 issue
miseminger Sep 17, 2024
b1df7ff
remove alias names that are the same as the gene name
miseminger Sep 17, 2024
9303eb8
Temporarily add Pokay name for PLpro to aliases
miseminger Sep 18, 2024
d13e2cc
Add RdRp back in to alias list
miseminger Sep 18, 2024
220dc08
Add protein_alias lists manually
miseminger Sep 18, 2024
c55d674
Add ontology terms to functional annotation file
miseminger Sep 18, 2024
5f8cc07
Update column names to match template
miseminger Sep 18, 2024
fc78ab0
Don't merge on mat_pep anymore
miseminger Sep 18, 2024
e0896c8
add BioRegistry prefix for doi
miseminger Sep 18, 2024
668c0dc
Add MPOX ROBOT table
miseminger Sep 19, 2024
bd967d5
update for MPOX
miseminger Sep 27, 2024
fba32a9
Add ontology term JSONs for MPOX
miseminger Sep 27, 2024
a4f40bb
Add strand orientation for MPOX
miseminger Oct 1, 2024
5166d9e
add gene and strand orientation for MPOX
miseminger Oct 1, 2024
9a53e60
make code useful for SC2 or MPOX
miseminger Oct 1, 2024
2c62b20
Update SARS-CoV-2 JSONs with gene and strand orientations:
miseminger Oct 1, 2024
3043b5a
Add gene and strand orientation
miseminger Oct 1, 2024
363c410
Add new functional annotation file
miseminger Oct 1, 2024
9490474
add gene and strand orientation for MPOX, and change unknown publicat…
miseminger Oct 4, 2024
7671210
add MPOX functional annotations in DH template format
miseminger Oct 4, 2024
94a67bb
remove mutation index rows that don't have a nucleotide mutation
miseminger Oct 22, 2024
02ad2e1
use pd.explode() in unnest_multi()
miseminger Oct 22, 2024
48d068f
workaround unmatched list lengths to work with pd.explode()
miseminger Oct 22, 2024
dc7932d
Adapt vcf2gvf to new JSON keys
miseminger Oct 23, 2024
2734f7c
align attribute keys with DH template
miseminger Oct 24, 2024
8f415b5
align attribute keys with DH template
miseminger Oct 24, 2024
3bbe34d
adapt to new JSON keys
miseminger Oct 24, 2024
8b6307c
archive copy of gvf2tsv.py
miseminger Oct 24, 2024
58df48a
Produce 1 TSV from GVF, ignoring clades
miseminger Oct 24, 2024
85a6160
Update to match new GVF keys
miseminger Oct 24, 2024
ac59ae3
add gene_orientation and strand_orientation to gvf
miseminger Oct 24, 2024
49bbdba
change VP37 to be alias of OPG057 protein, and add to JSON
miseminger Oct 24, 2024
55186f2
remove VP37 mentions from MPXVgp025
miseminger Oct 24, 2024
977fc0f
add protein_alias key to all CDS entries
miseminger Oct 24, 2024
179844e
add indent to fix bug
miseminger Oct 24, 2024
496ca37
replace 'pokay' with 'template' for generalizability
miseminger Oct 24, 2024
a214a34
Rename functional annotation script
miseminger Oct 24, 2024
3ed9712
take 'doi:' off saved dois
miseminger Oct 24, 2024
673ad82
add definitely the latest version
miseminger Oct 25, 2024
d689609
match new DH format
miseminger Oct 25, 2024
8e68d0c
Delete assets/virus_functionalAnnotation/NC_045512.2/Pokay_functional…
miseminger Oct 25, 2024
41f8cc5
Delete assets/virus_functionalAnnotation/NC_045512.2/Pokay_functional…
miseminger Oct 25, 2024
4be0584
add 'clade' argument and attribute
miseminger Oct 25, 2024
33d97c7
Merge branch 'madeline-1' of github.com:cidgoh/nf-ncov-voc into madel…
miseminger Oct 25, 2024
422f97b
add 'functional_annotation_resource' attribute
miseminger Oct 25, 2024
b72ebcc
add functional annotation columns as GVF attributes
miseminger Oct 25, 2024
ce13700
change 'measured_variant_functional_effect_description' attribute to …
miseminger Oct 25, 2024
626dae7
adapt to new DH template format
miseminger Oct 25, 2024
377ad62
update column names
miseminger Oct 25, 2024
3c5bc08
Merge pull request #184 from cidgoh/madeline-1
anwarMZ Oct 25, 2024
9ed5809
catch MPOX genes and proteins
miseminger Oct 28, 2024
2803cc7
catch MPOX URLs
miseminger Oct 28, 2024
86f9d3c
upload MPOX functional annotation file
miseminger Oct 28, 2024
ef4c60c
add 'pmid:' prefix
miseminger Oct 28, 2024
aef4ab3
add 'pmid:' prefix
miseminger Oct 28, 2024
866bc56
Delete assets/virus_functionalAnnotation/NC_063383.1/pokay_annotation…
miseminger Oct 28, 2024
d7eecc3
Delete assets/virus_functionalAnnotation/NC_063383.1/pokay_annotation…
miseminger Oct 28, 2024
cfe4318
fixed typos
anwarMZ Oct 29, 2024
41a685e
solve duplicated rows bug
miseminger Nov 8, 2024
74d268e
remove troubleshooting print statement
miseminger Nov 8, 2024
54741fc
remove duplicate rows
miseminger Nov 8, 2024
254fe3b
Merge pull request #187 from cidgoh/madeline-2
anwarMZ Nov 11, 2024
f56e22d
Merge pull request #186 from cidgoh/madeline
anwarMZ Nov 12, 2024
194133e
solve duplication issue
miseminger Nov 13, 2024
76f7ba0
solve extra quotes issue
miseminger Nov 13, 2024
1c2cd51
add updated functional annotation file
miseminger Nov 13, 2024
b0462cd
Delete assets/virus_functionalAnnotation/NC_045512.2/Pokay_functional…
miseminger Nov 13, 2024
df038ca
ensure PMIDs are integers
miseminger Nov 13, 2024
1931c79
add PMIDs to functional annotation file
miseminger Nov 13, 2024
014d4a9
Delete assets/virus_functionalAnnotation/NC_045512.2/Pokay_functional…
miseminger Nov 13, 2024
4d4b8b3
Merge pull request #188 from cidgoh/madeline-1
anwarMZ Nov 13, 2024
99f27fc
update Mpox workflow
anwarMZ Nov 15, 2024
e03e34a
updated configuration workflow
anwarMZ Nov 15, 2024
d01bd08
updated configurations
anwarMZ Nov 15, 2024
876879a
updated README
anwarMZ Nov 15, 2024
25013e0
updated SARS-CoV-2 workflow
anwarMZ Nov 15, 2024
ca2de6a
updated main workflow file
anwarMZ Nov 15, 2024
a9a6764
updated ViralAi module
anwarMZ Nov 15, 2024
fed4883
updated annotation sub-workflow
anwarMZ Nov 15, 2024
0637c47
updated ignored files
anwarMZ Nov 15, 2024
6ff1b0c
added vscode plugins
anwarMZ Nov 15, 2024
a0e9e13
updated nextflow formatting
anwarMZ Nov 16, 2024
8de8f18
updated main.nf structure
anwarMZ Nov 16, 2024
2c0101a
updated main.nf structure
anwarMZ Nov 16, 2024
c3b8080
updated main.nf structure
anwarMZ Nov 16, 2024
08a977b
updated main.nf structure
anwarMZ Nov 16, 2024
58fb48c
fix missing file
miseminger Nov 26, 2024
1005647
try fixing 'alias' missing key
miseminger Nov 26, 2024
4dffc95
updated config
anwarMZ Nov 26, 2024
bf55044
updated subworkflows
anwarMZ Nov 26, 2024
fe51186
updated subworkflows
anwarMZ Nov 26, 2024
90995b7
updated modules
anwarMZ Nov 26, 2024
3846836
updated main file
anwarMZ Nov 26, 2024
7fe746c
updated workflows file
anwarMZ Nov 26, 2024
c7195b9
updated nextflow config
anwarMZ Nov 26, 2024
b80444d
vscode nextflow extension
anwarMZ Nov 26, 2024
2c38040
Merge pull request #189 from cidgoh/mri
anwarMZ Nov 26, 2024
847d30b
fix 'Pl_pro' alias error
miseminger Nov 28, 2024
1bdbb2c
Merge pull request #190 from cidgoh/mri_27
anwarMZ Nov 28, 2024
c2b571d
Make all index column names lowercase
miseminger Nov 29, 2024
ad48161
comment out index row removal before merge
miseminger Nov 29, 2024
4db8d73
capture mat_pep_acc from VCF
miseminger Nov 29, 2024
d9885a1
Merge pull request #191 from cidgoh/madeline
anwarMZ Nov 29, 2024
beac4f3
fix all relevant unequal length columns by adding nans
miseminger Nov 29, 2024
cb9d89d
Merge pull request #192 from cidgoh/madeline
anwarMZ Nov 29, 2024
a470b6a
update scripts
anwarMZ Dec 9, 2024
0df5172
updated conf
anwarMZ Dec 9, 2024
0315cf9
updated sub-workflows
anwarMZ Dec 9, 2024
2828b74
updated sars-cov-2 workflow
anwarMZ Dec 9, 2024
5e9179a
updated preset param files
anwarMZ Dec 9, 2024
aa97bfe
updated tsv2pdf module
anwarMZ Dec 9, 2024
7187817
version bump for pokay sars-cov-2
anwarMZ Dec 9, 2024
3739a4a
updated output files to match gvf naming
anwarMZ Dec 9, 2024
baeb989
update sars-cov-2 test data
anwarMZ Dec 9, 2024
c6b41b7
update sars-cov-2 test data
anwarMZ Dec 10, 2024
e1ecc63
updated covidmvp workflow
anwarMZ Dec 13, 2024
1f8f4e8
updated subworkflows
anwarMZ Dec 13, 2024
15b17c9
updated and formatted modules
anwarMZ Dec 13, 2024
4ccf0aa
updated merge_indices
anwarMZ Dec 13, 2024
d9edd75
update dtypes to match index columns
miseminger Dec 13, 2024
058630d
update index column names
miseminger Dec 13, 2024
fff4ad8
Merge pull request #193 from cidgoh/madeline3
anwarMZ Dec 13, 2024
40caf5e
archived script
anwarMZ Dec 13, 2024
8679c24
updated config
anwarMZ Dec 15, 2024
097770c
fixed covidmvp user uploaded error
anwarMZ Dec 15, 2024
acc7b7f
updated user uploaded test
anwarMZ Dec 15, 2024
279200f
updated preprocessing
anwarMZ Dec 15, 2024
bb23c62
updated github actions testing
anwarMZ Dec 16, 2024
62ef05d
fixed actions typo
anwarMZ Dec 16, 2024
de0348f
updated relative path to seq file
anwarMZ Dec 16, 2024
23a56e7
updated workflow params and conf
anwarMZ Dec 16, 2024
a540a57
updated test data for reference
anwarMZ Dec 16, 2024
2adca4c
updated test data
anwarMZ Dec 16, 2024
4f8583a
update module docker containers
anwarMZ Dec 16, 2024
fc4a5e9
update containers
anwarMZ Dec 16, 2024
dce1f37
update workflow
anwarMZ Dec 16, 2024
bdc1cd9
update workflow singularity
anwarMZ Dec 16, 2024
770e250
update workflow
anwarMZ Dec 16, 2024
f829c10
update containers
anwarMZ Dec 16, 2024
53eb35a
update containers
anwarMZ Dec 16, 2024
f6dcdf8
updated for master
anwarMZ Dec 16, 2024
Delete unwanted files
miseminger committed Sep 15, 2024

commit 21d5b6ad741821150f76d233460cbd1ff67c76d3
156 changes: 0 additions & 156 deletions assets/virus_ontologyTerms/NC_045512.2/NC_045512.2_gene_terms.json

This file was deleted.

170 changes: 0 additions & 170 deletions assets/virus_ontologyTerms/NC_045512.2/NC_045512.2_protein_terms.json

This file was deleted.