Add annotations for KEGG pathways and genes #107

Open
NantiaL opened this issue May 3, 2022 · 4 comments
Labels
2.2 feature Issues that aim to introduce new feature in ModelPolisher.

Comments

@NantiaL

NantiaL commented May 3, 2022

It would be very useful to have annotations for KEGG subsystems and genes assigned by ModelPolisher. This would save a lot of time and work during model curation.

To obtain gene annotations, one of the following methods could be used:

  • the GenBank file combined with old/new locus tags
  • the NCBI API
  • the KEGG API and the conv operation
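For the third option, a minimal sketch of what a lookup through the KEGG REST `conv` operation could look like. It assumes the endpoint form `https://rest.kegg.jp/conv/<target_db>/<source_db>` and a tab-separated response with one `source<TAB>target` pair per line. The function names `conv_url`, `parse_conv`, and `fetch_conv` are hypothetical and not part of ModelPolisher.

```python
from urllib.parse import quote
from urllib.request import urlopen

KEGG_REST = "https://rest.kegg.jp"


def conv_url(target_db: str, source_db: str) -> str:
    """Build the URL for a KEGG `conv` request (assumed endpoint layout)."""
    return f"{KEGG_REST}/conv/{quote(target_db)}/{quote(source_db)}"


def parse_conv(text: str) -> dict:
    """Map each source ID (database prefix stripped) to a list of target IDs.

    Lines that do not consist of exactly two tab-separated columns are skipped.
    """
    mapping = {}
    for line in text.splitlines():
        columns = line.split("\t")
        if len(columns) != 2:
            continue
        source, target = columns
        # "ncbi-geneid:123" -> "123"; keep the whole string if there is no prefix
        source_id = source.partition(":")[2] or source
        mapping.setdefault(source_id, []).append(target)
    return mapping


def fetch_conv(target_db: str, source_db: str) -> dict:
    """Download and parse one conversion table (requires network access)."""
    with urlopen(conv_url(target_db, source_db)) as response:
        return parse_conv(response.read().decode("utf-8"))
```

Calling `fetch_conv("<organism code>", "ncbi-geneid")` would then give a lookup from NCBI Gene IDs to KEGG gene identifiers for that organism, which could be matched against the gene IDs already present in the model.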
@NantiaL NantiaL changed the title Add KEGG pathways and gene annotations Add annotations for KEGG pathways and genes May 3, 2022
@matthiaskoenig
Collaborator

Just to comment on this. KEGG is basically dead for many researchers since it moved behind a non-open license. This makes it basically impossible to work with KEGG data and annotations. Many researchers have dropped KEGG in recent years (as did I). I would not recommend putting any effort into supporting KEGG; instead, use open alternatives such as Reactome.

@Schmoho Schmoho added enhancement feature Issues that aim to introduce new feature in ModelPolisher. and removed enhancement labels May 3, 2022
@Schmoho
Collaborator

Schmoho commented Jul 19, 2024

I am not quite sure I understand this correctly. @NantiaL could you maybe provide an example of what you are asking for?

@NantiaL
Author

NantiaL commented Jul 23, 2024

By this I mean a simple annotation of genes within the model. Given a RefSeq annotation file with multiple entries such as gene, CDS, etc.:

NC_XXX.1	RefSeq	gene	12508	13482	.	+	.	ID=gene9;Dbxref=GeneID:4917798;Name=XXX;gbkey=Gene;gene_biotype=protein_coding;locus_tag=XXX
NC_XXX.1	RefSeq	CDS	12508	13482	.	+	0	ID=cds9;Parent=gene9;Dbxref=Genbank:YP_XXX.1,GeneID:XXX;Name=YP_XXX.1;gbkey=CDS;product=XXX;protein_id=YP_XXX.1;transl_table=11

extract information (e.g., locus tag, gene ID, and name) and add it as annotations (CV terms) to the model. Similar information can also be extracted from the GenBank annotation file.
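A minimal sketch of the extraction step, assuming tab-separated GFF3 lines with nine columns as in the example above. The function names are hypothetical, and the `https://identifiers.org/ncbigene:` URI prefix is an assumption about how the gene ID would be written as a CV term resource. Attaching the result to the model's gene products is left out.

```python
from typing import Optional


def parse_attributes(column: str) -> dict:
    """Split the ninth GFF3 column ("key=value;key=value") into a dict."""
    attributes = {}
    for field in column.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attributes[key] = value
    return attributes


def gene_annotations(gff_line: str) -> Optional[dict]:
    """Return locus tag, gene ID, and name for a `gene` row, else None."""
    columns = gff_line.rstrip("\n").split("\t")
    # Skip comments, malformed rows, and non-gene features such as CDS
    if gff_line.startswith("#") or len(columns) != 9 or columns[2] != "gene":
        return None
    attributes = parse_attributes(columns[8])
    # Dbxref holds comma-separated "Database:identifier" pairs
    dbxrefs = dict(
        ref.split(":", 1)
        for ref in attributes.get("Dbxref", "").split(",")
        if ":" in ref
    )
    return {
        "locus_tag": attributes.get("locus_tag"),
        "gene_id": dbxrefs.get("GeneID"),
        "name": attributes.get("Name"),
    }


def ncbigene_uri(gene_id: str) -> str:
    """Build a resource URI for a CV term (assumed identifiers.org prefix)."""
    return f"https://identifiers.org/ncbigene:{gene_id}"
```

Matching the extracted entries to the gene products in the model would most likely go through the locus tag, since that is what the model's gene identifiers are usually derived from.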

This would require introducing a new input parameter, meaning the user would need to provide the .gff file when running ModelPolisher.

@Schmoho Schmoho added the 2.1 Issue in the 2.1 branch label Jul 23, 2024
@GwennyGit

GwennyGit commented Jul 25, 2024

We already implemented something similar in refineGEMs. 🤔
See function cv_ncbiprotein

The section "Functions to add additional URIs to GeneProducts" in the same module also contains functions to add more annotations to the GeneProducts.

@Schmoho Schmoho added 2.2 and removed 2.1 Issue in the 2.1 branch labels Sep 17, 2024
4 participants