Conversion of one type of gene ID (such as SYMBOL or ENTREZ gene ID) into other types of IDs (such as ENSEMBL or UNIPROT)
Genomic and gene expression data are integral to biomolecular data analysis. The types of identifiers used for genes differ across the resources that provide such data sets. Integrating data from two or more resources requires a single, common type of gene identifier, and this gene ID conversion tool provides that mapping. A tutorial gives an overview of the tool and the steps for using it.
This tool is available through the web at Gene ID Conversion and also as a REST API (SmartAPI for Gene ID Conversion). The SmartAPI page provides an explanation of the various parameters.
Example URLs for JSON output from the command line (e.g., using curl -L 'URL'); replace /View/json with /View/txt for text output:
https://bdcw.org/geneid/rest/species/hsa/GeneIDType/SYMBOL_OR_ALIAS/GeneListStr/AIM1/View/json
https://bdcw.org/geneid/rest/species/hsa/GeneIDType/SYMBOL_OR_ALIAS/GeneListStr/IFNB2/View/json
https://bdcw.org/geneid/rest/species/hsa/GeneIDType/REFSEQ/GeneListStr/NM_001318095/View/json
https://bdcw.org/geneid/rest/species/hsa/GeneIDType/ENSEMBL/GeneListStr/ENSG00000136244/View/json
Please use __ (double underscore) or a comma (,) to separate multiple genes, as in the string ITPR3__IL6__KLF4 or 3710,10365,3592,5743. For SYMBOL-like IDs, the user may specify SYMBOL_OR_ALIAS for GeneIDType, so that each term is searched first in SYMBOL and, if not found there, in ALIAS.
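The URL pattern and separator rule described here can be sketched in Python as follows. This is a minimal illustration, not part of the tool: the function name build_geneid_url and its parameter names are hypothetical, and the path layout is inferred only from the example URLs listed on this page.

```python
from urllib.parse import quote

# Base path taken from the example URLs on this page.
BASE_URL = "https://bdcw.org/geneid/rest"


def build_geneid_url(genes, gene_id_type="SYMBOL_OR_ALIAS",
                     species="hsa", view="json", sep="__"):
    """Build a query URL for one or more gene IDs (hypothetical helper).

    genes        -- iterable of gene IDs (strings or integers)
    gene_id_type -- value for the GeneIDType path segment
    species      -- species code, e.g. "hsa" as in the examples
    view         -- "json" or "txt"
    sep          -- "__" or ",", the two separators the page describes
    """
    if sep not in ("__", ","):
        raise ValueError("separator must be '__' or ','")
    if view not in ("json", "txt"):
        raise ValueError("view must be 'json' or 'txt'")
    # Percent-encode each ID individually so the separator stays literal.
    gene_list = sep.join(quote(str(g), safe="") for g in genes)
    if not gene_list:
        raise ValueError("at least one gene ID is required")
    return (f"{BASE_URL}/species/{species}/GeneIDType/{gene_id_type}"
            f"/GeneListStr/{gene_list}/View/{view}")
```

For example, build_geneid_url(["ITPR3", "IL6", "KLF4"]) produces a URL whose GeneListStr segment is ITPR3__IL6__KLF4, and passing view="txt" switches the final segment to /View/txt.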
A Python script provides an example of how to use the gene ID conversion tool. At its core, a URL-based query is constructed and executed using the Python packages “requests” and “bs4” (class “BeautifulSoup”). After some processing, the results are available as a pandas dataframe.
Example Python script: https://bdcw.org/geneid/fetch_php_page.py
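For readers who only need the JSON view, the query can also be run with the Python standard library alone. The sketch below is not the linked script and does not use requests or bs4; the function names fetch_json and to_records are hypothetical, and because the layout of the JSON response is not documented on this page, the normalization step only assumes that the response is either a single object or a list of objects.

```python
import json
from urllib.request import Request, urlopen


def fetch_json(url, timeout=30):
    """Fetch a /View/json URL and return the decoded JSON payload.

    urlopen follows HTTP redirects by default, similar to `curl -L`.
    """
    req = Request(url, headers={"Accept": "application/json"})
    with urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def to_records(payload):
    """Normalize a decoded payload into a list of dicts (one per row).

    Assumption: the response is a single JSON object or a list of
    JSON objects; anything else raises TypeError.
    """
    if isinstance(payload, dict):
        return [payload]
    if isinstance(payload, list):
        # Keep only object-like rows; other entries are skipped.
        return [row for row in payload if isinstance(row, dict)]
    raise TypeError(f"unexpected JSON payload type: {type(payload).__name__}")


# Usage (requires network access), with one of the example URLs above:
# url = ("https://bdcw.org/geneid/rest/species/hsa/GeneIDType/"
#        "ENSEMBL/GeneListStr/ENSG00000136244/View/json")
# records = to_records(fetch_json(url))
```

A list of dicts such as the one returned by to_records can be passed directly to pandas.DataFrame(records) when a dataframe is wanted.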