Simple script to check nearby genes using UCSC's Public Genome Browser's MySQL database

mattcarras/python-genome-sql-example

Summary

A simple script that lists the genes nearest to a reference position, using the MySQL database behind UCSC's public Genome Browser. The reference position is given as a chromosome plus txStart and txEnd coordinates.

Based on code from: http://genomewiki.ucsc.edu/index.php/Finding_nearby_genes
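
The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the helper names `build_nearby_query` and `fetch_nearby` are hypothetical, and reading the gene symbol from the `name2` column of `refGene` is an assumption (the script may join against a different table). The host `genome-mysql.soe.ucsc.edu` and user `genome` are UCSC's public MySQL access settings.

```python
def build_nearby_query(direction, table="refGene", limit=10):
    """Build a parameterized query for the closest transcripts.

    "upstream" picks transcripts that end before the reference position;
    "downstream" picks transcripts that start after it.
    Hypothetical helper, not taken from the original script.
    """
    if not table.isidentifier():
        raise ValueError("table must be a plain identifier")
    if direction == "upstream":
        condition, order = "txEnd < %s", "txEnd DESC"
    elif direction == "downstream":
        condition, order = "txStart > %s", "txStart ASC"
    else:
        raise ValueError("direction must be 'upstream' or 'downstream'")
    # Assumption: name2 holds the gene symbol in the refGene table.
    return (
        "SELECT chrom, txStart, txEnd, strand, name, name2 AS geneSymbol "
        f"FROM {table} WHERE chrom = %s AND {condition} "
        f"ORDER BY {order} LIMIT {int(limit)}"
    )


def fetch_nearby(chrom, tx_start, tx_end, direction, db="hg19"):
    """Run the query against UCSC's public MySQL server (needs network)."""
    import pymysql  # third-party; imported here so the builder works without it

    # Upstream compares against the reference start, downstream against its end.
    position = tx_start if direction == "upstream" else tx_end
    conn = pymysql.connect(
        host="genome-mysql.soe.ucsc.edu", user="genome", database=db
    )
    try:
        with conn.cursor() as cur:
            cur.execute(build_nearby_query(direction), (chrom, position))
            return cur.fetchall()
    finally:
        conn.close()
```

Calling `fetch_nearby("chr1", 991973, 991973, "upstream")` would correspond to the first table in the example output below, although the exact rows depend on the current contents of the UCSC database.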

Requirements

  • Python 3.x
  • pymysql
  • BeautifulTable

Configuration

Edit the variables between the # CONFIGURE BELOW and # END CONFIG comments in the script.

Example output

closest 10 upstream transcripts from chr1:991973-991973 in hg19 for refGene set
Note: for reverse - strand items, txEnd is the 5' end, the transcription start site
+-------+---------+--------+--------+-----------------------------+------------+
| chrom | txStart | txEnd  | strand |            name             | geneSymbol |
+-------+---------+--------+--------+-----------------------------+------------+
| chr1  | 955502  | 991499 |   +    |          NM_198576          |    AGRN    |
+-------+---------+--------+--------+-----------------------------+------------+
| chr1  | 948876  | 949919 |   +    |          NM_005101          |   ISG15    |
+-------+---------+--------+--------+-----------------------------+------------+
| chr1  | 934341  | 935552 |   -    |          NM_021170          |    HES4    |
+-------+---------+--------+--------+-----------------------------+------------+
| chr1  | 934343  | 935552 |   -    |        NM_001142467         |    HES4    |
+-------+---------+--------+--------+-----------------------------+------------+
| chr1  | 901876  | 910484 |   +    |          NM_032129          |  PLEKHN1   |
+-------+---------+--------+--------+-----------------------------+------------+
| chr1  | 901876  | 910484 |   +    |          NM_032129          |  PLEKHN1   |
+-------+---------+--------+--------+-----------------------------+------------+
| chr1  | 901876  | 910484 |   +    |        NM_001160184         |  PLEKHN1   |
+-------+---------+--------+--------+-----------------------------+------------+
| chr1  | 895966  | 901099 |   +    |          NM_198317          |   KLHL17   |
+-------+---------+--------+--------+-----------------------------+------------+
| chr1  | 879582  | 894679 |   -    |          NM_015658          |   NOC2L    |
+-------+---------+--------+--------+-----------------------------+------------+
| chr1  | 879582  | 894679 |   -    |          NM_015658          |   NOC2L    |
+-------+---------+--------+--------+-----------------------------+------------+


closest 10 downstream transcripts from chr1:991973-991973 in hg19 for refGene set
Note: for reverse - strand items, txStart is the 3' end, NOT the transcription start site
+-------+---------+---------+--------+----------------------------+------------+
| chrom | txStart |  txEnd  | strand |            name            | geneSymbol |
+-------+---------+---------+--------+----------------------------+------------+
| chr1  | 1007125 | 1009687 |   -    |        NM_001205252        |   RNF223   |
+-------+---------+---------+--------+----------------------------+------------+
| chr1  | 1007125 | 1009687 |   -    |        NM_001205252        |   RNF223   |
+-------+---------+---------+--------+----------------------------+------------+
| chr1  | 1017197 | 1051736 |   -    |         NM_017891          |  C1orf159  |
+-------+---------+---------+--------+----------------------------+------------+
| chr1  | 1017197 | 1051736 |   -    |         NM_017891          |  C1orf159  |
+-------+---------+---------+--------+----------------------------+------------+
| chr1  | 1017197 | 1051736 |   -    |         NM_017891          |  C1orf159  |
+-------+---------+---------+--------+----------------------------+------------+
| chr1  | 1072396 | 1079434 |   +    |         NR_038869          | LOC254099  |
+-------+---------+---------+--------+----------------------------+------------+
| chr1  | 1102483 | 1102578 |   +    |         NR_029639          |  MIR200B   |
+-------+---------+---------+--------+----------------------------+------------+
| chr1  | 1103242 | 1103332 |   +    |         NR_029834          |  MIR200A   |
+-------+---------+---------+--------+----------------------------+------------+
| chr1  | 1104384 | 1104467 |   +    |         NR_029957          |   MIR429   |
+-------+---------+---------+--------+----------------------------+------------+
| chr1  | 1109285 | 1133313 |   +    |        NM_001130045        |   TTLL10   |
+-------+---------+---------+--------+----------------------------+------------+
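
The strand notes printed above each table can be made concrete with a small helper. This is a sketch for illustration only: the function name `transcription_start` is hypothetical and not part of the script, and it ignores the fact that UCSC tables store 0-based, half-open coordinates.

```python
def transcription_start(strand, tx_start, tx_end):
    """Return the coordinate of the 5' end (transcription start site).

    On the + strand, transcription begins at txStart; on the - strand it
    begins at txEnd, so txStart is the 3' end there.
    """
    if strand == "+":
        return tx_start
    if strand == "-":
        return tx_end
    raise ValueError("strand must be '+' or '-'")


# Rows taken from the upstream table above.
print(transcription_start("+", 955502, 991499))  # AGRN, + strand
print(transcription_start("-", 934341, 935552))  # HES4, - strand
```

For AGRN on the + strand this gives txStart (955502). For HES4 on the - strand it gives txEnd (935552).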
