Dec 27th
Xueqing edited this page Jan 16, 2018 · 4 revisions
- Today I finally finished cleaning up the data.
Code: 02_cleaning_data_stage2.py
Result (only the first 20 lines are shown, since the file is very large): cleaned_data_sample.tsv
Statistics: 87 individuals, 190053 transcripts.
- I also finished rewriting the transcript names into gene names.
Code: 03_transcripts_to_genes.py
Statistics: 87 individuals, 164802 transcripts (written as gene names) that can be mapped to genes.
- Then I tried to sum up all the transcripts that map to the same gene. Initially I wrote some code for this, but it takes a long time to run and still doesn't give the result I want.
- Write code to filter out the non-coding genes (06_proteinatlas_filter1.py) as well as the genes that are not significantly differentially expressed (07_p_value_filter2.py).
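
Note on the per-gene summing step above: one possible way to speed it up is a single pass over the rows that accumulates totals in a dictionary keyed by gene name. The sketch below is only an illustration, not the code I actually ran. It assumes each row has already been parsed into a gene name plus one numeric value per individual; the function name `sum_transcripts_by_gene` and the toy data are hypothetical.

```python
def sum_transcripts_by_gene(rows):
    """Sum the per-individual values of all rows that share a gene name.

    rows: iterable of (gene_name, values) pairs, where values holds one
    number per individual (assumed layout, not the real file format).
    Returns a dict mapping gene_name -> list of summed values.
    """
    totals = {}
    for gene, values in rows:
        if gene not in totals:
            # First transcript seen for this gene: copy its values.
            totals[gene] = list(values)
            continue
        acc = totals[gene]
        if len(values) != len(acc):
            raise ValueError(f"row length mismatch for gene {gene!r}")
        # Add this transcript's values to the running totals, column by column.
        for i, value in enumerate(values):
            acc[i] += value
    return totals


# Toy example: two transcripts map to GENE_A, one to GENE_B (made-up numbers).
rows = [
    ("GENE_A", [1.0, 2.0]),
    ("GENE_B", [0.5, 0.5]),
    ("GENE_A", [3.0, 4.0]),
]
gene_totals = sum_transcripts_by_gene(rows)
```

Each row is visited once, so the running time grows linearly with the number of transcripts instead of rescanning the whole table for every gene. For a real TSV file, the rows could come from `csv.reader(f, delimiter="\t")` with the value columns converted to `float`.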