Parses, summarizes, and unites several Trans-Proteomic Pipeline (TPP) interact files and generates several Excel output files.
Parse-and-unite is a GUI-based program that combines and summarizes several interact.pep.xml files into user-friendly xlsx files and Venn diagrams. Depending on the chosen mode, it also calculates ratios, standard deviations, and other parameters. The merging process also includes a filtration step based on the false discovery rate and a minimal number of matched ions.
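To give a feel for the input, here is a minimal sketch of reading search hits from pep.xml text with the standard library. It is not the parser used by this project. The function names (`local_name`, `read_search_hits`) and the sample document are illustrative assumptions. The sketch relies on the usual pepXML layout, in which `search_hit` elements carry `peptide`, `protein`, and `num_matched_ions` attributes and a nested `peptideprophet_result` carries `probability`.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written sample, NOT real data; used only to exercise the sketch.
SAMPLE = """<msms_pipeline_analysis xmlns="http://regis-web.systemsbiology.net/pepXML">
  <msms_run_summary>
    <spectrum_query spectrum="run1.00001.00001.2" assumed_charge="2">
      <search_result>
        <search_hit hit_rank="1" peptide="PEPTIDEK" protein="sp|P00001|TEST" num_matched_ions="9">
          <analysis_result analysis="peptideprophet">
            <peptideprophet_result probability="0.98"/>
          </analysis_result>
        </search_hit>
      </search_result>
    </spectrum_query>
  </msms_run_summary>
</msms_pipeline_analysis>"""


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def read_search_hits(xml_text):
    """Return one dict per search_hit element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    hits = []
    for query in root.iter():
        if local_name(query.tag) != "spectrum_query":
            continue
        for hit in query.iter():
            if local_name(hit.tag) != "search_hit":
                continue
            probability = None  # stays None if no PeptideProphet result is present
            for node in hit.iter():
                if local_name(node.tag) == "peptideprophet_result":
                    probability = float(node.get("probability"))
            hits.append({
                "spectrum": query.get("spectrum"),
                "peptide": hit.get("peptide"),
                "protein": hit.get("protein"),
                "num_matched_ions": int(hit.get("num_matched_ions")),
                "probability": probability,
            })
    return hits


hits = read_search_hits(SAMPLE)
print(hits)
```

Matching on the local tag name keeps the sketch independent of the exact namespace URI declared in a given file.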
This project includes modifications of this open-source code for Venn diagrams: https://github.com/tctianchi/pyvenn
You may need to install several packages to run this program (using pip install or by downloading them manually via the links below):
https://pypi.org/project/XlsxWriter/
https://pypi.org/project/matplotlib/
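If you are unsure whether the packages are already installed, a small check like the one below may help. It is only a sketch: the helper name `missing_modules` is made up for this example, and it assumes the packages are importable as `xlsxwriter` and `matplotlib`.

```python
from importlib.util import find_spec

# Import names of the two packages linked above (assumed import names).
REQUIRED_MODULES = ["xlsxwriter", "matplotlib"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

Any module reported as missing can then be installed with pip or from the links above.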
The main file is xml_parser_view.py; run only this file.
Click the "Browse" button and select all your pep.xml sample files (note: if you choose more than 6 files, the script will not generate Venn diagrams for these samples).
Choose your output file name; this will be the name of your united file.
Choose your error rate as a percentage. Peptides with a probability lower than this FDR threshold will be excluded.
Choose your running mode:
Quantify variable isotopic labelling difference.
Quantify peptide peak areas based on label-free quantification.
Quantify variable isotopic labelling difference of a single amino acid while having fixed terminal modifications.
Calculate variable modification occurrences.
When you are all set, click the "Run" button.
In the default mode, another window may pop up asking how to calculate the ratio (heavy/light or light/heavy).
When the program has finished, press "OK". Your source folder will open automatically and will contain all your output files.
Four Venn diagrams: peptide intersections, protein intersections, PSM intersections, and stripped intersections.
Example:
file-name_out.xlsx
These are the interim unique summary files for each output file.
Informative table for PSM-level features (unites all samples in a single file).
Informative table for peptide-level features (unites all samples in a single file).
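The Venn diagrams listed above show how identifications overlap between samples. The sketch below illustrates the underlying set arithmetic only; it is not the pyvenn-based code used by this project. The names `exclusive_regions` and `MAX_VENN_SETS` are hypothetical, and the peptide strings are made-up examples.

```python
from itertools import combinations

# The README states that no Venn diagrams are generated for more than 6 files.
MAX_VENN_SETS = 6


def exclusive_regions(samples):
    """Map each combination of sample names to the items found in exactly those samples.

    `samples` maps a sample name to a set of identifiers (e.g. peptide sequences).
    Returns an empty dict when there are too many samples to draw.
    """
    if len(samples) > MAX_VENN_SETS:
        return {}
    names = sorted(samples)
    regions = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            # Items shared by every sample in this combination...
            inside = set.intersection(*(samples[n] for n in combo))
            # ...minus anything that also appears in a sample outside it.
            outside = set().union(*(samples[n] for n in names if n not in combo))
            only_here = inside - outside
            if only_here:
                regions[combo] = only_here
    return regions


example = {"A": {"PEPK", "LLR"}, "B": {"PEPK", "QQK"}}
regions = exclusive_regions(example)
print(regions)
```

Each key is one region of the diagram, so the size of each value is the number that would be printed in that region.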
Peak area is calculated by quantifying and summing the peak areas of all PSMs. For isotopic labels, the united file contains a "ratio" column, which is the quotient of the light/heavy areas. Modifications included in the peak area sum: oxidation (M) and alkylation (C).
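As an illustration of the ratio described above, the following sketch sums PSM peak areas per peptide and label and then divides the two sums. It is an assumption about the general idea, not the project's actual code. The function name `peptide_ratios`, the `(peptide, label, area)` tuple layout, and the numbers are invented for the example.

```python
from collections import defaultdict


def peptide_ratios(psms, heavy_over_light=False):
    """Sum PSM peak areas per peptide and label, then return a ratio per peptide.

    `psms` is an iterable of (peptide, label, area) tuples, where label is
    "light" or "heavy". The default direction is light/heavy; set
    heavy_over_light=True for heavy/light. The ratio is None when the
    denominator sum is zero.
    """
    sums = defaultdict(lambda: {"light": 0.0, "heavy": 0.0})
    for peptide, label, area in psms:
        sums[peptide][label] += area

    ratios = {}
    for peptide, areas in sums.items():
        if heavy_over_light:
            numerator, denominator = areas["heavy"], areas["light"]
        else:
            numerator, denominator = areas["light"], areas["heavy"]
        ratios[peptide] = numerator / denominator if denominator else None
    return ratios


# Made-up peak areas for one peptide seen in three PSMs.
example_psms = [
    ("PEPTIDEK", "light", 100.0),
    ("PEPTIDEK", "light", 50.0),
    ("PEPTIDEK", "heavy", 75.0),
]
print(peptide_ratios(example_psms))
print(peptide_ratios(example_psms, heavy_over_light=True))
```

The `heavy_over_light` flag mirrors the choice between heavy/light and light/heavy offered in the GUI.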
In addition to PSM filtration based on FDR (which can be controlled from the GUI), it is also possible to change the threshold for the minimal number of matched ions (default=5) by editing line 6 of parserPep.py (in Utils).
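The two filters can be pictured roughly as follows. This is a sketch under stated assumptions, not the contents of parserPep.py. The constant name `MIN_MATCHED_IONS`, the function names, the dictionary keys, and the `min_probability` parameter are all hypothetical, and how the GUI error rate maps to a probability cutoff is not shown here.

```python
# Default from the README; the constant name itself is an assumption.
MIN_MATCHED_IONS = 5


def keep_psm(psm, min_probability, min_matched_ions=MIN_MATCHED_IONS):
    """Return True when a PSM passes both the probability and matched-ion cutoffs."""
    return (
        psm["probability"] >= min_probability
        and psm["num_matched_ions"] >= min_matched_ions
    )


def filter_psms(psms, min_probability, min_matched_ions=MIN_MATCHED_IONS):
    """Keep only the PSMs that pass both cutoffs, preserving input order."""
    return [p for p in psms if keep_psm(p, min_probability, min_matched_ions)]


# Made-up PSMs for illustration.
example = [
    {"peptide": "PEPTIDEK", "probability": 0.99, "num_matched_ions": 9},
    {"peptide": "LLQR", "probability": 0.99, "num_matched_ions": 3},
    {"peptide": "AAGK", "probability": 0.40, "num_matched_ions": 8},
]
kept = filter_psms(example, min_probability=0.95)
print([p["peptide"] for p in kept])
```

Passing `min_matched_ions` explicitly shows the effect of changing the default without editing a constant.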