GitHub - QCSB/PROSO-Toolbox

PROSO Toolbox

A Computational Toolbox for Context-Specific Genome-Scale Modelling
Project Wiki »
Report Issues »

NOTE: Please always use and refer to the QCSB release.

Table of Contents

About PROSO
Getting Started
- Prerequisites
- Installation
Usage
Roadmap
Cite PROSO
License
Contact

About PROSO

PROSO Toolbox is a collection of functions for processing, interpreting, and studying cellular multi-omics data in the context of genome-scale models (GEMs).

What PROSO Toolbox offers:

Automatic implementation of protein constraints on any genome-scale metabolic model (M-model)
System-level estimation of enzymatic rate constants
Incorporation of gene expression data into GEMs for context-specific modelling
Suggestion of synthetic biology strategies for biotechnology, infectious disease, cancer research, and more

More information on PROSO Toolbox's intuition, formulation, and execution is available in our publication.

(back to top)

Getting Started

PROSO Toolbox can be set up easily as follows.

Prerequisites

PROSO Toolbox requires the following software.

MATLAB (R2015a or later). Extra add-ons:
- Bioinformatics Toolbox
- Statistics and Machine Learning Toolbox
- etc.
COBRA Toolbox
- Please refer to openCOBRA for details on installation and troubleshooting.
An optimization solver. Only the following are supported:
- Gurobi Optimizer (Preferred)
- IBM CPLEX Optimization Studio

Installation

Clone this repository to your computer
In the MATLAB command window, add the PROSO directory to the path
```
>> addpath('Path-to-PROSO-Folder')
>> savepath
```
PROSO Toolbox is now ready to use
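
To confirm the setup, a quick path check along the following lines may help. This is a minimal sketch and not part of PROSO. It assumes that `fastaread` and `pdist` are reasonable stand-ins for the two MATLAB add-ons, and that `gurobi` (the Gurobi MATLAB interface) is the solver you installed. If `pcModel` is reported as missing, the folder that contains it may not be on the path yet.

```matlab
% Each row: {label, function name expected on the MATLAB path}.
% The function names are assumptions chosen for illustration.
checks = {
    'COBRA Toolbox',                           'initCobraToolbox'
    'PROSO Toolbox',                           'pcModel'
    'Bioinformatics Toolbox',                  'fastaread'
    'Statistics and Machine Learning Toolbox', 'pdist'
    'Gurobi MATLAB interface',                 'gurobi'
};

found = false(size(checks, 1), 1);
for i = 1:size(checks, 1)
    % exist returns a nonzero type code when the name is found;
    % 2, 3, 5 and 6 denote callable code (file, MEX, built-in, P-code).
    code = exist(checks{i, 2}, 'file');
    found(i) = ismember(code, [2 3 5 6]);
    fprintf('%-42s %s\n', checks{i, 1}, mat2str(found(i)));
end
```

Any line that prints `false` points to a prerequisite worth revisiting before continuing.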

(back to top)

Usage

Here we demonstrate only a simple PC-model construction from the Pseudomonas aeruginosa M-model. Although it is not context-specific by itself, a PC-model is an 'upgraded' M-model and can serve important purposes in research.

Prepare the data, or find them under PROSO/tutorial. Make sure they are on the MATLAB path or in your working directory:
- P. aeruginosa metabolic reconstruction iSD1509 (doi: https://doi.org/10.1101/2021.04.15.439930)
- P. aeruginosa protein sequence FASTA (.faa), downloaded from Pseudomonas Genome DB
Construct a draft PC-model from the M-model
- Open MATLAB and make sure all installations completed correctly. Initialize COBRA Toolbox and change the default solver to Gurobi (or IBM CPLEX).
```
>> initCobraToolbox(false);
>> changeCobraSolver('gurobi','all',0);
```
- Construct the draft PC-model
  
  Here we implement protein constraints on iSD1509, with a protein budget of 150 mg/gDW.
```
>> model_ori = readCbModel('iSD1509.xml');
>> [model_pc_draft,fullProtein,fullCplx,C_matrix,K_matrix,proteinMM] = pcModel(model_ori,'Pseudomonas_aeruginosa_UCBPP-PA14_109.faa',150);
```
  This will take several minutes to complete.
  
  The M-model has 1510 genes (with one dummy gene), 1642 metabolites, and 2023 reactions.
  
  Note that the resulting draft PC-model has 7519 'metabolites' (1642 true metabolites + 1510 proteins + 1250 complexes + 1558 forward enzymes + 1558 reverse enzymes + proteinWC) and 12487 'reactions' (2023 true reactions + 1510 protein dilutions + 1250 complex formations + 4588 enzyme formations + 3116 enzyme dilutions). Tuning does not change this structure; only the coefficients are modified.
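
The totals quoted above can be re-derived with a few lines of arithmetic. The sketch below uses only the numbers stated in this README; the commented lines at the end assume the standard COBRA model fields `mets` and `rxns` and are meant for a session where `model_pc_draft` already exists.

```matlab
% Component counts quoted for the iSD1509 draft PC-model
nMets = 1642;  nProt = 1510;  nCplx = 1250;
nEnzFwd = 1558;  nEnzRev = 1558;  nProteinWC = 1;
nRxns = 2023;  nEnzForm = 4588;  nEnzDil = 3116;

% 'Metabolites' and 'reactions' of the draft PC-model
totalMets = nMets + nProt + nCplx + nEnzFwd + nEnzRev + nProteinWC;
totalRxns = nRxns + nProt + nCplx + nEnzForm + nEnzDil;

% The quoted enzyme-dilution count equals forward plus reverse enzymes
assert(nEnzDil == nEnzFwd + nEnzRev);

fprintf('metabolites: %d, reactions: %d\n', totalMets, totalRxns);

% With the real model in the workspace, the same totals could be checked:
%   assert(numel(model_pc_draft.mets) == totalMets);
%   assert(numel(model_pc_draft.rxns) == totalRxns);
```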
Tune the draft PC-model for better performance
- Manually adjust protein complex stoichiometry
  
  This step usually relies on a database. For example, complex information extracted from the MetaCyc PA14 database can be used to curate the draft PC-model. Users should judge the accuracy of each source carefully, as no source is guaranteed to be completely accurate.
  
  The ATP synthase complex is a large protein complex with 9 subunits. Use surfNet to inspect it in the PC-model:
```
>> surfNet(model_pc_draft,'cplxForm_x(193)x(197)x(195)x(198)x(200)x(199)x(196)x(194)x(192)');
```
  You can trace complex -> enzyme -> reaction to confirm that it is the ATPS complex, or go in the reverse direction to find the complexes for a given reaction.
  
  For example, to give each ATPS complex two copies of subunit alpha (atpA, PA14_73260), first locate both the complex and the protein in their respective lists:
```
>> pIdx = find(strcmp(fullProtein,'PA14_73260'));
>> cIdx = find(C_matrix(pIdx,:));
```
  Here the change applies to protein #193 and complex #178. Change the coefficient from 1 to 2:
```
>> C_matrix(pIdx,cIdx) = 2;
```
  Finish all subunit modifications before proceeding to the next step.
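
  When several subunits need curation, the edits can be collected in a table and applied in one loop. The following is a minimal sketch on toy data: the gene IDs, the matrix, and the `edits` table are hypothetical, and only the layout (proteins as rows, complexes as columns) follows the indexing used above. Naming the complex column explicitly matters because a protein may belong to more than one complex.

```matlab
% Toy data (hypothetical IDs): 4 proteins (rows) x 3 complexes (columns)
fullProtein = {'geneA'; 'geneB'; 'geneC'; 'geneD'};
C_matrix = [1 1 0;
            0 1 0;
            0 1 1;
            0 0 1];

% Planned edits: {protein ID, complex column, new copy number}
edits = {
    'geneB', 2, 2
    'geneC', 3, 4
};

for k = 1:size(edits, 1)
    pIdx = find(strcmp(fullProtein, edits{k, 1}));
    cIdx = edits{k, 2};
    % Only touch entries where the protein is already a subunit
    if isscalar(pIdx) && C_matrix(pIdx, cIdx) ~= 0
        C_matrix(pIdx, cIdx) = edits{k, 3};
    else
        warning('Skipping %s: not a subunit of complex %d', edits{k, 1}, cIdx);
    end
end

disp(C_matrix);
```

  In this toy case geneC keeps a single copy in complex 2 and only its entry in complex 3 changes.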
- Estimate enzymatic rate constants using SASA
  
  Now that all protein complexes (C_matrix) have been modified, their rate constants can be estimated automatically:
```
>> K_matrix = estimateKeffFromMW(C_matrix,K_matrix,proteinMM);
```
  This gives us an updated kinetic matrix to implement.
- Update PC-model
  
  Apply the new C_matrix and K_matrix to the PC-model.
```
>> model_pc = adjustStoichAndKeff(model_pc_draft,C_matrix,K_matrix);
```
  This will take some time to complete.
What the PC-model does

The PC-model 'soft-caps' system-level activity by constraining the total amount of protein in the system.
```
>> FBAsol = optimizeCbModel(model_ori,'max');
>> FBAsol_pc = optimizeCbModel(model_pc,'max');
```
The optimal growth rate of the PC-model (FBAsol_pc.f) is smaller than that of the M-model (FBAsol.f). In general, PC-FBA better resembles an organism's true exponential-phase metabolism.
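
For intuition about why a protein budget lowers the optimum, consider one reaction catalysed by one enzyme. This is a toy calculation with made-up numbers and a simplified capacity constraint (flux at most rate constant times enzyme amount). It is not PROSO's actual formulation, and none of the values come from iSD1509.

```matlab
% Toy, hypothetical values
vMaxM  = 1000;        % flux upper bound in the M-model [mmol/gDW/h]
keff   = 65 * 3600;   % effective rate constant, 65 1/s expressed in [1/h]
mw     = 50000;       % enzyme molar mass, 50 kDa expressed in [mg/mmol]
budget = 150;         % protein budget [mg/gDW]

% Enzyme needed to carry flux v: e = v / keff [mmol/gDW]
% Its mass e * mw must fit in the budget, so v <= budget * keff / mw
vCap = budget * keff / mw;

% The protein-constrained optimum is the tighter of the two limits
vPC = min(vMaxM, vCap);

fprintf('M-model bound: %g, protein-limited bound: %g, result: %g\n', ...
        vMaxM, vCap, vPC);
```

In a full PC-model the budget is shared among all enzymes, so the cap on any single flux depends on how the optimizer allocates protein across the network.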

These are only the most basic functions. For more examples, please refer to the Project Wiki.

(back to top)

Roadmap

PROSO is an ongoing project, with future plans to refine and expand its scope.

(back to top)

Cite PROSO

Please cite our latest publications:

Yao, H., Dahal, S., & Yang, L. (2023). Novel context-specific genome-scale modelling explores the potential of triacylglycerol production by Chlamydomonas reinhardtii. Microbial Cell Factories, 22(1), 1-16.
Yao, H., & Yang, L. (2023). PROSO Toolbox: a unified protein-constrained genome-scale modelling framework for strain designing and optimization. arXiv preprint arXiv:2308.14869.

(back to top)

License

Distributed under the GNU General Public License v3. Please see LICENSE for more information.

(back to top)

Contact

Herbert Yao - 16hy16@queensu.ca

Laurence Yang - laurence.yang@queensu.ca

Queen's Computational Systems Biology Group, Department of Chemical Engineering, Queen's University at Kingston, Canada

(back to top)
