Commit 3a25a5c

Update README.md
1 parent 41a1be3 commit 3a25a5c

1 file changed

+79
-74
lines changed

README.md

@@ -10,11 +10,15 @@ DigestR is an open-source software developed for the R statistical language envi
1010
Users can interact with DigestR in two major ways: via point-and-click graphical user interfaces (GUIs) or by entering commands directly in the R console.
1111
This guide is intended to give an overview of DigestR's functions.
1212

13+
To generate coincidence maps, DigestR requires:
14+
- A reference proteome (see function gp())
15+
- A .csv file exported from Mascot (see the 'Converting files' section)
16+
1317
## How to install the digestR package from GitHub
1418
DigestR was tested with `R v4.3.1` and `R v4.4.1`.
1519

16-
We recommend using R '4.3' or later versions. In case of dependency issues, use the renv package to reproduce the exact environment used.
17-
See section 'Reproducing R environment'
20+
In case of dependency issues, use the renv package to reproduce the exact environment used.
21+
See the section 'Reproducing the R Environment' below.
1822

1923
### Prerequisites
2024

@@ -43,55 +47,13 @@ Install the digestR package directly from GitHub
4347

4448
library(digestR)
4549

46-
## Reproducing the R Environment
47-
48-
This project uses `renv` to manage package dependencies. To reproduce the exact environment used:
49-
### Step 1: Install and Load renv
50-
```sh
51-
install.packages("renv")
52-
library(renv)
53-
```
54-
### Step 2: Clone the Repository
55-
You can clone the repository using your system's terminal. Run the following command:
56-
```sh
57-
git clone https://github.com/LewisResearchGroup/DigestR.git
58-
cd DigestR
59-
```
60-
Alternatively, if you want to clone the repository from within R, you can do so using system commands:
61-
```sh
62-
system("git clone https://github.com/LewisResearchGroup/DigestR.git")
63-
```
64-
### Step 3: Navigate to the Project Directory
65-
Change the working directory to where you cloned the repository:
66-
```sh
67-
setwd("path/to/DigestR")
68-
```
69-
Replace "path/to/DigestR" with the actual path where the repository was cloned.
70-
71-
### Step 4: Initialize renv
72-
```sh
73-
renv::init()
74-
```
75-
### Step 5: Restore the Project Environment
76-
```sh
77-
renv::restore()
78-
```
79-
### Step 6: Install the Package
80-
```sh
81-
devtools::install()
82-
```
83-
### Step 7: Load the Package
84-
```sh
85-
library(DigestR)
86-
```
87-
8850
## GUI Functions Documentation
8951

9052
This document provides details of several Graphical User Interface (GUI) functions implemented in the R programming language using the Tcl/Tk toolkit.
9153

9254
### Supported file formats
9355

94-
DigestR supports Mascot (.csv) generated files. These files may be converted to .dcf files using DigestR file conversion functions pm().
56+
DigestR supports Mascot-generated (.csv) files. These files must be converted to .dcf files using the DigestR file conversion function pm().
9557

9658
### Converting files.
9759

@@ -107,7 +69,7 @@ By default, Mascot creates a header, this 3 line header is required for the .csv
10769

10870
An example data file can be found here: https://github.com/LewisResearchGroup/digestR/blob/main/Example%20Files/Data_Example.csv
10971

110-
### Generating Proteome: gp()
72+
### Generating a reference proteome: gp()
11173
The generate_proteome function streamlines the process of accessing and downloading protein data from Ensembl BioMart, facilitating the creation of proteomes for comparison against experimental peptides. To generate a new proteome, users begin by selecting their desired BioMart library from the dropdown menu; options include "genes" or "ensembl," with "genes" being the default value.
11274

11375
Following this, users input a search pattern to explore datasets within the BioMart database (e.g., "sapiens" or "taurus"). Upon clicking the "Search Datasets" button, the function connects to the BioMart servers and retrieves datasets matching the provided pattern. The outcomes are displayed in the "Dataset Results" listbox, showing the dataset names, descriptions, and versions. Double-clicking on a result selects the dataset for further processing.
@@ -120,42 +82,26 @@ Warning: Generating proteomes can take several minutes to several hours dependin
12082
For convenience, some proteomes have already been generated and can be found here:
12183
https://github.com/LewisResearchGroup/digestR/blob/main/Example%20Files/Human_proteome.csv
12284
https://github.com/LewisResearchGroup/digestR/blob/main/Example%20Files/Pfal_3D7_proteome.csv
85+
https://github.com/LewisResearchGroup/digestR/blob/main/Example%20Files/Btaurus_proteome.csv
86+
87+
### Creating and Manipulating digestion maps
12388

124-
### Manipulating .csv files
89+
Required file: .dcf
12590

12691
#### 1. Process Mascot files: pm()
12792
To create "digestion" maps, peptides identified by Mascot or MaxQuant need to be mapped to their proteomic location. First, the user needs to select a proteome to align peptides against (see Generate Proteome). DigestR will automatically detect and utilize all proteomes located within the "data/proteomes" subfolder. Users can also import their own proteomes into this subfolder. After proteome selection, users can align Mascot identified peptides along the selected proteome from a single or multiple files. These alignments generate "coincidence" or "digestion" maps that users can interact with.
12893

12994
![](https://github.com/LewisResearchGroup/DigestR/blob/main/Images/Process%20Mascot%20GUI.png)
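
The alignment step described above can be pictured as counting, for every residue of a protein, how many identified peptides cover it. The base-R sketch below is only an illustration of that idea and is not DigestR's implementation; the helper name `coincidence_map()`, the toy protein sequence, and the peptide list are all hypothetical.

```r
# Hypothetical helper (not part of DigestR): count how many peptides
# cover each residue of a single protein sequence.
coincidence_map <- function(protein, peptides) {
  coverage <- integer(nchar(protein))
  for (pep in peptides) {
    # gregexpr() with fixed = TRUE returns the start of every
    # non-overlapping exact match, or -1 when there is none.
    starts <- gregexpr(pep, protein, fixed = TRUE)[[1]]
    for (s in starts[starts > 0]) {
      idx <- s:(s + nchar(pep) - 1)
      coverage[idx] <- coverage[idx] + 1L
    }
  }
  coverage
}

protein  <- "MKTAYIAKQRQISFVK"            # toy sequence
peptides <- c("TAYIAK", "AKQR", "XXXX")   # the last one does not map
cov_map  <- coincidence_map(protein, peptides)
print(cov_map)
```

Residues covered by more than one peptide get a higher count, which is what appears as a taller region on a "coincidence" map.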
13095

131-
#### 2. Cleavage site specificity: csd()
132-
The csd() function allows users to plot amino acid distributions at the C-terminus or N-terminus to track changes in cut site representation/specificity between groups. Users select a file to generate either logo plots of the P4-P4' positions or bar plots at the P1 (N-terminus) or P1' (C-terminus) position. To identify cleavage sites of biological significance, the distribution can be normalized against a specific amino acid sequence. Users can directly import a protein sequence in the appropriate box. The function then calculates the frequency of each amino acid within the protein sequence and uses it to normalize the experimental amino acid cut-site distributions.
133-
134-
![](https://github.com/LewisResearchGroup/digestR/blob/main/Images/LogoPlot_function.png)
135-
136-
#### 3. Peptide length distribution: pd()
137-
Defects in proteolytic activity might affect the length of digested peptides. DigestR can therefore calculate and plot peptide length distributions, in amino acids, using the pd() command. Users can select a folder or subfolder and process all CSV files in that directory. Files can be selected directly in the loaded files box. If no files are selected, all files will be used to generate the density plots. At least two files need to be imported in order to generate Venn diagrams. Files will be grouped depending on the second string of the filename. Three types of density plots from grouped CSV files can be chosen by the user: Overlay, Ridges, and Colored Ridges.
138-
139-
![](https://github.com/LewisResearchGroup/digestR/blob/main/Images/peptide_distribution_function.png)
140-
141-
#### 4. Venn Diagrams: vd()
142-
DigestR also allows the creation of Venn diagrams to analyze peptide overlaps between groups. The vd() function allows users to import files contained in a specific folder and generate a Venn diagram. Files can be selected directly in the loaded files box. If no files are selected, all files will be used to generate the Venn diagram. At least two files need to be imported in order to generate Venn diagrams.
143-
144-
![](https://github.com/LewisResearchGroup/digestR/blob/main/Images/VennDiagram_function.png)
145-
146-
### Viewing and interacting with Digestion maps.
147-
148-
Required files: .dcf
149-
150-
#### 1. Opening digestion maps: fo() / fs()
96+
#### 2. Opening digestion maps: fo() / fs()
15197
To open a "digestion" map in DigestR, either select "Open/Close Files" from the File menu or use the commands fo() or fs() in the R console. If multiple files have been opened, only the most recently opened spectrum will appear in the main plot window. To switch to another spectrum, double-click on a file name within the GUI. To close one or more files, select the desired files from the table and then press the "Close file" button.
15298

153-
#### 2. Manipulate dcf files: mf()
99+
#### 3. Manipulate dcf files: mf()
154100
The mf() function in DigestR allows users to perform various mathematical operations with dcf files, facilitating comprehensive data manipulation and analysis. With mf(), users can add, subtract, multiply, divide, and merge the data contained in multiple dcf files. This functionality allows users to perform mathematical operations tailored to their specific research needs, streamlining data processing and enhancing the overall analytical capabilities of DigestR.
155101

156102
![](https://github.com/LewisResearchGroup/DigestR/blob/main/Images/Manipulate%20files%20GUI.png)
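
To make the arithmetic concrete, the sketch below applies the same kinds of operations to two short numeric vectors standing in for the data of two dcf files. It assumes the operations are applied position by position, which is not confirmed here; the vectors and variable names are hypothetical, and this is plain base R rather than a call into mf().

```r
# Two toy coverage vectors standing in for the data of two .dcf files
# (hypothetical values; real maps span a whole proteome).
control   <- c(0, 2, 4, 4, 1)
treatment <- c(1, 2, 8, 2, 0)

added      <- control + treatment
subtracted <- treatment - control
multiplied <- control * treatment

# Division by zero gives Inf or NaN in R, so positions with no
# coverage in the denominator are set to NA instead.
ratio <- ifelse(control == 0, NA, treatment / control)

print(subtracted)
print(ratio)
```

A subtraction of this kind highlights regions that gain or lose coverage between two conditions, while a ratio expresses the same change relative to the control.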
157103

158-
#### 3. Plot settings: ct()
104+
#### 4. Plot settings: ct()
159105
The ct() function allows users to interact with the "digestion" map directly through the graphical interface. Users can display the "digestion" map either at a proteome or protein level. By default, the full proteome view is displayed.
160106

161107
![](https://github.com/LewisResearchGroup/DigestR/blob/main/Images/CT%20GUI.png)
@@ -165,17 +111,17 @@ To display the digestion map of a specific protein, click on "Display Single Gen
165111

166112
![](https://github.com/LewisResearchGroup/DigestR/blob/main/Images/CT_GENE%20GUI.png)
167113

168-
#### 4. Plot colors: co()
114+
#### 5. Plot colors: co()
169115
The plot color function allows users to easily manipulate the plot colors. To open the plot color GUI, enter the command co(). Color preferences can be applied to multiple spectra simultaneously by selecting names from the files list. Plot color options for the selected files may be configured individually using the buttons provided on the right side of the GUI. The "Axes" button changes the color of the x and y axes, "BG" changes the background color, and "Peak labels" changes the label color of identified peaks.
170116

171117
![](https://github.com/LewisResearchGroup/DigestR/blob/main/Images/Color%20GUI.png)
172118

173-
#### 5. Overlays: ol()
119+
#### 6. Overlays: ol()
174120
DigestR allows multiple "digestion" maps to be displayed concurrently on a single plot through the command ol(). To add or remove loaded files, select the digestion maps to overlay and click the "add" or "remove" buttons. The order of overlaid maps in the main plot window is taken directly from the order of digestion maps appearing in the overlays list box. Individual files can be assigned their own colors. The plot legend will be automatically generated, but it can be suppressed by unchecking the "Display names of the overlay spectrum on the plot" option. Similarly, the path of "digestion" maps can be suppressed by checking the corresponding checkbox.
175121

176122
![](https://github.com/LewisResearchGroup/digestR/blob/main/Images/Overlay_function.png)
177123

178-
#### 6. Display protease cut sites: cs()
124+
#### 7. Display protease cut sites: cs()
179125
Users can overlay known protease cut sites onto the "digestion" map(s) using the cs() command. It is important to note that this function requires a CSV file containing the names of the proteases and their respective cleavage sites. An example CSV file can be found here: https://github.com/LewisResearchGroup/digestR/blob/main/tests/Proteasecutsiteslist.csv
180126

181127
| protease | abv | cutsites | X | X.1 | X.2 | X.3 | X.4 |
@@ -194,17 +140,76 @@ Once the CSV file is loaded, users have the option to select a specific protease
194140

195141
![](https://github.com/LewisResearchGroup/DigestR/blob/main/Images/cs%20GUI%20-%20Copy.png)
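
As a rough illustration of what locating cut sites on a sequence involves, the sketch below finds the positions after which a protease that cleaves C-terminal to K or R would cut. The cleavage rule, the toy sequence, and the variable names are assumptions made for the example; the sketch does not read the CSV file described above and is not how cs() is implemented.

```r
# Hypothetical example: positions after which a protease that cuts
# C-terminal to K or R would cleave a toy sequence.
sequence  <- "MKTAYIAKQRQISFVK"
cut_after <- as.integer(gregexpr("[KR]", sequence)[[1]])

# The final residue is the protein's own C-terminus, not an internal cut.
cut_after <- cut_after[cut_after < nchar(sequence)]
print(cut_after)
```

Positions found this way are what would be drawn as markers on top of a "digestion" map.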
196142

197-
#### 7. Zoom: zm()
143+
#### 8. Zoom: zm()
198144
DigestR includes various zooming and scrolling commands, accessible through the zoom GUI by selecting "Zoom" from the View menu or using the command zm(). Digestion maps can be navigated using the arrow pad provided in the zoom GUI or by using the five distinct zoom functions called by the buttons provided on the right side of the zoom GUI. Many of these functions are iterative and must be exited by right-clicking in the main plot window.
199145

200146
![](https://github.com/LewisResearchGroup/DigestR/blob/main/Images/Zm%20GUI.png)
201147

202-
#### 8. Gene labelling: gl()
148+
#### 9. Gene labelling: gl()
203149
The gl() function allows users to override the threshold at which proteins are labeled when viewing data on the proteome-wide level. Lowering the default value causes more peptides to be labeled.
204150

205151
![](https://github.com/LewisResearchGroup/DigestR/blob/main/Images/gl%20GUI.png)
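
The effect of the threshold can be sketched with a few made-up values. The example assumes that a label is drawn when a value is at or above the threshold, which is a guess about the rule rather than documented behavior; the names and numbers are hypothetical.

```r
# Hypothetical per-protein peak heights on a proteome-wide map.
peak_height <- c(ALB = 120, HBB = 45, TTR = 8, APOA1 = 60)

# Names that would be labeled at a given threshold (assumed rule: >=).
labels_at <- function(threshold) {
  names(peak_height)[peak_height >= threshold]
}

print(labels_at(50))  # higher threshold, fewer labels
print(labels_at(10))  # lower threshold, more labels
```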
206152

153+
### Investigating proteolytic activities.
154+
155+
Required files: .csv
156+
157+
#### 2. Cleavage site specificity: csd()
158+
The csd() function allows users to plot amino acid distributions at the C-terminus or N-terminus to track changes in cut site representation/specificity between groups. Users select a file to generate either logo plots of the P4-P4' positions or bar plots at the P1 (N-terminus) or P1' (C-terminus) position. To identify cleavage sites of biological significance, the distribution can be normalized against a specific amino acid sequence. Users can directly import a protein sequence in the appropriate box. The function then calculates the frequency of each amino acid within the protein sequence and uses it to normalize the experimental amino acid cut-site distributions.
159+
160+
![](https://github.com/LewisResearchGroup/digestR/blob/main/Images/LogoPlot_function.png)
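
The normalization described above can be sketched as a ratio of two frequencies: how often a residue is observed at a cut site, divided by how often it occurs in the reference sequence. Using a ratio is an assumption about the normalization, and the residues, the background sequence, and the variable names below are hypothetical; this is not the code behind csd().

```r
# Hypothetical inputs: the P1 residue observed at each cut site, and a
# reference protein sequence used as background.
p1_residues <- c("K", "K", "R", "K", "L", "R")
background  <- "MKTAYIAKQRQISFVKLLR"

observed <- table(p1_residues) / length(p1_residues)
bg_chars <- strsplit(background, "")[[1]]
expected <- table(bg_chars) / length(bg_chars)

# A ratio above 1 means the residue is over-represented at P1 relative
# to its frequency in the background sequence.
enrichment <- as.numeric(observed) / as.numeric(expected[names(observed)])
names(enrichment) <- names(observed)
print(round(enrichment, 2))
```

A residue that never occurs in the background sequence has no expected frequency, so a real implementation would need to decide how to handle that case.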
161+
162+
#### 3. Peptide length distribution: pd()
163+
Defects in proteolytic activity might affect the length of digested peptides. DigestR can therefore calculate and plot peptide length distributions, in amino acids, using the pd() command. Users can select a folder or subfolder and process all CSV files in that directory. Files can be selected directly in the loaded files box. If no files are selected, all files will be used to generate the density plots. At least two files need to be imported in order to generate Venn diagrams. Files will be grouped depending on the second string of the filename. Three types of density plots from grouped CSV files can be chosen by the user: Overlay, Ridges, and Colored Ridges.
164+
165+
![](https://github.com/LewisResearchGroup/digestR/blob/main/Images/peptide_distribution_function.png)
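
A peptide length distribution boils down to counting the residues in each peptide and summarizing those counts per group. The sketch below shows this with base R; the data frame, its column names, and the group labels are hypothetical and do not reflect the layout of a Mascot export.

```r
# Hypothetical peptide sequences for two groups of samples.
peptides <- data.frame(
  group    = c("ctrl", "ctrl", "ctrl", "treated", "treated", "treated"),
  sequence = c("TAYIAK", "AKQR", "QISFVK",
               "TAYIAKQRQ", "ISFVKLLR", "MKTAYIAKQR"),
  stringsAsFactors = FALSE
)
peptides$length <- nchar(peptides$sequence)

# Mean length per group; density() gives the smoothed distribution
# that an overlay or ridge plot would draw.
mean_length <- tapply(peptides$length, peptides$group, mean)
dens_ctrl   <- density(peptides$length[peptides$group == "ctrl"])
print(mean_length)
```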
166+
167+
#### 4. Venn Diagrams: vd()
168+
DigestR also allows the creation of Venn diagrams to analyze peptide overlaps between groups. The vd() function allows users to import files contained in a specific folder and generate a Venn diagram. Files can be selected directly in the loaded files box. If no files are selected, all files will be used to generate the Venn diagram. At least two files need to be imported in order to generate Venn diagrams.
169+
170+
![](https://github.com/LewisResearchGroup/digestR/blob/main/Images/VennDiagram_function.png)
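
The numbers shown in a two-group Venn diagram are set sizes: peptides unique to each group and peptides shared by both. The sketch below computes them with base R set operations on two made-up peptide lists; it illustrates the counting only and is not the code behind vd().

```r
# Hypothetical peptide lists identified in two groups.
ctrl    <- c("TAYIAK", "AKQR", "QISFVK")
treated <- c("AKQR", "QISFVK", "LLRMK")

shared       <- intersect(ctrl, treated)
only_ctrl    <- setdiff(ctrl, treated)
only_treated <- setdiff(treated, ctrl)

counts <- c(only_ctrl    = length(only_ctrl),
            shared       = length(shared),
            only_treated = length(only_treated))
print(counts)
```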
171+
172+
## Reproducing the R Environment
207173

174+
This project uses `renv` to manage package dependencies. To reproduce the exact environment used:
175+
### Step 1: Install and Load renv
176+
```r
177+
install.packages("renv")
178+
library(renv)
179+
```
180+
### Step 2: Clone the Repository
181+
You can clone the repository using your system's terminal. Run the following command:
182+
```sh
183+
git clone https://github.com/LewisResearchGroup/DigestR.git
184+
cd DigestR
185+
```
186+
Alternatively, if you want to clone the repository from within R, you can do so using system commands:
187+
```r
188+
system("git clone https://github.com/LewisResearchGroup/DigestR.git")
189+
```
190+
### Step 3: Navigate to the Project Directory
191+
Change the working directory to where you cloned the repository:
192+
```r
193+
setwd("path/to/DigestR")
194+
```
195+
Replace "path/to/DigestR" with the actual path where the repository was cloned.
196+
197+
### Step 4: Initialize renv
198+
```r
199+
renv::init()
200+
```
201+
### Step 5: Restore the Project Environment
202+
```r
203+
renv::restore()
204+
```
205+
### Step 6: Install the Package
206+
```r
207+
devtools::install()
208+
```
209+
### Step 7: Load the Package
210+
```r
211+
library(digestR)
212+
```
208213

209214

210215
