Commit a8ec6a3

Add detailed README and indicate lines to modify
1 parent 0fa8bd1 commit a8ec6a3

3 files changed

+119
-0
lines changed

README.md

+110
@@ -0,0 +1,110 @@
# cytoscope

A simple Shiny app for visualizing single-cell data.

## 0. Structure

This app expects the following directory structure with these naming conventions for samples, where `sn_id` corresponds to the ID of sample *n*:

```
app.R
get_genes.R
data/
    seurat_genes.Rda
    markers/
        s1_id.markers.tsv
        ...
        s5_id.markers.tsv
    seurat/
        s1_id.seurat_small.Rda
        ...
        s5_id.seurat_small.Rda
```

You can have as many samples as you like!
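
As a quick sanity check of this layout, the sketch below lists which expected files are missing for a set of sample IDs. The helper name `missing_files()` is hypothetical (it is not part of the app), and it assumes that `markers/` and `seurat/` sit inside `data/`; adjust the paths if your layout differs.

```r
# Hypothetical helper (not part of the app): report which of the expected
# per-sample files are missing, given a vector of sample IDs.
# ASSUMPTION: markers/ and seurat/ live inside data/.
missing_files <- function(sample_ids, app_dir = ".", check_markers = TRUE) {
    expected <- file.path(app_dir, "data", "seurat",
                          paste0(sample_ids, ".seurat_small.Rda"))
    if (check_markers) {
        expected <- c(expected,
                      file.path(app_dir, "data", "markers",
                                paste0(sample_ids, ".markers.tsv")))
    }
    # Keep only the paths that do not exist on disk
    expected[!file.exists(expected)]
}
```

Calling `missing_files(c("s1_id", "s2_id"))` from the top level of the app directory returns an empty character vector when every expected file is in place.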

## 1. Add your data

### Seurat objects

Currently this works for single-cell data stored in Seurat objects. These objects
tend to be large and store a lot of data, most of which is not needed for the simple
functions here, so generating smaller versions to be stored alongside this app helps with speed.
These should be saved in the `data` directory, each named `sn_id.seurat_small.Rda`.

#### Shrinking Seurat objects

Here is an example function that takes a Seurat object as input, deletes some of the
content not needed for the app, and returns the shrunken object:

```r
# Note: the slots used here (@scale.data, @raw.data, @dr) follow the
# Seurat v2 object layout
reduce_seurat <- function(seurat,
                          n_pcs = 5,
                          keep_raw_data = FALSE) {

    # Remove scale.data
    seurat@scale.data <- NULL

    # Remove raw data, only used for scaling and differential expression
    # as per https://satijalab.org/seurat/faq
    if (!keep_raw_data) seurat@raw.data <- NULL

    # Only keep data for the first few PCs;
    # drop = FALSE keeps each result a matrix even when n_pcs = 1
    pcs <- seq_len(n_pcs)
    seurat@dr$pca@cell.embeddings <- seurat@dr$pca@cell.embeddings[, pcs, drop = FALSE]
    seurat@dr$pca@gene.loadings <- seurat@dr$pca@gene.loadings[, pcs, drop = FALSE]

    if (!all(dim(seurat@dr$pca@gene.loadings.full) == 0)) {

        seurat@dr$pca@gene.loadings.full <- seurat@dr$pca@gene.loadings.full[, pcs, drop = FALSE]

    }

    return(seurat)

}
```
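
One way to apply a function like this to a saved object is sketched below. The helper `shrink_and_save()` is hypothetical, and it rests on a few assumptions: each input `.Rda` file holds a single object, the output goes to `data/seurat`, and the object can be stored under the name `seurat_small`. Check what `app.R` and `get_genes.R` expect after `load()` before relying on that name.

```r
# Hypothetical helper: load the first object stored in an .Rda file, shrink it
# with `reduce_fun` (e.g. the reduce_seurat function above), and save the
# result using the sn_id.seurat_small.Rda naming convention.
shrink_and_save <- function(rda_in, sample_id, reduce_fun,
                            out_dir = file.path("data", "seurat")) {

    # load() returns the names of the objects it restored
    env <- new.env()
    obj_name <- load(rda_in, envir = env)[1]
    seurat_small <- reduce_fun(get(obj_name, envir = env))

    dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
    out_path <- file.path(out_dir, paste0(sample_id, ".seurat_small.Rda"))

    # ASSUMPTION: the stored object name ("seurat_small") may need to match
    # whatever the app expects to find after load()
    save(seurat_small, file = out_path)
    invisible(out_path)
}
```

For example, `shrink_and_save("s1_id.seurat.Rda", "s1_id", reduce_seurat)` would write `data/seurat/s1_id.seurat_small.Rda`; the input file name here is only illustrative.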

#### Other requirements

- The app will title certain plots using the `@project.name` slot in the Seurat
objects
- The app expects that a named character vector (names matching cluster names and
values corresponding to colours) is stored at `@misc$colours`
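
A minimal sketch of building such a colour vector follows. The helper `make_cluster_colours()` and the example cluster names are hypothetical, and `grDevices::rainbow()` is just one convenient palette; any character vector of valid colours with matching names should do.

```r
# Hypothetical helper: build a named colour vector, one colour per cluster.
make_cluster_colours <- function(cluster_names) {
    cluster_names <- as.character(cluster_names)
    colours <- grDevices::rainbow(length(cluster_names))
    names(colours) <- cluster_names
    colours
}

# Example cluster names are illustrative; use the cluster names of your object
colours <- make_cluster_colours(c("0", "1", "2"))

# With a Seurat object `seurat` loaded, the vector would then be stored as:
# seurat@misc$colours <- colours
```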

### Cluster markers

The cluster markers can also be provided, allowing them to be searched/filtered
in the app in a separate tab. Save the output of `Seurat::FindAllMarkers` for each sample
as a TSV file in the `markers` directory, named `sn_id.markers.tsv`.

These are the expected columns:

```
p_val avg_logFC pct.1 pct.2 p_val_adj cluster external_gene_name ensembl_gene_id gene_biotype description
```

The `server` function in the `app.R` script can be easily modified to suit the
columns of your markers files, at the step which generates `output$markers`.
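
Before editing `app.R`, it may help to see which of the expected columns a given markers file lacks. The sketch below does that with base R only; `missing_marker_cols()` is a hypothetical helper, not something the app provides.

```r
# Columns listed above
expected_cols <- c("p_val", "avg_logFC", "pct.1", "pct.2", "p_val_adj", "cluster",
                   "external_gene_name", "ensembl_gene_id", "gene_biotype", "description")

# Hypothetical helper: return the expected columns that are absent from a
# markers TSV file. Only the header and first row are read.
missing_marker_cols <- function(tsv_path, expected = expected_cols) {
    header <- read.delim(tsv_path, nrows = 1, check.names = FALSE,
                         stringsAsFactors = FALSE)
    setdiff(expected, colnames(header))
}
```

An empty result means the file already matches the expected columns; otherwise the returned names are the ones to add to the file or drop from the `output$markers` step.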

## 2. Prepare `seurat_genes.Rda`

We save the gene lists for each sample to allow them to be searched by the user
in the app when visualizing expression. To prepare this list:

1. Populate the `data` directory
2. Modify the indicated lines in `get_genes.R` with your sample IDs
3. Run `Rscript get_genes.R` from the top level of the directory storing your app

## 3. Modify `app.R`

Modify the `app.R` script at the indicated lines with your sample IDs. In the example script,
there are two collections of samples; you can have as many collections as you like,
each containing as many samples as you like.

NOTE: The `sn_id` fields **must** match exactly the sample IDs used to name your data.
The `Sample n name` fields can be friendlier versions of the sample IDs, or exactly the same; nothing depends on these and they are just for listing the available datasets.
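
Because the match must be exact, a small check like the following can catch typos before launching the app. It is a sketch only: `unmatched_ids()` is a hypothetical helper, the `choices` list mirrors the placeholder IDs of the example script, and the directory you pass in should be wherever your `sn_id.seurat_small.Rda` files live.

```r
# Choices list using the placeholder IDs from the example app.R
choices <- list(
    "Sample collection 1" = c("Sample 1 name" = "s1_id",
                              "Sample 2 name" = "s2_id"),
    "Sample collection 2" = c("Sample 3 name" = "s3_id",
                              "Sample 4 name" = "s4_id",
                              "Sample 5 name" = "s5_id")
)

# Flatten the collections into a plain vector of sample IDs
app_ids <- unlist(choices, use.names = FALSE)

# Hypothetical helper: IDs listed in app.R that have no matching
# sn_id.seurat_small.Rda file in `seurat_dir` (comparison is case-sensitive)
unmatched_ids <- function(ids, seurat_dir) {
    files <- list.files(seurat_dir, pattern = "\\.seurat_small\\.Rda$")
    file_ids <- sub("\\.seurat_small\\.Rda$", "", files)
    setdiff(ids, file_ids)
}
```

Any ID returned by `unmatched_ids(app_ids, seurat_dir)` is listed in the app but has no data file with exactly that name.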

## 4. Open `app.R` in RStudio and hit `Run App`!

app.R

+7
@@ -19,11 +19,18 @@ ui <- fluidPage(
       h3("Data"),
       selectInput("sample", "Dataset", multiple = FALSE, selected = "ct_p3",
                   choices = list(
+
+                    # ********************
+                    # *** MODIFY THIS ****
+                    # ********************
+
                     "Sample collection 1" = c("Sample 1 name" = "s1_id",
                                               "Sample 2 name" = "s2_id"),
                     "Sample collection 2" = c("Sample 3 name" = "s3_id",
                                               "Sample 4 name" = "s4_id",
                                               "Sample 5 name" = "s5_id")
+
+
                   )),
       selectInput("gene", "Genes (max 3)", choices = character(0), multiple = TRUE),
       selectInput("dr", "Dimensionality reduction", multiple = FALSE, choices = c("tsne", "pca"), selected = "tsne"),

get_genes.R

+2
@@ -9,8 +9,10 @@ for (i in seq_along(seurat_obj)) {

 }

+# ** MODIFY THIS **
 samples <- list(s1_id, s2_id, s3_id, s4_id, s5_id)

+# ** MODIFY THIS **
 names(samples) <- c("s1_id",
                     "s2_id",
                     "s3_id",
