|
| 1 | + |
| 2 | +# cytoscope |
| 3 | + |
| 4 | +A simple Shiny app for visualizing single-cell data. |
| 5 | + |
| 6 | +## 0. Structure |
| 7 | + |
| 8 | +This app expects the following directory structure with these naming conventions for samples, where `sn_id` corresponds to the ID of sample *n*: |
| 9 | + |
| 10 | +``` |
| 11 | +app.R |
| 12 | +get_genes.R |
| 13 | +data/ |
| 14 | +  seurat_genes.Rda |
| 15 | +  markers/ |
| 16 | +    s1_id.markers.tsv |
| 17 | +    ... |
| 18 | +    s5_id.markers.tsv |
| 19 | +  seurat/ |
| 20 | +    s1_id.seurat_small.Rda |
| 21 | +    ... |
| 22 | +    s5_id.seurat_small.Rda |
| 23 | +``` |
| 24 | + |
| 25 | +You can have as many samples as you like! |
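Before launching the app, it can help to confirm that the files follow this layout. The sketch below is not part of the app: `expected_files` and `missing_files` are hypothetical helper names, and the sample IDs are placeholders. It assumes the `data/markers` and `data/seurat` layout shown above.

```r
# Build the per-sample file paths implied by the directory layout.
# Hypothetical helper, not part of app.R or get_genes.R.
expected_files <- function(sample_ids, data_dir = "data") {
  c(
    file.path(data_dir, "markers", paste0(sample_ids, ".markers.tsv")),
    file.path(data_dir, "seurat", paste0(sample_ids, ".seurat_small.Rda"))
  )
}

# Return the expected files that are not present on disk.
missing_files <- function(sample_ids, data_dir = "data") {
  paths <- expected_files(sample_ids, data_dir)
  paths[!file.exists(paths)]
}

# Placeholder IDs; replace with your own sample IDs.
missing_files(c("s1_id", "s2_id"))
```

An empty result means every expected file was found. `seurat_genes.Rda` is left out of the check because it is only generated in step 2.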
| 26 | + |
| 27 | +## 1. Add your data |
| 28 | + |
| 29 | +### Seurat objects |
| 30 | + |
| 31 | +Currently, this works for single-cell data stored in Seurat objects. These objects |
| 32 | +tend to be large and store a lot of data, most of which is not needed for the simple |
| 33 | +functions here, so generating smaller versions to store alongside this app helps with speed. |
| 34 | +These should be saved in the `data/seurat` directory, each named as `sn_id.seurat_small.Rda`. |
| 35 | + |
| 36 | +#### Shrinking Seurat objects |
| 37 | + |
| 38 | +Here is an example function that takes a Seurat object as input, deletes some of the |
| 39 | +content not needed for the app, and returns the shrunken object: |
| 40 | + |
| 41 | +```r |
| 42 | + |
| 43 | +reduce_seurat <- function(seurat, |
| 44 | +                          n_pcs = 5, |
| 45 | +                          keep_raw_data = FALSE) { |
| 46 | + |
| 47 | +  # Remove scale.data |
| 48 | +  seurat@scale.data <- NULL |
| 49 | + |
| 50 | +  # Remove raw data, only used for scaling and differential expression |
| 51 | +  # as per https://satijalab.org/seurat/faq |
| 52 | +  if (!keep_raw_data) seurat@raw.data <- NULL |
| 53 | + |
| 54 | +  # Only keep data for the first few PCs (drop = FALSE keeps a matrix when n_pcs = 1) |
| 55 | +  seurat@dr$pca@cell.embeddings <- seurat@dr$pca@cell.embeddings[, 1:n_pcs, drop = FALSE] |
| 56 | +  seurat@dr$pca@gene.loadings <- seurat@dr$pca@gene.loadings[, 1:n_pcs, drop = FALSE] |
| 57 | + |
| 58 | +  if (!all(dim(seurat@dr$pca@gene.loadings.full) == 0)) { |
| 59 | + |
| 60 | +    seurat@dr$pca@gene.loadings.full <- seurat@dr$pca@gene.loadings.full[, 1:n_pcs, drop = FALSE] |
| 61 | + |
| 62 | +  } |
| 63 | + |
| 64 | +  return(seurat) |
| 65 | + |
| 66 | +} |
| 67 | + |
| 68 | +``` |
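To go from a shrunken object to a file the app can find, something like the following sketch may be useful. `save_small` is a hypothetical helper, and the variable name the object is saved under (`seurat`) is an assumption: check how `app.R` loads the file and match the name it expects.

```r
# Hypothetical helper: shrink an object and save it using the
# sn_id.seurat_small.Rda naming convention described in this README.
# `reduce_fn` would normally be the `reduce_seurat` function above.
save_small <- function(seurat, sample_id, reduce_fn,
                       out_dir = file.path("data", "seurat")) {
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  seurat <- reduce_fn(seurat)
  out_path <- file.path(out_dir, paste0(sample_id, ".seurat_small.Rda"))
  # ASSUMPTION: the app loads an object called `seurat` from the file.
  save(seurat, file = out_path)
  invisible(out_path)
}

# Not run: assumes `s1` is a Seurat object already in memory.
# save_small(s1, "s1_id", reduce_fn = reduce_seurat)
```

Passing the shrinking function as an argument keeps the helper independent of Seurat, so the file naming can be tried out on any R object first.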
| 69 | + |
| 70 | +#### Other requirements |
| 71 | + |
| 72 | +- The app will title certain plots using the `@project.name` slot in the Seurat |
| 73 | +objects |
| 74 | +- The app expects that a named character vector (names matching cluster names and |
| 75 | +values corresponding to colours) is stored at `@misc$colours` |
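As a rough illustration of the second requirement, the sketch below builds a named colour vector with one entry per cluster. `make_cluster_colours` is a hypothetical helper and the rainbow palette is an arbitrary choice. The commented assignments assume a Seurat v2-style object whose cluster identities are stored in `@ident`.

```r
# Hypothetical helper: one colour per cluster, named by cluster.
make_cluster_colours <- function(clusters) {
  clusters <- sort(unique(as.character(clusters)))
  stats::setNames(grDevices::rainbow(length(clusters)), clusters)
}

# Example with made-up cluster labels.
colours <- make_cluster_colours(c("0", "1", "2", "1"))
names(colours)

# Not run: assumes `seurat` is a Seurat v2 object in memory.
# seurat@misc$colours <- make_cluster_colours(seurat@ident)
# seurat@project.name <- "Sample 1"
```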
| 76 | + |
| 77 | +### Cluster markers |
| 78 | + |
| 79 | +Cluster markers can also be provided, allowing them to be searched and filtered |
| 80 | +in a separate tab of the app. The output of `Seurat::FindAllMarkers` for each sample can |
| 81 | +be written to a TSV file in the `data/markers` directory, named as `sn_id.markers.tsv`. |
| 82 | + |
| 83 | +These are the expected columns: |
| 84 | + |
| 85 | +``` |
| 86 | +p_val avg_logFC pct.1 pct.2 p_val_adj cluster external_gene_name ensembl_gene_id gene_biotype description |
| 87 | +``` |
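A minimal sketch for writing a markers table in this format and checking its header is shown below. `write_markers` and `missing_marker_cols` are hypothetical helpers, not functions from the app or from Seurat. The sketch assumes the markers are already in a data frame with the columns listed above.

```r
# Columns the app expects, as listed in this README.
expected_cols <- c("p_val", "avg_logFC", "pct.1", "pct.2", "p_val_adj",
                   "cluster", "external_gene_name", "ensembl_gene_id",
                   "gene_biotype", "description")

# Hypothetical helper: write a markers data frame as sn_id.markers.tsv.
write_markers <- function(markers, sample_id,
                          out_dir = file.path("data", "markers")) {
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  out_path <- file.path(out_dir, paste0(sample_id, ".markers.tsv"))
  write.table(markers, out_path, sep = "\t", quote = FALSE, row.names = FALSE)
  invisible(out_path)
}

# Hypothetical helper: expected columns that are absent from a markers file.
missing_marker_cols <- function(path) {
  header <- read.delim(path, nrows = 1, check.names = FALSE)
  setdiff(expected_cols, colnames(header))
}

# Not run: assumes `markers` is a data frame holding your annotated markers.
# path <- write_markers(markers, "s1_id")
# missing_marker_cols(path)
```

`check.names = FALSE` keeps column names such as `pct.1` exactly as written in the file.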
| 88 | + |
| 89 | +The `server` function in the `app.R` script can be easily modified to suit the |
| 90 | +columns of your markers files, at the step which generates `output$markers`. |
| 91 | + |
| 92 | +## 2. Prepare `seurat_genes.Rda` |
| 93 | + |
| 94 | +We save the gene lists for each sample to allow them to be searched by the user |
| 95 | +in the app when visualizing expression. To prepare this list: |
| 96 | + |
| 97 | +1. Populate the `data` directory |
| 98 | +2. Modify the indicated lines in `get_genes.R` with your sample IDs |
| 99 | +3. Run `Rscript get_genes.R` from the top level of the directory storing your app |
| 100 | + |
| 101 | +## 3. Modify `app.R` |
| 102 | + |
| 103 | +Modify the `app.R` script at the indicated lines with your sample IDs. In the example script, |
| 104 | +there are two collections of samples, but you can have as many collections as you like, |
| 105 | +each containing as many samples as you like. |
| 106 | + |
| 107 | +NOTE: The `sn_id` fields **must** match exactly the sample IDs used to name your data. |
| 108 | +The `Sample n name` fields can be friendlier versions of the sample IDs, or exactly the same; nothing depends on these and they are just for listing the available datasets. |
| 109 | + |
| 110 | +## 4. Open `app.R` in RStudio and hit `Run App`! |