Merge branch 'master' of github.com:Acribbs/scflow
Acribbs committed May 22, 2023
2 parents 6e5c213 + bf7f800 commit 7ea87ae
Showing 8 changed files with 490 additions and 362 deletions.
26 changes: 12 additions & 14 deletions README.md
@@ -14,26 +14,20 @@ This repository contains a collection of pipelines that aid the analysis of sing

## Installation

### pip install

You can install scflow using pip. This installs only the package, without any dependencies, which have to be installed separately::

pip install scflow

### Conda installation - **in progress**

The preferred method for installation is through conda. This installation is currently still a work in progress. Preferably the
The preferred method for installation is through conda/mamba. Preferably the
installation should be in a separate environment::

conda create -n scflow -c cgat scflow
mamba env create -f conda/environments/scflow.yml
conda activate scflow
scflow --help

### Manual installation
python setup.py develop

The repository can also be installed manually, but dependencies will need to be installed separately::

python setup.py install
# Install a specific version of kb-tools from Adam's cloned repo
git clone git@github.com:Acribbs/kb_python.git
cd kb_python
python setup.py develop

scflow --help

## Usage
@@ -86,6 +80,10 @@ code can be found at [read the docs](http://single-cell.readthedocs.io/)

- [ ] [Overview of the seurat doublet-4 pipeline](docs/pipelines/seurat_doublet-4.md)

## seurat integration-5

- [ ] [Overview of the seurat integration-5 pipeline](docs/pipelines/seurat_integration-5.md)

# Project Info

- [ ] [Contributors](docs/project_info/Contributing.rst)
13 changes: 10 additions & 3 deletions conda/environments/scflow.yml
@@ -26,17 +26,24 @@ dependencies:
- setuptools
- statsmodels
- r-base
- bioconductor-biomart
- bioconductor-singlecellexperiment
- r-optparse
- r-seurat
- r-tidyverse
- r-ggthemes
- bioconductor-scdblfinder
- scanpy
- scipy
- papermill
- leidenalg
- bioconductor-busparse
- bustools
- bioconductor-tximport
- bioconductor-dropletutils
- bioconductor-celda
- bioconductor-scdblfinder
- bioconductor-singlecellexperiment
- bioconductor-biomart
- bioconductor-scuttle
- bioconductor-glmgampoi
- r-harmony
- r-seuratdisk
- r-clustree
Empty file.
51 changes: 51 additions & 0 deletions docs/pipelines/seurat_integration-5.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
## seurat integration-5

This pipeline runs after the quantnuclei, qc-1 and filter-2 pipelines, and optionally after cluster-3.

It takes filtered Seurat objects and integrates them using two methods, Seurat and Harmony.
Each dataset is first normalised individually with SCTransform, and Seurat then performs integration and clustering.
The integrated data can then be passed to Harmony for further integration and clustering.
The integrations produced by both methods can be visualised as tSNE and UMAP dimensional reductions.

**Overview**
Runs the R script "seurat_integrate.R"
Runs the R script "harmony_integrate.R"
Runs Rmd file "Integration.Rmd" to visualise the results.

**Commands:**

Configure the pipeline.yml file

> scflow seurat integration-5 config

Run the pipeline

> nohup scflow seurat integration-5 make full -v5

### seurat_integrate.R

**Inputs:**
Filtered, clustered Seurat object.
(A filtered Seurat object from the output of filter-2 would also work.)

**Steps:**
- Read in the .yml file
- Extract parameters from the .yml file
- Read in the samples
- Perform integration steps (creates a new Seurat object which contains the integrated data):
  - Normalise the data by SCTransform
  - Select variable features that are common across the Seurat objects
  - Use PrepSCTIntegration to get residuals for all features
  - Identify anchors
  - Integrate the data
- Perform PCA
- Find Neighbours
- Find Clusters
- Perform UMAP
- Plot and save UMAPs
- Perform tSNE
- Plot and save tSNEs
- Save integrated object as RDS file

**Outputs**
UMAP and tSNE Plots
Integrated RDS object
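
The step "Select variable features that are common across the Seurat Objects" can be illustrated with a small sketch. This is not the code from seurat_integrate.R and not Seurat's implementation. It only shows the general idea of ranking features by how many datasets list them as variable and breaking ties by median rank. The function name `select_shared_features` and the toy gene labels are hypothetical.

```python
from statistics import median


def select_shared_features(ranked_lists, n_features):
    """Pick features that are variable in as many datasets as possible.

    ranked_lists: one list per dataset, most variable feature first.
    n_features:   how many features to return.

    Illustrative only: this mirrors the idea of choosing integration
    features shared across datasets, not the pipeline's actual code.
    """
    # Record the rank of each feature in every dataset where it appears.
    ranks = {}
    for ranked in ranked_lists:
        for rank, feature in enumerate(ranked):
            ranks.setdefault(feature, []).append(rank)

    # Sort by: number of datasets (descending), median rank (ascending),
    # then feature name so the result is deterministic.
    ordered = sorted(
        ranks,
        key=lambda f: (-len(ranks[f]), median(ranks[f]), f),
    )
    return ordered[:n_features]


if __name__ == "__main__":
    # Three toy datasets with hypothetical feature labels.
    datasets = [
        ["A", "B", "C"],
        ["B", "A", "D"],
        ["B", "C", "E"],
    ]
    print(select_shared_features(datasets, 3))
```

Features found in only one dataset are ranked last, so they are selected only if `n_features` exceeds the number of shared features.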
29 changes: 18 additions & 11 deletions docs/pipelines/seurat_qc-1.md
@@ -17,26 +17,33 @@ The pipeline pipeline_qc-1.py runs an R markdown file called QC.Rmd to assess th
**Inputs:**

Count matrix generated from quantnuclei pipeline
Patient metadata as tab-delimited file


**Steps:**
1. Read in the count matrix
2. Create Seurat object, Normalize data, scale data and find variable features
3. Create some ggplot themes
4. Add columns to metadata: nUMI, nGene, log10GenesPerUMI
5. Identify mitochondrial genes and add to metadata
6. Map ensembl symbols to hgnc symbols using biomart
7. Filter out genes and cells with low number of counts
8. Create SingleCellExperiment object and save RDS files
2. Create Seurat object
3. Generate additional QC metrics: percent mitochondrial genes, log10GenesPerUMI
4. Save the mapping reference for Ensembl gene names to gene IDs
5. Add gene symbols to the meta.features of the RNA assay
6. Add patient metadata
7. Convert to SingleCellExperiment
- Identify and remove empty droplets (NB empty droplets must be removed in order to perform doublet detection).
- Identify ambient RNA
- Identify doublets
8. Save Seurat objects and SingleCellExperiment objects as RDS files (both complete and with empty droplets filtered out)
9. Plot and save QC metrics
- Cell counts per sample
- UMI counts per cell
- Genes detected per cell
- UMIs vs genes detected
- Mitochondrial counts ratio
- UMIs vs genes detected
- Novelty

**Outputs:**

QC.Rmd knitted to html
SingleCellExperiment and Seurat Object RDS objects saved in RDS_objects.dir
QC plots saved as .eps files in QC_Figures.dir
QC.Rmd knitted to html

SingleCellExperiment and Seurat Object RDS objects saved in RDS_objects.dir/unfiltered
QC plots saved as .png files in QC_Figures.dir
mapping.txt saved in Files.dir
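
The QC metrics named in the steps above (percent mitochondrial genes, log10GenesPerUMI) can be sketched for a single cell as follows. This is a hedged illustration, not the code in QC.Rmd. It assumes that mitochondrial genes are identified by a human-style `MT-` symbol prefix and that log10GenesPerUMI is computed as log10(genes detected) / log10(UMIs), which is a common definition of the "novelty" score. The function name `qc_metrics` is hypothetical.

```python
import math


def qc_metrics(counts, gene_names, mito_prefix="MT-"):
    """Compute simple per-cell QC metrics from raw UMI counts.

    counts:      UMI count per gene for one cell.
    gene_names:  gene symbols, in the same order as counts.
    mito_prefix: assumed prefix marking mitochondrial genes.
    """
    n_umi = sum(counts)
    n_gene = sum(1 for c in counts if c > 0)
    mito_umi = sum(
        c for c, g in zip(counts, gene_names) if g.startswith(mito_prefix)
    )

    # Percentage of UMIs that come from mitochondrial genes.
    percent_mito = 100.0 * mito_umi / n_umi if n_umi > 0 else 0.0

    # Novelty score; undefined when log10(n_umi) would be zero or invalid.
    if n_umi > 1 and n_gene > 0:
        log10_genes_per_umi = math.log10(n_gene) / math.log10(n_umi)
    else:
        log10_genes_per_umi = float("nan")

    return {
        "nUMI": n_umi,
        "nGene": n_gene,
        "percent_mito": percent_mito,
        "log10GenesPerUMI": log10_genes_per_umi,
    }


if __name__ == "__main__":
    genes = ["MT-CO1", "ACTB", "GAPDH", "MT-ND1"]
    cell = [10, 60, 0, 30]
    print(qc_metrics(cell, genes))
```

Cells with a high mitochondrial percentage or a low novelty score are typical candidates for removal in a later filtering step.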
2 changes: 1 addition & 1 deletion scpipelines/R/plot_barnyard.R
@@ -8,7 +8,7 @@ library(optparse)

option_list = list(
make_option(c("-i", "--input"), type="character", default="",
help="input file GTF file [default= %default]", metavar="character"),
help="input file path for mtx gene matrix location [default= %default]", metavar="character"),
make_option(c("-o", "--out"), type="character", default="",
help="output file name [default= %default]", metavar="character")
);