Author: Kaitlin Sullivan
Please contact Kaitlin about usage and credit.
The goal of RUHi is to analyze and visualize mFISH! Stay tuned for exciting features such as integration with scRNA-seq data!
This repo contains the most recent version of RUHi.
You can install RUHi from this github repo with:
devtools::install_github("cembrowskilab/RUHi")
IF YOU RECEIVE THIS ERROR:
Using github PAT from envvar GITHUB_PAT
Error: Failed to install 'unknown package' from GitHub:
HTTP error 401.
Bad credentials
USE:
Sys.unsetenv("GITHUB_PAT")
devtools::install_github("cembrowskilab/RUHi")
See @kaitsull's repo for the developer's version.
Once installed, load the package normally:
library(RUHi)
If you are updating to a newer version of the repo:
#remove old version
remove.packages('RUHi')
#reinstall from here or from the kaitsull/RUHi github
devtools::install_github("cembrowskilab/RUHi")
We will be using a single-section dataset from our eLife paper.
Raw files used for this analysis come directly from FIJI Quantification and can be found in this repo.
# Kaitlin Sullivan 2022
#after following the installation instructions...
#load package
library(RUHi)
#set the working directory using absolute path or here() function
#or simply provide full path in function
#check out what a function does via:
# ?ruMake (in the console)
We will use ruRead() to read multiple FIJI Quantification files into a single data frame. Then we can optionally use ruCombine() to concatenate multiple data frames from separate experiments.
First read your quantified gene tables from your analyzedTables folder into a single data frame.
When using ruRead(), please specify:
- region: where is this image? (eg: "intermediate_claustrum")
- anum: animal number
- section: a unique identifying number for your image
#create a data frame
mydata <- ruRead("~/RUHi/inst/extdata", region = "intermediate", anum = "123456", section = "1")
#make sure all your genes are named correctly before continuing
#you should have columns named:
#X,Y,id,region,section,anum and all of your genes
head(mydata)
If you have multiple sections to analyze, read them individually into data frames and then use ruCombine() as shown below:
#combine a list of data frames from ruRead()
combo <- ruCombine(list(data1, data2, data3))
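To make the idea of stitching sections together with unique ids concrete, here is a minimal base-R sketch. It is not ruCombine()'s actual implementation: the helper `combine_sections()`, the toy tables, and the `section_id` naming scheme are all assumptions made for illustration.

```r
# Toy stand-ins for ruRead() output (hypothetical values, not real data)
data1 <- data.frame(X = c(1, 2), Y = c(3, 4), id = c("1", "2"),
                    region = "intermediate", section = "1", anum = "123456",
                    Slc17a7 = c(0.5, 0.0), stringsAsFactors = FALSE)
data2 <- data.frame(X = c(5, 6), Y = c(7, 8), id = c("1", "2"),
                    region = "intermediate", section = "2", anum = "123456",
                    Slc17a7 = c(0.2, 0.9), stringsAsFactors = FALSE)

# Hypothetical helper: stack the sections row-wise, then rebuild the ids
# so that no two cells from different sections share the same id
combine_sections <- function(tables) {
  stacked <- do.call(rbind, tables)
  stacked$id <- paste(stacked$section, stacked$id, sep = "_")
  rownames(stacked) <- NULL
  stacked
}

toy_combo <- combine_sections(list(data1, data2))
toy_combo$id
```

The point of the sketch is only that cell ids restart in every image, so some section-specific prefix is needed before the tables can safely share one data frame.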
3. Create an mFISH Object
This is a special class of object that encapsulates both raw and analyzed data as well as important metadata from the analysis. This makes for more reproducible analyses!
#turn an individual section or combination of sections into an object for analysis
myobj <- ruMake(mydata)
To "auto-analyze" your data, use goFISH(). This function launches a ShinyApp that will allow you to easily test out variable values and visualize your analysis, as well as download .eps versions of the figures.
Note: this function works best with a single image or a few combined images; it gets noticeably slower as you add more data. The number of principal components used (npc) is automatically set to half the number of input genes and is not optimized as in Section 5b.
#you have optional time-saving arguments that can pre-select the filtering value and number of clusters prior to running the app
goFISH(myobj, filter.by = 'Slc17a7', k = 5)
#when you are happy with the way your analysis looks, press "Download Object"
#to read back in your saved .RDS file, simply use:
myobj <- readRDS("path/to/object")
#here we filter for excitatory cells which are Slc17a7+
myobj <- ruFilter(myobj, filter.by = 'Slc17a7', threshold = 0.1)
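Conceptually, filtering by a gene at a threshold keeps only the cells that express that gene above a cutoff. The base-R sketch below illustrates that idea on a toy table; `filter_by_gene()` is a hypothetical helper, and the strict `>` comparison is an assumption rather than a statement about how ruFilter() treats cells exactly at the threshold.

```r
# Toy expression table (hypothetical values); each row is one cell
toy <- data.frame(id = c("a", "b", "c", "d"),
                  Slc17a7 = c(0.00, 0.05, 0.30, 0.80),
                  Ctgf    = c(0.10, 0.00, 0.40, 0.00),
                  stringsAsFactors = FALSE)

# Hypothetical helper: keep cells whose expression of the chosen gene
# is strictly above the threshold (the strictness is an assumption)
filter_by_gene <- function(df, filter.by, threshold) {
  df[df[[filter.by]] > threshold, , drop = FALSE]
}

kept <- filter_by_gene(toy, filter.by = "Slc17a7", threshold = 0.1)
kept$id
```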
Normalize and run a PCA.
#run normalization (with optional arg called remove.outliers to remove autofluorescent cells)
myobj <- ruProcess(myobj)
The code automatically takes PCs that contribute >95% variance to the dataset. You can use plotVar() to see the variance contribution of each PC; the red line denotes the automatically selected number of PCs.
#plot variance -
#optional argument: metric = "stdev", "cumulative", "variance"
plotVar(myobj)
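For readers who want to see what a variance-based cutoff looks like outside the package, here is a small sketch using base R's prcomp(). It assumes the 95% rule is applied to cumulative variance, which is one plausible reading of the description above and not something confirmed by the package; the random matrix is purely illustrative.

```r
set.seed(1)
# Toy matrix: 200 cells x 6 "genes" (random values, for illustration only)
expr <- matrix(rnorm(200 * 6), nrow = 200,
               dimnames = list(NULL, paste0("gene", 1:6)))

pca <- prcomp(expr, center = TRUE, scale. = TRUE)

# Proportion of variance explained by each PC, and the running total
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cum_var <- cumsum(var_explained)

# Smallest number of PCs whose cumulative variance reaches 95%
# (assumed interpretation of the cutoff)
n_pcs <- which(cum_var >= 0.95)[1]
n_pcs
```

The three quantities here (`pca$sdev`, `var_explained`, `cum_var`) correspond loosely to the "stdev", "variance", and "cumulative" metrics named in the plotVar() comment.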
Learn more about UMAP here.
#run a UMAP - the values for which are found in myobj@attributes
myobj <- ruUMAP(myobj)
This will run Hierarchical Clustering on the data.
#populate metaData with cluster column
myobj <- ruCluster(myobj, k = 5)
Check out how the number of clusters looks in the dendrogram with plotDendro(). You can re-run the previous line to alter the number of clusters.
#plot dendrogram - optional argument: polar = T (makes circular)
plotDendro(myobj)
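The relationship between the dendrogram and k can be sketched with base R's hclust() and cutree(). This is an illustration of hierarchical clustering in general: the Euclidean distance, the Ward linkage, and the random toy data are all assumptions, not a description of what ruCluster() does internally.

```r
set.seed(1)
# Toy 2-D embedding: 60 points (random values, for illustration only)
embedding <- matrix(rnorm(60 * 2), ncol = 2)

# Build the tree once; distance metric and linkage are assumptions
tree <- hclust(dist(embedding), method = "ward.D2")

# Cutting the same tree at different k gives different cluster labels,
# which is why k can be changed without recomputing the tree
clusters_k5 <- cutree(tree, k = 5)
clusters_k3 <- cutree(tree, k = 3)

table(clusters_k5)
```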
Plot in X,Y space with plotSpace():
Automatically coloured by cluster
plotSpace(myobj)
Optional argument group.by to group by section, animal number, or other variable (eg cluster):
plotSpace(myobj, group.by = 'cluster')
Optional argument colour.by to colour data by gene expression or metadata values:
#plot in space but change to a gene or metadata value
plotSpace(myobj, colour.by = 'Ctgf', include.fil = F)
Combine multiple arguments to get a better feel for how the data looks:
#plot in space with separation by cluster (group.by is useful for viewing multiple sections as well)
plotSpace(myobj, group.by = 'cluster', colour.by = 'Ctgf')
Plot in UMAP space with plotDim():
#auto coloured by cluster
plotDim(myobj)
This also has the colour.by option for gene expression and metadata:
#option to colour by gene/metadata
plotDim(myobj, colour.by='Ctgf')
Plot expression of a given gene across clusters using geneBoxPlot():
#coloured by cluster ID
geneBoxPlot(myobj, 'Ctgf')
Plot expression of all genes in a given cluster with clusterBoxPlot():
#autogenerated rainbow colouring scheme for genes
clusterBoxPlot(myobj, clus='5')
You can also plot the expression of every gene in every cluster with this:
#autogenerated rainbow colouring scheme for genes
clusterBoxPlot(myobj)
#you can continually re-run these functions until you get an analysis that you are happy with
#it is HIGHLY SUGGESTED you save your object, this way you can share your data and all of the parameters used to get there
#### SAVE VIA: saveRDS(myobj, path)
#### READ IN VIA: myobj <- readRDS(path)
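A quick round trip shows the argument order for saving and reloading. The list below merely stands in for an mFISH object, and the temporary file path is a placeholder for wherever you choose to keep your analysis.

```r
# Any R object can be saved this way; a list stands in for an mFISH object
toy_obj <- list(k = 5, filter.by = "Slc17a7")

path <- tempfile(fileext = ".RDS")   # placeholder location for the demo
saveRDS(toy_obj, file = path)        # object first, then the file path

restored <- readRDS(path)
identical(toy_obj, restored)
```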
RUHi makes use of an mFISH Object that encapsulates the many stages of one’s analysis for easy reproducibility.
The mFISH Object contains 4 main elements:
- @rawData: A data frame containing unfiltered non-normalized data
- @filteredData: A data frame containing filtered and normalized data
- @metaData: A data frame containing metadata for each cell
- @attributes: A list containing all of the analysis values utilized
Each core function within the package interacts with the elements of the object so you don’t have to.
However, if you wish to do more advanced analysis, you can access each element by using the @ accessor (eg: object@metaData). From there you can subset like a regular data.frame or list with the $ accessor (eg: object@attributes$pca).
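If the @ and $ accessors are unfamiliar, the sketch below builds a toy S4 class with the same four slot names and reads from it. The class name `toyFISH`, the toy values, and the `npc` entry are hypothetical; this is not RUHi's actual class definition, only an illustration of how slot access works in R.

```r
library(methods)

# A toy class with the same four slot names (not RUHi's own definition)
setClass("toyFISH", slots = c(
  rawData      = "data.frame",
  filteredData = "data.frame",
  metaData     = "data.frame",
  attributes   = "list"
))

obj <- new("toyFISH",
           rawData      = data.frame(id = c("a", "b"), Ctgf = c(0.1, 0.4),
                                     stringsAsFactors = FALSE),
           filteredData = data.frame(id = "b", Ctgf = 0.4,
                                     stringsAsFactors = FALSE),
           metaData     = data.frame(id = c("a", "b"), cluster = c("1", "2"),
                                     stringsAsFactors = FALSE),
           attributes   = list(npc = 3))

obj@metaData$cluster   # a column of a data.frame slot
obj@attributes$npc     # an element of the list slot
```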
Currently the package has 7 core functions, 4 plotting functions, and a Shiny App deployment function:
ruRead()
- takes quantified tables from FIJI and combines them into a table for analysis
- NOTE: if you do not specify region, section, anum, they will be filled with NA values. It is highly suggested to fill in these optional arguments with:
  - section = experiment number in quotations (eg: “4”)
  - region = region imaged (eg: “anterior”)
  - anum = animal number in quotations (eg: “123456”)
ruCombine()
- takes multiple rounds from ruRead() and stitches them together with unique ids
ruMake()
- creates an mFISH object from tables generated by ruRead() or ruCombine()
- populates the @rawData and @metaData elements of the mFISH object
- NOTE: if you are reading in a pre-existing data frame with gene expression data, these tables must include the metadata columns: X,Y,id,section,region,anum
ruFilter()
- filter data by a gene at a certain threshold, eg: ruFilter(object, 'Slc17a7', 0.1)
- populates the @filteredData element of the mFISH object
ruProcess()
- normalize and run PCA
- optional argument to remove outliers that are autofluorescent and therefore express every gene
  - remove.outliers = c(0,11) would remove cells expressing no genes or expressing every gene (assuming 12 genes with one filtered out)
- alters the @filteredData element of the mFISH object
ruUMAP()
- run a UMAP on the PCA (with option to select number of pcs and alter the hyperparameters of the UMAP)
ruCluster()
- cluster the data via hierarchical clustering
- populates a cluster column within the @metaData
goFISH()
- launches the gone mFISHing shiny app
- App is best used for quickly looking through single sections as it gets slower computationally the larger your data is
- (NOTE: must use object generated via ruMake)
plotSpace()
- plot an object in geographic space, coloured by any gene or metadata variable
plotDim()
- plot an object in dimensionally reduced space, coloured by any gene or metadata variable
geneBoxPlot()
- plot a boxplot of a given gene per cluster
clusterBoxPlot()
- plot a boxplot of all the genes in a given cluster
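The remove.outliers argument listed above under the normalization step is easiest to understand with a toy example. The base-R sketch below counts how many genes each cell expresses and drops cells whose count matches one of the listed values. Treating "expressed" as expression greater than zero is an assumption, and the code is not the package's own implementation.

```r
# Toy table: 4 cells x 3 genes (hypothetical values)
toy <- data.frame(geneA = c(0, 0.2, 0.5, 0.0),
                  geneB = c(0, 0.0, 0.3, 0.4),
                  geneC = c(0, 0.1, 0.9, 0.0))

# Number of genes each cell expresses (expression > 0 is an assumption)
n_expressed <- rowSums(toy > 0)

# Drop cells expressing no genes (0) or every gene (3 in this toy table),
# mirroring the c(0, 11) example for an 11-gene panel
remove.outliers <- c(0, ncol(toy))
cleaned <- toy[!(n_expressed %in% remove.outliers), , drop = FALSE]
cleaned
```

In this toy table the first cell expresses nothing and the third expresses everything, so only the second and fourth cells remain.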
loadData, plotCluster, plotGene, plotViolin
You can also access function documentation via:
help(goFISH)
#or
?goFISH