Consider to include Human Nephrogenesis Atlas expression values in manual curation of high-evidence genes #8

ninarank · 2023-06-27T14:09:08Z

Instead of or in addition to using Descartes fetal gene expression, consider using "Human Nephrogenesis Atlas".

The "Human Nephrogenesis Atlas" provides ssRNA seq data of two 14-weeks human kidneys (after QC: 8,316 cells (6,667 nephrogenic)).
They identified 18 cell clusters in the developing kidney. For each gene, you can get expression values of each cell cluster or of the total "Developmental Kidney". I suppose the "Developmental Kidney" expression value could be used if you only want 1 value per gene for the fetal kidney. This values is accessible via an API (script in progress).

So far, I did not find out, how exactly they calculate this expression value (which normalization technique etc.). I screened the original paper, but they didn't specify it there. I have not read the paper in detail, yet.

https://sckidney.flatironinstitute.org/
To get an example expression value: => Tab 'Gene Network' => select 'Developmental Kidney' => select gene of interest

ninarank · 2023-07-06T17:28:37Z

# This is a pre-version of a script using the Human Nephrogenesis Atlas - API 
# When querying one specific gene, the API queries some other genes of the gene functional network as well. Maybe this can be somehow avoided to make everything faster, but I didn't find out how. 
# TODO: Replace paths

library(httr)
library(jsonlite)
library(tidyverse)
library(progress)


# download HGNC gene table from github repository "kidney-genetics"
hgnc_gt_version <- "2023-06-21"
gene_table_url <- paste0("https://github.com/halbritter-lab/kidney-genetics/raw/main/analyses/A_AnnotationHGNC/results/non_alt_loci_set_coordinates.", hgnc_gt_version, ".csv.gz")
download.file(url = gene_table_url, destfile = paste0("gene_score/raw/HGNC_", hgnc_gt_version, ".csv.gz")) # TODO: Replace paths
gunzip(filename = paste0("gene_score/raw/HGNC_", hgnc_gt_version, ".csv.gz"), 
       destname = paste0("gene_score/raw/HGNC_", hgnc_gt_version, ".csv")) # TODO: Replace paths

# load HGNC gene table and filter for protein-coding genes 
HGNC_table <- read.csv("gene_score/raw/HGNC_2023-06-21.csv") %>% 
  filter(locus_group == "protein-coding gene") # TODO: Replace paths


# function for getting expression values via the Human nephrogenesis atlas API
get_nga_expression <- function(cell_cluster, entrez_id_vec){
  # set query url
  entrez_id_collapse <- paste(paste0("entrez=", entrez_id_vec), collapse = "&")
  url <- "https://sckidney.flatironinstitute.org/integrations/query/"
  query_url <- paste0(url, cell_cluster, "?", entrez_id_collapse)
  
  # set the header
  headers <- c("Content-Type" = "application/json")
  
  # get the response
  response <- GET(query_url, add_headers(headers))
  
  # get the response content
  content <- content(response, as="text", encoding = "UTF-8")
  content_list <- fromJSON(content)
  
  # select queried genes
  res <- data.frame(content_list$genes) %>% 
    filter(query == TRUE) %>% 
    dplyr::select(standard_name, systematic_name, entrez_id = entrez, expression_value) %>% 
    mutate(cell_cluster = cell_cluster)
  
  return(res)
}

# get expression values chunk wise (all genes at once not possible)
entrez_table <- HGNC_table %>% dplyr::select(entrez_id) %>% unique()
entrez_split_list <- split(entrez_table$entrez_id, ceiling(seq_along(entrez_table$entrez_id) / 250)) 
expr_vals_df <- data.frame()
cell_clusters <- c(57) # 57 = Developmental Kidney
pb <- progress_bar$new(total = length(entrez_split_list)*length(cell_clusters))


for (cell_cluster in cell_clusters){
  for (i in entrez_split_list){
    pb$tick()  # Increment progress bar
    cat("\n")
    sub_df <- get_nga_expression(cell_cluster, i)
    
    expr_vals_df <- dplyr::bind_rows(expr_vals_df, sub_df)
  }
}


View(expr_vals_df)

# write csv ...

berntpopp self-assigned this Nov 20, 2023

berntpopp added the enhancement New feature or request label Nov 20, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consider to include Human Nephrogenesis Atlas expression values in manual curation of high-evidence genes #8

Consider to include Human Nephrogenesis Atlas expression values in manual curation of high-evidence genes #8

ninarank commented Jun 27, 2023 •

edited

Loading

ninarank commented Jul 6, 2023 •

edited

Loading

Consider to include Human Nephrogenesis Atlas expression values in manual curation of high-evidence genes #8

Consider to include Human Nephrogenesis Atlas expression values in manual curation of high-evidence genes #8

Comments

ninarank commented Jun 27, 2023 • edited Loading

ninarank commented Jul 6, 2023 • edited Loading

ninarank commented Jun 27, 2023 •

edited

Loading

ninarank commented Jul 6, 2023 •

edited

Loading