Skip to content

Reconstruction and augmentation of the Wang B. et al. (2021) paper's computational framework for dissecting metastatic gastric cancer through their provided single-cell transcriptomics dataset.

License

Notifications You must be signed in to change notification settings

BulutHamali/scRNAseq-Pipeline-GastricCancer

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

12 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Single-Cell RNA Sequencing Analysis Pipeline for Gastric Cancer (in progress)

Project Overview

This repository presents an advanced computational pipeline for analyzing single-cell RNA sequencing (scRNA-seq) data from gastric cancer samples. Our analysis framework builds upon and extends the methodologies discussed in recent gastric cancer research, particularly focusing on the GEO dataset GSE158631. The pipeline integrates state-of-the-art techniques for cell population identification, gene expression analysis, and pathway enrichment.

Key Features

  • Comprehensive preprocessing and quality control
  • Multi-modal dimensionality reduction analysis
  • Advanced clustering algorithms implementation
  • Automated marker gene identification
  • Gene Ontology (GO) enrichment analysis
  • Interactive visualization tools
  • Reproducible workflow

Installation and Setup

Clone the Repository

git clone git@github.com:BulutHamali/scRNAseq-Pipeline-GastricCancer.git

Dependencies Required Python packages:

-pandas (data manipulation) -numpy (numerical computations) -scanpy (single-cell analysis) -anndata (annotated data) -matplotlib & seaborn (visualization) -scikit-learn (machine learning) -umap-learn (dimensional reduction) -scipy (scientific computing) -gprofiler-official (functional enrichment)

Install all dependencies:

pip install -r requirements.txt

Pipeline Components

  1. Data Processing Module
pythonCopyfrom src.preprocessing.data_processor import GastricCancerProcessor
processor = GastricCancerProcessor('data/raw/counts_matrix.csv')
processor.preprocess()
processor.normalize()
processor.filter_cells()
  1. Dimensional Reduction
pythonCopyfrom src.analysis.dimension_reducer import DimensionReducer
reducer = DimensionReducer(processed_data)
reducer.run_pca()
reducer.run_umap()
reducer.run_tsne()
  1. Clustering Analysis
pythonCopyfrom src.analysis.cluster_analyzer import ClusterAnalyzer
analyzer = ClusterAnalyzer(reduced_data)
analyzer.perform_clustering(methods=['leiden', 'louvain', 'kmeans'])
analyzer.evaluate_clusters()
  1. Biological Interpretation
pythonCopyfrom src.analysis.gene_analyzer import GeneAnalyzer
gene_analyzer = GeneAnalyzer(clustered_data)
gene_analyzer.find_markers()
gene_analyzer.perform_go_analysis()

Documentation

Detailed tutorials and examples can be found in the 'analysis' directory Pipeline results are stored in the 'results' directory Documentation for each module is available in the 'documentation' directory

Project Structure

scRNAseq-Pipeline-GastricCancer/
├── analysis/                # Analysis notebooks and scripts
├── data/                   # Data directory
│   ├── raw/               # Raw count matrices and metadata
│   └── processed/         # Processed and normalized data
├── results/               # Analysis results
│   ├── figures/          # Generated visualizations
│   └── tables/           # Statistical outputs and gene lists
├── src/                   # Source code
│   ├── preprocessing/    # Data preprocessing modules
│   ├── analysis/        # Core analysis functions
│   └── visualization/   # Plotting and visualization tools
└── documentation/         # Project documentation and reports

Contributing

Contributions to improve the pipeline are welcome. Please read the contribution guidelines before submitting pull requests.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Based on scRNA-seq data from GEO (GSE158631) Inspired by current methodologies in single-cell analysis Developed using open-source bioinformatics tools

Author

Bulut Hamali

Contact

For questions or collaboration opportunities, please open an issue or contact through GitHub.

About

Reconstruction and augmentation of the Wang B. et al. (2021) paper's computational framework for dissecting metastatic gastric cancer through their provided single-cell transcriptomics dataset.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published