ZhiGroup/UDIP-FA

UDIP-FA: Unsupervised Deep Representation Learning of Fractional Anisotropy Maps

This repository contains the complete analysis pipeline for the study "Unveiling genetic architecture of white matter microstructure through unsupervised deep representation learning of fractional anisotropy maps".

[Figure 1]

🔬 Overview

This study introduces UDIP-FA (Unsupervised Deep Image Phenotyping of Fractional Anisotropy), a novel deep learning approach for analyzing white matter microstructure in brain imaging data. The pipeline includes:

  • Deep representation learning of FA maps using customized 3D AutoEncoders.
  • Genome-wide association studies (GWAS) on learned endophenotypes.
  • Polygenic risk score (PRS) associations with brain disorders.
  • Network-based drug targeting analysis.

🛠 Installation

Prerequisites

  • Python 3.8 or higher
  • R 4.0 or higher
  • Git

Python Dependencies

We recommend using a virtual environment (conda or venv).

# Create and activate environment
conda create -n udip-fa python=3.8
conda activate udip-fa

# Install dependencies from requirements.txt
pip install -r requirements.txt

Note: Make sure the installed PyTorch build is compatible with your CUDA driver.
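
A quick way to check the PyTorch/CUDA pairing is to ask PyTorch directly. The sketch below is not part of this repository; the helper name `describe_torch_cuda` is hypothetical, and it relies only on standard PyTorch attributes, falling back to a message if PyTorch is not installed.

```python
def describe_torch_cuda() -> str:
    """Return a one-line summary of the local PyTorch/CUDA setup."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment."
    # torch.version.cuda is None for CPU-only builds.
    return (
        f"torch {torch.__version__}, "
        f"built for CUDA {torch.version.cuda}, "
        f"GPU available: {torch.cuda.is_available()}"
    )


if __name__ == "__main__":
    print(describe_torch_cuda())
```

If the last field reports `False` on a machine with a GPU, the installed build probably does not match the driver.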

R Dependencies

install.packages(c("ggplot2", "dplyr", "tidyr", "data.table", 
                   "ComplexHeatmap", "circlize", "RColorBrewer",
                   "cowplot", "ggpubr", "pheatmap"))

# Bioconductor packages
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(c("clusterProfiler", "org.Hs.eg.db", "DOSE"))

📁 Repository Structure

UDIP-FA/
├── Model/                     # Deep Learning Model & Scripts
│   ├── model.py               # AutoEncoder Architecture (PyTorch)
│   ├── dataset.py             # Dataset Loading Logic
│   ├── Train.py               # Training Script (PyTorch Lightning)
│   ├── inference.py           # Inference Script for generating embeddings
│   └── model_compare.py       # Analysis & Visualization scripts
├── FA_GWAS_all.ipynb          # Main GWAS Analysis Notebook
├── FA_all.R                   # Post-GWAS Analysis (R)
├── FA_network_drug_analysis.R # Network & Drug Analysis (R)
├── requirements.txt           # Python Project Dependencies
└── README.md                  # Project Documentation

🧠 UDIP-FA Model Usage

The deep learning model is located in the Model/ directory.

Data Preparation

Input data should be affine-registered MRI images in NIfTI format. Prepare a CSV file that lists the image paths in a column named mri_names (or specify your own column name during inference).
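
One way to build such a CSV is to scan a directory for NIfTI files. This is a minimal sketch, not a script from this repository: the function name `write_image_csv` and the directory layout are assumptions, and only the `mri_names` column name comes from the text above.

```python
import csv
from pathlib import Path


def write_image_csv(image_dir, csv_path, column="mri_names"):
    """Write one row per NIfTI file found under image_dir; return the row count."""
    image_dir = Path(image_dir)
    # Collect compressed and uncompressed NIfTI files, sorted for stable output.
    paths = sorted(
        p for p in image_dir.rglob("*")
        if p.is_file() and p.name.endswith((".nii", ".nii.gz"))
    )
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([column])  # header row with the expected column name
        for path in paths:
            writer.writerow([str(path)])
    return len(paths)
```

Calling `write_image_csv("/path/to/images", "/path/to/data.csv")` would produce a file suitable for the `--input_csv` argument, assuming the images are already registered.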

Training

To train the AutoEncoder from scratch:

python Model/Train.py

Note: Model/Train.py is configured to use PyTorch Lightning. Adjust hyperparameters (learning rate, batch size, GPUs) directly in the file or by modifying the LitAutoEncoder class.

Inference

To generate latent representations (endophenotypes) from a trained model:

python Model/inference.py --input_csv /path/to/data.csv \
                          --checkpoint /path/to/model.ckpt \
                          --output_dir /path/to/results

Common Arguments:

  • --input_csv: Path to CSV file with image paths.
  • --checkpoint: Path to the .ckpt model file.
  • --output_dir: Folder to save the output pickle files.
  • --device: cuda:0 or cpu.
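
For readers who want to wrap or script the command, the four flags above can be mirrored with `argparse`. This is only a sketch of the interface as listed here, not the actual parser in `Model/inference.py`: the `build_parser` name, which flags are required, and the `cpu` default are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented inference flags (illustrative only)."""
    parser = argparse.ArgumentParser(description="Sketch of the inference CLI flags.")
    parser.add_argument("--input_csv", required=True,
                        help="Path to CSV file with image paths.")
    parser.add_argument("--checkpoint", required=True,
                        help="Path to the .ckpt model file.")
    parser.add_argument("--output_dir", required=True,
                        help="Folder to save the output pickle files.")
    # The default device is an assumption; the README only lists the valid values.
    parser.add_argument("--device", default="cpu", help="cuda:0 or cpu.")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```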

Analysis

To analyze significant SNPs and feature correlations:

python Model/model_compare.py

This script includes functions to:

  1. Plot significant SNPs across different thresholds.
  2. Compute and visualize pairwise correlations (CCA, Pearson) between feature sets.
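
The two computations listed above can be sketched in plain Python. This is not the code in `Model/model_compare.py`: the function names and the example thresholds are assumptions, plotting is omitted, and only the Pearson case is shown (CCA is left out).

```python
import math


def count_significant(p_values, thresholds=(5e-8, 1e-6, 1e-5)):
    """Map each threshold to the number of p-values strictly below it."""
    return {t: sum(p < t for p in p_values) for t in thresholds}


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def pairwise_pearson(features_a, features_b):
    """Correlate every feature column in features_a with every column in features_b.

    Each argument is a list of columns; each column holds one value per subject.
    """
    return [[pearson(a, b) for b in features_b] for a in features_a]
```

The result of `pairwise_pearson` is a matrix with one row per feature in the first set and one column per feature in the second, which can then be passed to a heatmap routine.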

🧬 GWAS & Post-Analysis

The repository includes comprehensive scripts for the genetic analysis stages:

FA_GWAS_all.ipynb

This Jupyter notebook serves as the main entry point for the genetic analysis, covering:

  • UDIP-FA feature association analyses: Correlating deep learning features with genetic variants.
  • Polygenic Risk Score (PRS) associations: Investigating links between learned features and brain disorders.
  • Model Explainability: Interpretability assessments of the autoencoder features.
  • Comparative Analysis: Benchmarking against previous white matter studies.
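
As background for the PRS step, a polygenic risk score is conventionally a weighted sum of an individual's allele dosages, with per-variant weights taken from a GWAS of the disorder. The sketch below shows only that arithmetic; the function name is hypothetical, and how the notebook actually derives or applies its scores is not described here.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (typically 0 to 2) for one individual."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))
```

Scores computed this way for each subject can then be tested for association with each learned feature.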

FA_all.R

R script dedicated to post-GWAS statistical processing:

  • Result Aggregation: Filtering and summarizing GWAS statistics.
  • Figure Generation: Producing publication-ready plots (Manhattan plots, QQ plots).
  • Meta-analysis: Effect size calculations and statistical validation.

FA_network_drug_analysis.R

Advanced network analysis for biological insights:

  • Gene-Drug Interaction: Constructing networks to identify potential drug targets.
  • Therapeutic Targets: Highlighting genes actionable by existing drugs.
  • Mechanism of Action: Pathway analysis to understand underlying biological mechanisms.

🔄 Reproducibility

Pre-trained Models

The pretrained model can be accessed at this Google Drive Link.

Random Seeds

  • Python: np.random.seed(42)
  • R: set.seed(42)
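
On the Python side, a single helper can apply the seed to every random number generator in use. This is a sketch under assumptions: the README only states `np.random.seed(42)`, so seeding the standard library and PyTorch here is an addition, and the `seed_everything` name is hypothetical.

```python
import random


def seed_everything(seed: int = 42) -> None:
    """Seed Python's RNG and, when installed, NumPy and PyTorch."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)  # matches the seed call listed in this README
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)  # seeds PyTorch's default generators
    except ImportError:
        pass
```

Seeding alone does not guarantee bit-identical results on GPU, since some CUDA operations are nondeterministic.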

📚 Citation

If you use this code in your research, please cite:

@article{zhao2025udip,
  title={Unveiling genetic architecture of white matter microstructure through unsupervised deep representation learning of fractional anisotropy maps},
  author={Zhao, Xingzhong and Xie, Ziqian and He, Wei and Fornage, Myriam and Zhi, Degui},
  journal={medRxiv},
  year={2025},
  doi={10.1101/2025.07.04.25330856}
}

Keywords: white matter, fractional anisotropy, deep learning, GWAS, neuroimaging, brain imaging, genetics, biomarker