NBChub/strainmap

strainmap

Map of the G1034 NPGM actinomycete strains published in NAR.

Open notebook here.

Overview

This repository contains the processed data, the Jupyter notebook, and the resulting figures for mapping the NPGM isolates/strains.

Important Note: Two genomes are excluded from the analysis because the exon ordering of their features is inconsistent: NBC_01310 (NBC_0131000000000_213589.current.gb) and NBC_01080 (NBC_0108000000000_76298.current.gb).
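
If you work with the tables directly, the two excluded genomes can be filtered out by ID. The following is a minimal sketch using only the Python standard library. The `genome_id` column name and the sample rows are assumptions for illustration, not taken from the actual tables, so check the CSV header before reusing it.

```python
import csv
import io

# Genome IDs excluded from the analysis (from the note above).
EXCLUDED_GENOMES = {"NBC_01310", "NBC_01080"}


def drop_excluded(rows, id_column="genome_id"):
    """Return the rows whose ID is not in EXCLUDED_GENOMES.

    `id_column` is a hypothetical column name; adjust it to the real header.
    """
    return [row for row in rows if row[id_column] not in EXCLUDED_GENOMES]


# Tiny in-memory table standing in for a real CSV file (made-up rows).
sample = io.StringIO("genome_id,note\nNBC_01310,excluded\nNBC_00001,kept\n")
kept = drop_excluded(csv.DictReader(sample))
print([row["genome_id"] for row in kept])
```

To apply this to a real file, pass `csv.DictReader(open(path, newline=""))` instead of the in-memory sample.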

Repository Structure

  • data/: Contains processed data used in the analysis.
    • processed/G1032_20240208/: Results of the BGCFlow run for this dataset.
      • metadata/: Contains JSON files with project metadata and dependency versions.
      • tables/: CSV files with summaries from antiSMASH and GTDB metadata.
  • figures/: Contains figures generated from the analysis.
    • strainmap_G1032_20240208.html: Interactive map of strains.
  • notebook/: Jupyter notebooks for analysis.
    • strainmap_G1034.ipynb: Notebook with the analysis code.
  • README.md: This file, providing an overview of the repository.
  • strainmap_env.yaml: Conda environment file to reproduce the analysis environment.
  • tables/: Cleaned table with latitude and longitude data.
    • df_antismash_7.1.0_summary_with_gps.csv: Summary table with GPS coordinates.

Getting Started

To replicate the analysis environment, use the provided strainmap_env.yaml file with Conda:

conda env create -f strainmap_env.yaml

Activate the environment:

conda activate strainmap_env

You can then open the Jupyter notebook in the notebook/ directory to explore the analysis.

Data Description

The data/ directory contains the processed output of the BGCFlow run used in the analysis. The metadata files describe the project context and software dependencies. The tables summarize the genomic features and metadata relevant to the NPGM isolates/strains.
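
To get a quick look at the metadata without opening each file, something like the sketch below may help. It assumes only that the metadata/ directory holds plain JSON files, as described above. The helper name `load_metadata` is hypothetical, and nothing is assumed about the keys inside each file.

```python
import json
from pathlib import Path


def load_metadata(metadata_dir):
    """Parse every *.json file in `metadata_dir` into a dict keyed by file name."""
    result = {}
    for path in sorted(Path(metadata_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            result[path.name] = json.load(handle)
    return result
```

From the repository root, `load_metadata("data/processed/G1032_20240208/metadata")` would return one entry per JSON file. Files with other extensions are ignored.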

Figures

The figures/ directory includes the interactive map generated from the analysis, for visual exploration of the geographic distribution of the strains.

Notebooks

The notebook/ directory contains Jupyter notebooks that detail the analysis process, from data processing to visualization.

Conda Environment

The strainmap_env.yaml file specifies the Conda environment necessary to run the analysis, ensuring reproducibility.

Tables

Processed tables, including summaries with GPS coordinates, are provided in the tables/ directory for further analysis and reference.
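
For reuse in other mapping tools, rows with coordinates can be converted to GeoJSON. This is a sketch under stated assumptions, not the method used in the notebook: the column names `genome_id`, `latitude`, and `longitude` are hypothetical defaults, and the actual header of df_antismash_7.1.0_summary_with_gps.csv may differ. Rows with missing, non-numeric, or out-of-range coordinates are skipped.

```python
import csv
import io
import json


def rows_to_geojson(rows, id_col="genome_id", lat_col="latitude", lon_col="longitude"):
    """Build a GeoJSON FeatureCollection of points from dict-like rows.

    The column names are hypothetical defaults; pass the real ones as needed.
    """
    features = []
    for row in rows:
        try:
            lat = float(row[lat_col])
            lon = float(row[lon_col])
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric coordinates
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            continue  # out-of-range (or NaN) coordinates
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {id_col: row.get(id_col)},
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up rows for illustration only; they are not taken from the real table.
sample = io.StringIO(
    "genome_id,latitude,longitude\n"
    "strain_a,55.0,12.0\n"
    "strain_b,,\n"
    "strain_c,120.0,10.0\n"
)
collection = rows_to_geojson(csv.DictReader(sample))
print(json.dumps(collection, indent=2))
```

Only `strain_a` survives in this example: `strain_b` has empty coordinates and `strain_c` has a latitude outside the valid range.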

About

Map of the G1034 NPGM actinomycete strains published in NAR (https://doi.org/10.1093/nar/gkae523).
