shubhdevelops/HIV_Protease_Analysis

🧬 HIV Protease Sequence Analysis

This project focuses on analyzing HIV-1 protease sequences to study variability, conservation, and drug resistance mutations (DRMs).
It was developed as part of my bioinformatics and machine learning studies (B.Tech Biotechnology, NIT Raipur).


Features

  • Multiple Sequence Alignment (MSA) of HIV protease sequences
  • Identification of conserved vs variable positions
  • Overlay of known Drug Resistance Mutations (DRMs)
  • Visualization with sequence logos and conservation plots
  • Mutation frequency table generation for key positions

Project Workflow

1️⃣ Data Preparation

  • Collected HIV-1 protease sequences (FASTA format).
  • Cleaned and standardized dataset.
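The cleaning step is not spelled out here, so the following is only a minimal, dependency-free sketch of what it could look like. The function names, the allowed-residue set, and the `min_len` cutoff are assumptions for illustration, not the project's actual code (Biopython's `SeqIO` would be the more usual way to read FASTA).

```python
from typing import Dict, Iterable

# Standard 20 amino acids plus X (unknown). What counts as "clean" is an assumption.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWYX")


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into {sequence_id: sequence}, joining wrapped lines."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            # Use the first token as the ID; fall back to a generated name.
            header = parts[0] if parts else f"seq{len(records) + 1}"
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def clean_records(records: Dict[str, str], min_len: int = 90) -> Dict[str, str]:
    """Uppercase, strip gaps and stop symbols, then drop short, invalid,
    or duplicate sequences. min_len=90 is an arbitrary illustrative cutoff
    (mature HIV-1 protease is 99 residues long)."""
    cleaned: Dict[str, str] = {}
    seen = set()
    for header, seq in records.items():
        seq = seq.upper().replace("-", "").replace("*", "")
        if len(seq) < min_len or set(seq) - VALID_RESIDUES or seq in seen:
            continue
        seen.add(seq)
        cleaned[header] = seq
    return cleaned
```

Typical use would be `clean_records(parse_fasta(open("protease.fasta")))`, where `protease.fasta` is a hypothetical file name.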

2️⃣ Multiple Sequence Alignment

  • Performed MSA using Biopython + Clustal/MAFFT.
  • Exported alignment in FASTA and Clustal format.
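As a rough sketch of the alignment step, MAFFT can be called from Python through `subprocess`. This assumes the `mafft` executable is installed and on `PATH`; the helper names and file paths are hypothetical, and the project may have invoked the aligner differently.

```python
import subprocess
from pathlib import Path
from typing import List


def build_mafft_command(input_fasta: str) -> List[str]:
    """Return the argument list for a default MAFFT run.

    --auto lets MAFFT choose an alignment strategy based on input size.
    """
    return ["mafft", "--auto", input_fasta]


def run_mafft(input_fasta: str, output_fasta: str) -> None:
    """Align input_fasta with MAFFT and save the aligned FASTA.

    MAFFT prints the alignment to stdout, so it is captured and written out.
    """
    result = subprocess.run(
        build_mafft_command(input_fasta),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if MAFFT exits with an error
    )
    Path(output_fasta).write_text(result.stdout)
```

The aligned FASTA can then be exported to Clustal format with Biopython, for example `AlignIO.convert("aligned.fasta", "fasta", "aligned.aln", "clustal")` (file names are placeholders).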

3️⃣ Conservation Analysis

  • Calculated per-position conservation scores.
  • Plotted conservation heatmaps & sequence logos.
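The scoring method is not specified above. One common choice, shown here purely as an assumption, is a normalized Shannon entropy per alignment column, where 1.0 means fully conserved. The function name and the decision to ignore gaps are illustrative.

```python
import math
from collections import Counter
from typing import List


def conservation_scores(alignment: List[str]) -> List[float]:
    """Return one score per alignment column.

    Score = 1 - H / log2(20), where H is the Shannon entropy of the
    residues in the column. 1.0 = fully conserved, lower = more variable.
    """
    max_entropy = math.log2(20)  # 20 standard amino acids
    scores = []
    for column in zip(*alignment):
        residues = [aa for aa in column if aa != "-"]  # gaps ignored (assumption)
        if not residues:
            scores.append(0.0)  # all-gap column: treated as not conserved
            continue
        counts = Counter(residues)
        total = len(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        # Clamp at 0 in case non-standard symbols push entropy past log2(20).
        scores.append(max(0.0, 1.0 - entropy / max_entropy))
    return scores
```

The resulting list can be passed directly to Matplotlib or Seaborn for a per-position conservation plot.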


4️⃣ Drug Resistance Mapping

  • Mapped known Drug Resistance Mutations (DRMs) from Stanford HIVDB.
  • Highlighted variable sites overlapping with DRMs.
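A minimal sketch of the overlay, assuming per-position conservation scores are already available and that alignment columns map one-to-one onto protease positions 1 to 99. The position set below is only an illustrative subset of well-known major protease inhibitor resistance sites; a real analysis should load the current list from Stanford HIVDB. The function name and threshold are hypothetical.

```python
from typing import Iterable, List

# Illustrative subset of major PI resistance positions
# (e.g. D30N, M46I, I50V, V82A, I84V, L90M). Not complete or current.
EXAMPLE_DRM_POSITIONS = frozenset({30, 46, 50, 82, 84, 90})


def variable_drm_sites(
    scores: List[float],
    drm_positions: Iterable[int] = EXAMPLE_DRM_POSITIONS,
    threshold: float = 0.9,
) -> List[int]:
    """Return 1-based positions that are variable (score < threshold)
    and also listed as DRM positions.

    threshold=0.9 is an arbitrary illustrative cutoff.
    """
    drm_set = set(drm_positions)
    return [
        pos
        for pos, score in enumerate(scores, start=1)
        if score < threshold and pos in drm_set
    ]
```

If the alignment contains insertions relative to the reference, column indices would first need to be translated to reference numbering.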


5️⃣ Mutation Frequency Table

  • Generated frequency table of amino acid substitutions at key residues.
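One possible way to build such a table, sketched with the standard library only: count residues at each position of interest and convert the counts to fractions. The function name and the choice to exclude gaps from the denominator are assumptions.

```python
from collections import Counter
from typing import Dict, List


def mutation_frequencies(
    alignment: List[str], positions: List[int]
) -> Dict[int, Dict[str, float]]:
    """For each 1-based position, return the fraction of (non-gap)
    sequences carrying each residue, most frequent first."""
    table: Dict[int, Dict[str, float]] = {}
    for pos in positions:
        column = [seq[pos - 1] for seq in alignment if seq[pos - 1] != "-"]
        counts = Counter(column)
        total = len(column)
        table[pos] = (
            {aa: n / total for aa, n in counts.most_common()} if total else {}
        )
    return table
```

The nested dictionary can be turned into a table with Pandas, for example `pd.DataFrame.from_dict(table, orient="index").fillna(0)`.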


📊 Results

  • Identified highly conserved catalytic motifs.
  • Highlighted hotspot residues linked to drug resistance.
  • Sequence logos and conservation plots show which positions vary and which substitutions occur at them.

🛠️ Tech Stack

  • Python (Biopython, Pandas, Matplotlib, Seaborn)
  • MSA tools (MAFFT/Clustal Omega)

Future Scope

  • Expand to machine learning models for predicting drug resistance.
  • Incorporate phylogenetic tree analysis for evolutionary tracking.
  • Automate into a lightweight web dashboard (MERN + ML).

📌 About

👤 Shubham Thakur
B.Tech Biotechnology, NIT Raipur
Exploring Bioinformatics | AI in Healthcare | Full-Stack Development


This project showcases my bioinformatics and coding skills and is intended for research and learning purposes.
