GitHub - ade-wagimon/Protein-Function-Prediction-Using-Convolutional-Neural-Networks

Protein Function Prediction Using Convolutional Neural Networks

This repository contains code for predicting the function of proteins using Convolutional Neural Networks (CNNs) implemented in TensorFlow/Keras. The model utilizes protein sequence data and various biochemical properties for classification.
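As a rough illustration of the kind of model described above, the following is a minimal sketch of a 1D CNN binary classifier in Keras. It is not the architecture in train_model.py: the function name `build_cnn`, the layer sizes, and the input dimensions (300 alignment columns, 26 features per position) are all assumptions chosen for the example.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_cnn(seq_len: int, n_features: int) -> tf.keras.Model:
    """Build a small 1D CNN for two-class sequence classification.

    Hypothetical architecture for illustration only; the layer sizes
    are placeholders, not values taken from the repository.
    """
    model = tf.keras.Sequential([
        layers.Input(shape=(seq_len, n_features)),
        # Convolutions slide along the aligned sequence positions.
        layers.Conv1D(filters=32, kernel_size=5, activation="relu", padding="same"),
        layers.MaxPooling1D(pool_size=2),
        layers.Conv1D(filters=64, kernel_size=5, activation="relu", padding="same"),
        # Collapse the position axis so the classifier head sees a fixed-size vector.
        layers.GlobalMaxPooling1D(),
        layers.Dense(32, activation="relu"),
        layers.Dropout(0.5),
        # Single sigmoid unit: probability of the second class.
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


# Assumed dimensions: 300 alignment columns, 26 features per column.
model = build_cnn(seq_len=300, n_features=26)
```

Calling `model.summary()` prints the layer shapes, which is a quick way to check that the input dimensions match the encoded data.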

Dataset and Requirements

Dataset: The sequences of two enzymes (CrtB and CrtM) are aligned using Clustal Omega and stored in a file (Align.aln).
Biochemical Properties: Amino acid sequences are encoded using one-hot encoding along with additional features such as hydrophobicity, molecular weight, charge, polarity, aromaticity, and acidity/basicity.
Labels: Sequences are labeled based on their function (Function1 or Function2).
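
The encoding described above can be sketched as follows. This is an assumption-laden example, not the repository's code: it uses a 21-symbol alphabet (20 amino acids plus the alignment gap), the Kyte-Doolittle hydropathy scale for hydrophobicity, a simplified charge assignment at neutral pH (K/R positive, D/E negative, histidine treated as neutral), and F/W/Y as the aromatic residues. The names `encode_residue` and `encode_sequence` are hypothetical, and the remaining properties (molecular weight, polarity, acidity/basicity) would be appended in the same way.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
ALPHABET = AMINO_ACIDS + "-"  # "-" is the gap symbol in an alignment

# Kyte-Doolittle hydropathy index per residue.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
# Simplified side-chain charge at neutral pH (assumption for this sketch).
CHARGE = {"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}
AROMATIC = set("FWY")


def encode_residue(residue):
    """Return one-hot (21 values) + [hydrophobicity, charge, aromatic]."""
    residue = residue.upper()
    one_hot = [1.0 if residue == symbol else 0.0 for symbol in ALPHABET]
    # Gaps and unknown symbols get neutral (zero) property values.
    hydrophobicity = KYTE_DOOLITTLE.get(residue, 0.0)
    charge = CHARGE.get(residue, 0.0)
    aromatic = 1.0 if residue in AROMATIC else 0.0
    return one_hot + [hydrophobicity, charge, aromatic]


def encode_sequence(sequence):
    """Encode an aligned sequence as a list of per-position feature vectors."""
    return [encode_residue(residue) for residue in sequence]


encoded = encode_sequence("MK-")
print(len(encoded), len(encoded[0]))  # positions, features per position
```

Because the sequences are aligned, every encoded sequence has the same number of positions, so the results can be stacked with `numpy.asarray` into a `(samples, positions, features)` array for the CNN.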

Installation

Clone the repository:

git clone https://github.com/ade-wagimon/Protein-Function-Prediction-Using-Convolutional-Neural-Networks.git
cd Protein-Function-Prediction-Using-Convolutional-Neural-Networks

Install dependencies:

pip install tensorflow matplotlib seaborn numpy biopython scikit-learn

Usage

Data Preparation:
- Update the input file paths (CrtB.fasta, crtM.fasta, Align.aln) so that they point to your own dataset.
Model Training:
- Run the script to preprocess data, build the CNN model, train, and evaluate:
```
python train_model.py
```
Evaluation:
- Evaluate the model performance using accuracy, precision, recall, F1-score, classification report, and confusion matrix.
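
To make the evaluation metrics listed above concrete, here is a dependency-free sketch that derives accuracy, precision, recall, and F1-score from the confusion-matrix counts of a binary classifier. The function names and the example labels are made up for illustration. In practice, scikit-learn's `sklearn.metrics.classification_report` and `sklearn.metrics.confusion_matrix` compute the same quantities.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 as a dict."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels: 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```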

Code Structure

train_model.py: Main script to preprocess data, build the CNN model, train, and evaluate.
utils.py: Utility functions for reading FASTA files, encoding sequences, and assigning labels.
requirements.txt: List of Python dependencies.
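
The FASTA reading and label assignment mentioned for utils.py might look like the sketch below. It is a plain-Python stand-in, not the repository's implementation: the function names are hypothetical, and the mapping of CrtB to label 0 (Function1) and CrtM to label 1 (Function2) is an assumption, since the README does not say which enzyme corresponds to which function. With Biopython installed, `Bio.SeqIO.parse(path, "fasta")` can replace the hand-written parser.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one first.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def load_labeled(sources):
    """Collect sequences and labels from (handle, label) pairs."""
    sequences, labels = [], []
    for handle, label in sources:
        for _, sequence in read_fasta(handle):
            sequences.append(sequence)
            labels.append(label)
    return sequences, labels


# In-memory stand-ins for CrtB.fasta and crtM.fasta (toy sequences, not real data).
crtb = io.StringIO(">seq1\nMKT\nAAA\n>seq2\nMLL\n")
crtm = io.StringIO(">seq3\nMTM\n")
# Assumed mapping: CrtB -> 0 (Function1), CrtM -> 1 (Function2).
sequences, labels = load_labeled([(crtb, 0), (crtm, 1)])
print(sequences, labels)
```

For real files, pass `open("CrtB.fasta")` and `open("crtM.fasta")` in place of the `io.StringIO` objects.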

Results

After training, the script reports the model's test-set accuracy, a classification report, and a confusion matrix. It also plots the training history, showing accuracy and loss over the epochs.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

This project utilizes the TensorFlow/Keras framework and various Python libraries for data processing and visualization.
Special thanks to the developers of Clustal Omega for sequence alignment.
