Multimodal Phishing Detector

Phishing detection from screenshots, message text, and URLs, powered by BERT, DistilBERT, ResNet50, and OCR

Overview

A comprehensive multimodal phishing detection system that combines:

Text-based detection using a fine-tuned BERT model
URL threat classification with a 3-class DistilBERT model (benign, phishing, malware)
Image-based phishing detection using ResNet50, with optional OCR
Fusion logic to combine multimodal signals
FastAPI backend with React frontend
Chat mode powered by a constrained LLM for safe explanations

The repository covers the complete pipeline, from training and evaluation through inference, and includes a production-ready UI.
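
The fusion step listed above can be pictured as a late-fusion rule over per-modality phishing probabilities. The following is a minimal sketch, not the project's actual logic: the function name `fuse_scores`, the weights, and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import Dict, Optional

# Hypothetical weights: the real fusion rule in intent.py is not documented here.
DEFAULT_WEIGHTS: Dict[str, float] = {"text": 0.4, "url": 0.4, "image": 0.2}


def fuse_scores(
    scores: Dict[str, Optional[float]],
    weights: Dict[str, float] = DEFAULT_WEIGHTS,
    threshold: float = 0.5,
) -> dict:
    """Weighted average of per-modality phishing probabilities.

    Modalities with a score of None (for example, no screenshot supplied)
    are skipped and the remaining weights are renormalised.
    """
    available = {m: p for m, p in scores.items() if p is not None and m in weights}
    if not available:
        raise ValueError("at least one modality score is required")
    total_weight = sum(weights[m] for m in available)
    fused = sum(weights[m] * p for m, p in available.items()) / total_weight
    return {
        "score": fused,
        "is_phishing": fused >= threshold,
        "modalities": sorted(available),
    }


if __name__ == "__main__":
    # Text and URL were analysed; no image was provided.
    print(fuse_scores({"text": 0.9, "url": 0.8, "image": None}))
```

Renormalising the weights keeps the fused score on the same 0 to 1 scale regardless of how many inputs a request contains, which matters because a user may submit only a URL or only a message.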

Model Performance

Evaluation results per modality:

Modality	Model	Accuracy	Precision	Recall	F1 Score
Image	ResNet50 (25 epochs, best val: 83.30%)	78.99%	72.69%	82.63%	77.34%
Text	Fine-tuned BERT	97.22%	97.51%	95.84%	96.67%
URL	DistilBERT (3-class)	99.50%	99.49%	99.49%	99.49%

Key Metrics:

Image model best validation accuracy: 83.30%
Text model ROC-AUC: 0.9961
URL model overall accuracy: 99.50%

Detailed URL Classification Results

Test Dataset: 22,182 samples
Overall Accuracy: 99.50%
Macro F1: 0.9949
Weighted F1: 0.9950

Per-Class Performance

Class	Precision	Recall	F1 Score	Support
Benign	99.64%	100.00%	99.82%	7,500
Phishing	99.18%	99.43%	99.30%	7,182
Malware	99.65%	99.05%	99.35%	7,500
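
As a sanity check on the table above, the per-class F1 scores and the macro and weighted averages can be recomputed from the reported precision, recall, and support. This sketch uses only the rounded values from the table, so the results agree with the reported figures to within rounding; the helper name `f1_score` is local to this example and is not the scikit-learn function.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, support) copied from the per-class table.
per_class = {
    "Benign": (0.9964, 1.0000, 7500),
    "Phishing": (0.9918, 0.9943, 7182),
    "Malware": (0.9965, 0.9905, 7500),
}

f1 = {name: f1_score(p, r) for name, (p, r, _) in per_class.items()}
total_support = sum(n for _, _, n in per_class.values())

# Macro F1 averages the classes equally; weighted F1 weights by support.
macro_f1 = sum(f1.values()) / len(f1)
weighted_f1 = sum(f1[name] * n for name, (_, _, n) in per_class.items()) / total_support

if __name__ == "__main__":
    for name, value in f1.items():
        print(f"{name}: F1 = {value:.4f}")
    print(f"support = {total_support}, macro F1 = {macro_f1:.4f}, weighted F1 = {weighted_f1:.4f}")
```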

Generated Visualizations:

runs/url_eval/roc_curves.png
runs/url_eval/confusion_matrix.png
runs/url_eval/confidence_histogram.png

Tech Stack

Backend & ML

FastAPI - High-performance async API framework
PyTorch - Deep learning framework
Torchvision - Computer vision models (ResNet50)
Transformers (HuggingFace) - BERT and DistilBERT implementations
Scikit-learn - ML utilities and metrics
Pandas & NumPy - Data processing
Pytesseract - OCR capabilities (optional)
Uvicorn - ASGI server

Frontend

React (Vite) - Modern frontend framework
Tailwind CSS - Utility-first styling
Shadcn UI - Component library
Framer Motion - Animation library

Project Structure

Backend Components

backend/
├── app/
│   ├── main.py              # FastAPI routes and endpoints
│   ├── models_loader.py     # Model initialization and loading
│   ├── predictors.py        # Text/URL/Image inference logic
│   ├── intent.py            # Multimodal intent detection
│   └── llm_helpers.py       # LLM-based explanations

Key Modules

main.py - API routes for inference and chat
models_loader.py - Loads BERT, DistilBERT, and ResNet50 models
predictors.py - Handles predictions for each modality
intent.py - Combines multimodal signals for final detection
llm_helpers.py - Generates safe, constrained explanations
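
To illustrate the kind of post-processing a module like predictors.py performs, the sketch below turns raw 3-class logits into a label and confidence with a softmax. It is an assumption-laden example, not the project's code: the label order in `URL_LABELS` and the function name `logits_to_prediction` are hypothetical, and the real index-to-label mapping is defined by the trained DistilBERT model's configuration.

```python
import math
from typing import Dict, List, Sequence

# Hypothetical label order: check the trained model's config for the real mapping.
URL_LABELS: List[str] = ["benign", "phishing", "malware"]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_prediction(logits: Sequence[float], labels: Sequence[str] = URL_LABELS) -> Dict:
    """Map classifier logits to a label, a confidence, and per-class probabilities."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {
        "label": labels[best],
        "confidence": probs[best],
        "probabilities": dict(zip(labels, probs)),
    }


if __name__ == "__main__":
    print(logits_to_prediction([0.1, 2.0, -1.0]))
```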

Datasets

Image Data

CIRCL phishing dataset
OpenPhish screenshot collection
Kaggle phishing screenshot datasets

Text Data

HuggingFace phishing datasets
Custom phishing message collection

URL Data

Tranco - Benign URLs
PhishTank - Phishing URLs
URLHaus - Malware URLs

All datasets are used within their respective license terms.

Setup

Environment Configuration

Copy the example environment file (Windows command shown; use cp on macOS or Linux):

copy .env.example .env

Configure the following variables:

MM_TEXT_MODEL_DIR=path/to/text/model
MM_URL_MODEL_DIR=path/to/url/model
MM_IMAGE_MODEL_PATH=path/to/image/model
MM_USE_OCR=true
MM_TESSERACT_CMD=path/to/tesseract
# Add LLM provider keys if using chat features
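
One way the backend could read these variables is sketched below. This is an illustration under stated assumptions, not the project's loader: the `Settings` class and `load_settings` function are hypothetical names, and the sketch assumes the values are already present in the process environment (for example, exported in the shell or loaded from .env by a tool such as python-dotenv).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Variables treated as required in this sketch (an assumption, not confirmed by the project).
REQUIRED = ("MM_TEXT_MODEL_DIR", "MM_URL_MODEL_DIR", "MM_IMAGE_MODEL_PATH")


@dataclass(frozen=True)
class Settings:
    text_model_dir: str
    url_model_dir: str
    image_model_path: str
    use_ocr: bool
    tesseract_cmd: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from MM_* variables, failing early if a path is missing."""
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    use_ocr = env.get("MM_USE_OCR", "false").strip().lower() in {"1", "true", "yes"}
    return Settings(
        text_model_dir=env["MM_TEXT_MODEL_DIR"],
        url_model_dir=env["MM_URL_MODEL_DIR"],
        image_model_path=env["MM_IMAGE_MODEL_PATH"],
        use_ocr=use_ocr,
        # Only meaningful when OCR is enabled; None means "use the default binary on PATH".
        tesseract_cmd=env.get("MM_TESSERACT_CMD") or None,
    )
```

Failing at startup when a model path is missing gives a clearer error than a file-not-found raised later during the first inference request.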

Backend Setup

# Activate virtual environment (PowerShell; assumes a venv named phishingenv already exists)
.\phishingenv\Scripts\Activate.ps1

# Install dependencies
pip install -r requirements.txt

# Start server
cd backend
uvicorn app.main:app --reload --host 127.0.0.1 --port 8000

Frontend Setup

cd frontend
npm install
npm run dev

The application will be available at:

Backend API: http://127.0.0.1:8000
Frontend: http://localhost:5173

Contributing

Contributions are welcome! Please follow these guidelines:

Fork the repository
Create a feature branch (git checkout -b feature/improvement)
Commit your changes (git commit -am 'Add new feature')
Push to the branch (git push origin feature/improvement)
Open a Pull Request

Important:

Do not commit model weights or datasets
Follow code formatting standards (black, isort for Python)
Add tests for new features
Update documentation as needed

License

This project is licensed under the MIT License - see the LICENSE file for details.

Made with care 💌 by Akash

If you find this project helpful, please consider giving it a star ⭐

Repository: XynaxDev/multimodal-phishing-detector

Folders and files

Latest commit

History

Repository files navigation

Multimodal Phishing Detector

Overview

Model Performance

Detailed URL Classification Results

Per-Class Performance

Tech Stack

Backend & ML

Frontend

Project Structure

Backend Components

Key Modules

Datasets

Image Data

Text Data

URL Data

Setup

Environment Configuration

Backend Setup

Frontend Setup

Contributing

License

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages