Ethereum Address Poisoning Detection Model

An end-to-end machine learning pipeline to detect "Address Poisoning" attacks on the Ethereum blockchain.

Overview

Address poisoning is a deceptive tactic where attackers send small or zero-value transactions from addresses that mimic a user's recent counterparties (often by matching the first and last few characters). The goal is to "poison" the user's transaction history so that they might accidentally copy the attacker's address for future transfers.
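The lookalike pattern described above can be illustrated with a small check. This is a minimal sketch, not code from this repository: the function name `is_lookalike` and the default of four matching characters at each end are assumptions chosen for illustration.

```python
def _hex_body(address):
    """Lower-case an address and drop the leading 0x, if present."""
    address = address.lower()
    return address[2:] if address.startswith("0x") else address


def is_lookalike(candidate, known, prefix_len=4, suffix_len=4):
    """Return True if candidate differs from known but shares its first and last hex characters.

    prefix_len and suffix_len are illustrative defaults, not values taken from the project.
    """
    a, b = _hex_body(candidate), _hex_body(known)
    if a == b:
        return False  # the same address is not a spoof of itself
    return a[:prefix_len] == b[:prefix_len] and a[-suffix_len:] == b[-suffix_len:]


# Made-up 40-hex-character addresses for demonstration only.
known = "0x1234" + "a" * 32 + "abcd"
spoof = "0x1234" + "f" * 32 + "abcd"
unrelated = "0x9999" + "a" * 32 + "abcd"
```

A wallet interface that truncates addresses to their first and last characters would display `known` and `spoof` identically, which is what makes the attack effective.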

This project provides tools to:

Extract Data: Query a MySQL Ethereum database to collect transaction metadata.
Engineer Features: Calculate metrics like counterparty frequency and transaction bursts.
Train & Detect: Utilize a Support Vector Machine (SVM) model to classify addresses as malicious or benign.

Tech Stack

Language: Python 3.7
Database: MySQL (Ethereum blockchain data)
Libraries: Pandas, Scikit-learn, Matplotlib, Seaborn
Environment: Pipenv, Jupyter Notebooks

Project Structure

scripts/: Python and Bash scripts for data collection.
- gather_addresses_metadata.py: The primary data extraction engine.
- start_dataset_generation.sh: Wrapper for the extraction process.
address_poisoning_dataset.ipynb: Notebook for data exploration and preprocessing.
address_poisining_model.ipynb: Notebook for model training (SVC) and evaluation.
docs/: Visual documentation and diagrams.
dataset/: (Required) Folder for input/output CSV data.

Getting Started

1. Prerequisites

Python 3.7 and Pipenv.
Access to an Ethereum MySQL database.
Create a dataset/ directory in the root.
Environment Variables: Copy .env.example to .env and update with your database credentials.
```
cp .env.example .env
```

2. Installation

```
# Install dependencies
pipenv install

# Enter the virtual environment
pipenv shell
```

3. Data Collection

Update dataset/address_poisoning_addresses_list.csv with the target phishing addresses, then run:

```
bash scripts/start_dataset_generation.sh
```

The script writes the collected transactions to address_poisoning_transactions.csv and tracks its progress in address_poisoning_transactions_checkpoint.txt.
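To show how a checkpoint file can track progress across runs, here is a minimal sketch. It assumes the checkpoint stores the count of addresses already processed. The actual format used by gather_addresses_metadata.py is not documented here, and `load_checkpoint`, `process_with_checkpoint`, and `handle_address` are hypothetical names.

```python
import os


def load_checkpoint(path):
    """Return how many addresses were already processed (0 if there is no checkpoint)."""
    if not os.path.exists(path):
        return 0
    with open(path) as handle:
        content = handle.read().strip()
    return int(content) if content else 0


def process_with_checkpoint(addresses, checkpoint_path, handle_address):
    """Process addresses in order, skipping those covered by the checkpoint."""
    start = load_checkpoint(checkpoint_path)
    for index in range(start, len(addresses)):
        handle_address(addresses[index])
        # Persist progress only after the address was handled successfully,
        # so a failure leaves the checkpoint pointing at the failed address.
        with open(checkpoint_path, "w") as handle:
            handle.write(str(index + 1))
```

With this layout, rerunning the same call after an interruption resumes at the first unprocessed address instead of starting over.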

4. Model Training & Analysis

Launch Jupyter and open the notebooks:

```
jupyter notebook
```

Run address_poisoning_dataset.ipynb to analyze the raw transaction data.
Run address_poisining_model.ipynb to train the classifier and visualize detection performance.
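For orientation, the training step can be sketched with scikit-learn's `SVC`. This is an illustrative outline under stated assumptions, not the notebook's code: the rows are synthetic, the label convention (1 = poisoning, 0 = benign) is assumed, and the RBF kernel and scaling step are choices made for the example.

```python
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic rows: [is_repeat_counterparty, counterparty_tx_count, burst_flag].
# These values are invented for illustration and are not drawn from the dataset.
X = [
    [0, 1, 1], [0, 1, 1], [0, 2, 1], [0, 1, 0], [0, 1, 1], [0, 2, 1],
    [1, 12, 0], [1, 8, 0], [1, 20, 0], [1, 5, 1], [1, 9, 0], [1, 15, 0],
]
y = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # assumed convention: 1 = poisoning, 0 = benign

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale first: SVMs are sensitive to feature magnitude, and the count
# feature spans a much wider range than the two binary flags.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(X_train, y_train)
predictions = model.predict(X_test)
```

Wrapping the scaler and classifier in one pipeline keeps the scaling parameters fitted on the training split only, which avoids leaking test-set statistics into training.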

Model Features

The classifier relies on several engineered features:

is_repeat_counterparty: Flags whether the same pair of addresses has appeared in an earlier transaction.
counterparty_tx_count: The total number of interactions between two addresses.
burst_flag: Flags rapid-fire transactions that occur within a short time threshold (5 minutes).
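The three features above could be computed along these lines. This is a dependency-free sketch rather than the notebook's implementation, and it rests on assumptions: each transaction is a dict with `from`, `to`, and `timestamp` (Unix seconds) keys, address pairs are treated as directed, and a burst means a transaction within 5 minutes of the previous one from the same sender.

```python
from collections import Counter

BURST_WINDOW_SECONDS = 5 * 60  # the 5-minute threshold mentioned above


def engineer_features(transactions):
    """Return a copy of each transaction, in time order, with the three features added."""
    ordered = sorted(transactions, key=lambda tx: tx["timestamp"])
    # Total interactions per directed (sender, receiver) pair over the whole dataset.
    pair_totals = Counter((tx["from"], tx["to"]) for tx in ordered)
    seen_pairs = set()
    last_sent_at = {}  # sender -> timestamp of that sender's previous transaction
    rows = []
    for tx in ordered:
        pair = (tx["from"], tx["to"])
        previous = last_sent_at.get(tx["from"])
        in_burst = previous is not None and tx["timestamp"] - previous <= BURST_WINDOW_SECONDS
        rows.append({
            **tx,
            "is_repeat_counterparty": int(pair in seen_pairs),
            "counterparty_tx_count": pair_totals[pair],
            "burst_flag": int(in_burst),
        })
        seen_pairs.add(pair)
        last_sent_at[tx["from"]] = tx["timestamp"]
    return rows


sample = [
    {"from": "0xa", "to": "0xb", "timestamp": 0},
    {"from": "0xa", "to": "0xb", "timestamp": 100},
    {"from": "0xa", "to": "0xc", "timestamp": 1000},
]
features = engineer_features(sample)
```

In the sample, the second transaction repeats the 0xa to 0xb pair 100 seconds after the first, so it is marked as both a repeat and a burst. The third goes to a new counterparty 900 seconds later and is marked as neither.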

Security

Database credentials are managed via environment variables using python-dotenv. A template is provided in .env.example. Never commit your .env file to version control.
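As an illustration of this pattern, the sketch below loads settings with python-dotenv and fails early if any are missing. The variable names (`DB_HOST`, `DB_USER`, `DB_PASSWORD`, `DB_NAME`) and the helper `read_db_config` are assumptions for the example; consult .env.example for the keys this project actually expects.

```python
import os

try:
    from dotenv import load_dotenv  # provided by the python-dotenv package
    load_dotenv()  # copies key=value pairs from a .env file into os.environ
except ImportError:
    pass  # fall back to variables already exported in the shell

# Hypothetical key names; the real ones are defined in .env.example.
REQUIRED_KEYS = ("DB_HOST", "DB_USER", "DB_PASSWORD", "DB_NAME")


def read_db_config(env=None):
    """Return the database settings, raising if any required key is unset or empty."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing database settings: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}
```

Validating the settings up front surfaces a misconfigured .env file before any database connection is attempted, and keeps credentials out of the source code.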
