Ethereum, one of the most prominent blockchain platforms, supports financial transactions and the execution of smart contracts. Smart contracts are self-executing programs written in a Turing-complete language, and they underpin many decentralized applications (DApps). However, Ethereum's versatility and popularity have also made it a frequent target for malicious activity. The network has been plagued by scams, a problem exacerbated by its unsupervised organization and anonymous participation. These scams have caused significant financial losses and undermined the confidence of investors and users in the integrity and reliability of the platform.
This repository implements several machine-learning algorithms to detect malicious Ethereum transactions. The algorithms are trained on Ethereum Transaction Fraud Data (ETFD), a labeled dataset for binary classification designed to support research and development in fraudulent-transaction detection on the Ethereum blockchain. ETFD is generated by the Ethereum Transaction Data Generator (ETDG) and addresses common problems in public Ethereum fraud detection datasets, such as single cardinality, high cardinality, missing values, and data encoding issues, with the aim of reducing the risk of model overfitting and improving model performance. The dataset contains 85,003 transactions, 14 features, and a binary class label: 42,499 transactions belong to the 'No Fraud' class and 42,504 belong to the 'Fraud' class. Details on ETDG and ETFD can be found at: https://github.com/Huned-materwala/Ethereum-Transaction-Data-Generator-ETDG.
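To make the data-quality issues named above concrete, here is a minimal sketch of how a tabular dataset could be audited for single cardinality, high cardinality, and missing values. It is not the ETDG implementation. It assumes that "single cardinality" means a column with only one distinct value and "high cardinality" means a column whose values are almost all unique. The function `audit_columns`, the 0.9 threshold, and the toy column names are hypothetical and chosen only for illustration; they are not the ETFD feature names.

```python
# Illustrative only: a generic column audit, not the ETDG code.
# Values treated as "missing" here are an assumption for this sketch.
MISSING = {None, "", "NA", "NaN"}


def audit_columns(rows, high_cardinality_ratio=0.9):
    """Return {column: [issues]} for a table given as a list of dicts.

    Flags columns with missing values, columns holding a single distinct
    value, and columns where nearly every value is unique.
    """
    report = {}
    columns = list(rows[0].keys()) if rows else []
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v not in MISSING]
        distinct = len(set(present))
        issues = []
        if len(present) < len(values):
            issues.append("missing_values")
        if distinct <= 1:
            issues.append("single_cardinality")
        elif distinct / len(present) >= high_cardinality_ratio:
            issues.append("high_cardinality")
        report[col] = issues
    return report


# Toy table with hypothetical column names (not the ETFD schema).
rows = [
    {"tx_hash": "0xa1", "network": "mainnet", "value_eth": 0.5, "gas_price": 21, "label": 1},
    {"tx_hash": "0xb2", "network": "mainnet", "value_eth": None, "gas_price": 21, "label": 0},
    {"tx_hash": "0xc3", "network": "mainnet", "value_eth": 1.2, "gas_price": 30, "label": 1},
    {"tx_hash": "0xd4", "network": "mainnet", "value_eth": 0.5, "gas_price": 30, "label": 0},
]
report = audit_columns(rows)

# Class balance computed from the counts stated in this README.
no_fraud, fraud = 42_499, 42_504
total = no_fraud + fraud
fraud_share = fraud / total
```

On the toy table, `tx_hash` is flagged as high cardinality, `network` as single cardinality, and `value_eth` as having missing values, while `gas_price` and `label` pass. The class counts sum to the stated 85,003 transactions, and the two classes are almost perfectly balanced.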