Transaction Fraud Detection System

This project implements an end-to-end machine learning pipeline to detect fraudulent financial transactions using classification models and domain-driven feature engineering. The goal is to catch as many fraudulent transactions as possible while keeping the false positive rate low enough for business use.

Project Overview

Financial fraud causes significant operational and monetary losses. This project focuses on building a scalable fraud detection system using historical transaction data and machine learning techniques.

Key objectives:

Identify fraudulent transactions accurately
Handle extreme class imbalance
Optimize decision thresholds for business use
Interpret important fraud-driving features

Dataset Information

Dataset: Financial Transaction Fraud Dataset
Total Records: 1M+ transactions
Target Variable: isFraud
Class Distribution: Highly imbalanced

Note: The dataset file is not included in this repository due to GitHub file size limitations. Please download the dataset separately and place it inside the data folder.
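
Once the file is in place, a quick check of the class balance is a reasonable first step. The snippet below is a minimal sketch: only the isFraud column name comes from this README, the file name in the comment is hypothetical, and the small in-memory frame stands in for the real data.

```python
import pandas as pd


def fraud_rate(df: pd.DataFrame, target: str = "isFraud") -> float:
    """Return the fraction of rows labelled as fraud (target == 1)."""
    return float((df[target] == 1).mean())


# With the real data this would be something like:
#   df = pd.read_csv("data/transactions.csv")  # hypothetical file name
# A tiny stand-in frame keeps the example self-contained.
df = pd.DataFrame({"isFraud": [0, 0, 0, 0, 1]})

rate = fraud_rate(df)
print(f"Fraud rate: {rate:.2%}")
```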

Technologies Used

Python
Pandas and NumPy
Scikit-learn
Matplotlib and Seaborn
Jupyter Notebook

Feature Engineering

The following domain-driven features were engineered:

Sender balance inconsistency
Receiver balance inconsistency

These features help capture abnormal transaction behavior commonly associated with fraud.
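
One common way to express such inconsistencies is the gap between the balance change a transaction should produce and the balance change actually recorded. The sketch below illustrates that idea only. The column names (amount, oldbalanceOrg, newbalanceOrig, oldbalanceDest, newbalanceDest) and the output feature names are assumptions, not taken from the notebook.

```python
import pandas as pd


def add_balance_inconsistency(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with two balance-inconsistency columns.

    All column names here are assumed for illustration.
    """
    out = df.copy()
    # Sender: the balance should drop by exactly the transferred amount.
    out["sender_balance_error"] = (
        out["oldbalanceOrg"] - out["amount"] - out["newbalanceOrig"]
    )
    # Receiver: the balance should rise by exactly the transferred amount.
    out["receiver_balance_error"] = (
        out["oldbalanceDest"] + out["amount"] - out["newbalanceDest"]
    )
    return out


# Two made-up transactions: the first is consistent on both sides,
# the second never reaches the receiver's recorded balance.
transactions = pd.DataFrame({
    "amount": [100.0, 250.0],
    "oldbalanceOrg": [500.0, 250.0],
    "newbalanceOrig": [400.0, 0.0],
    "oldbalanceDest": [0.0, 0.0],
    "newbalanceDest": [100.0, 0.0],
})
features = add_balance_inconsistency(transactions)
print(features[["sender_balance_error", "receiver_balance_error"]])
```

A value of zero means the recorded balances agree with the amount; a non-zero value flags a mismatch that a model can learn from.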

Models Implemented

Two machine learning models were implemented:

Logistic Regression (Baseline)
Random Forest Classifier (Final Model)

Class imbalance was handled using cost-sensitive learning and probability threshold tuning.
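
As a rough illustration of those two ideas, the sketch below trains a Random Forest with scikit-learn's class_weight="balanced" option and then applies a custom probability cutoff instead of the default 0.5. The synthetic data, the hyperparameters, and the 0.3 threshold are assumptions chosen for the example, not the notebook's settings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the transaction data (roughly 5% positives).
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Cost-sensitive learning: errors on the rare class are weighted more heavily.
model = RandomForestClassifier(
    n_estimators=100, class_weight="balanced", random_state=0
)
model.fit(X_train, y_train)

# Threshold tuning: flag a transaction when its fraud probability
# reaches the chosen cutoff rather than the default 0.5.
fraud_proba = model.predict_proba(X_test)[:, 1]
threshold = 0.3  # illustrative value only
y_pred = (fraud_proba >= threshold).astype(int)

print("Flagged at 0.5:", int((fraud_proba >= 0.5).sum()))
print(f"Flagged at {threshold}:", int(y_pred.sum()))
```

Lowering the threshold can only flag the same or more transactions, which trades extra false positives for higher recall.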

Model Evaluation

The models were evaluated using:

Confusion Matrix
ROC-AUC Score
ROC Curve
Precision-Recall Curve
Fraud Probability Distribution
Threshold Optimization
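
The sketch below shows how several of these metrics can be computed with scikit-learn. It runs on synthetic data, and choosing the threshold that maximizes F1 is an assumption made for illustration; the notebook may use a different criterion.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    confusion_matrix,
    precision_recall_curve,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the transaction data.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

clf = LogisticRegression(class_weight="balanced", max_iter=1000)
clf.fit(X_train, y_train)
scores = clf.predict_proba(X_test)[:, 1]

# Threshold-independent ranking quality.
auc = roc_auc_score(y_test, scores)

# precision and recall have one more entry than thresholds,
# so the final point is dropped before computing F1.
precision, recall, thresholds = precision_recall_curve(y_test, scores)
f1 = (
    2 * precision[:-1] * recall[:-1]
    / np.maximum(precision[:-1] + recall[:-1], 1e-12)
)
best_threshold = thresholds[np.argmax(f1)]

# Confusion matrix at the selected threshold.
y_pred = (scores >= best_threshold).astype(int)
tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()

print(f"ROC-AUC: {auc:.3f}")
print(f"Selected threshold: {best_threshold:.3f}")
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
```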

How To Run The Project

Clone the repository
Install dependencies using requirements.txt
Launch Jupyter Notebook
Open fraud_detection_analysis.ipynb
Run all cells

Results Summary

Achieved high ROC-AUC performance
Improved fraud recall using threshold tuning
Identified important fraud-driving features
Built a production-ready fraud detection workflow
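
For the feature-importance point above, one common approach with a Random Forest is to rank the impurity-based feature_importances_ attribute. The sketch below uses synthetic data and hypothetical feature names, so its output says nothing about this project's actual results.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data; the feature names are hypothetical placeholders.
X, y = make_classification(
    n_samples=1000,
    n_features=4,
    n_informative=2,
    n_redundant=0,
    weights=[0.9, 0.1],
    random_state=0,
)
feature_names = [
    "amount",
    "sender_balance_error",
    "receiver_balance_error",
    "step",
]

forest = RandomForestClassifier(
    n_estimators=50, class_weight="balanced", random_state=0
)
forest.fit(X, y)

# Impurity-based importances sum to 1; sort from most to least important.
importances = pd.Series(
    forest.feature_importances_, index=feature_names
).sort_values(ascending=False)
print(importances)
```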

Author

Samarveer Sah
Machine Learning and Data Science Enthusiast
