Fraud Warden

Overview

Fraud Warden is a credit card fraud detection system that uses machine learning to predict whether a transaction is fraudulent. It uses a Random Forest Classifier to make predictions from features of the transaction.

Technology Stack

Programming Language: Python

Libraries:

streamlit for building the web application
pandas for data manipulation
plotly.express and seaborn for data visualization
scikit-learn for machine learning
pickle for model serialization

Installation Instructions

Clone the Repository:
- git clone https://github.com/yourusername/fraud-warden.git
- cd fraud-warden
Create a Virtual Environment:
- python -m venv venv
- source venv/bin/activate # On Windows use venv\Scripts\activate
Install Dependencies:
- pip install -r requirements.txt
Run the Application:
- streamlit run app.py

How It Works

Data Preprocessing:
- The application preprocesses the uploaded CSV file by removing unnecessary columns and converting date columns to datetime objects. Additional features such as time_of_day and age are derived from existing columns.
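
The preprocessing step might look roughly like the sketch below. The column names (`trans_num`, `trans_date_trans_time`, `dob`) and the time-of-day bins are assumptions chosen for illustration; the actual columns handled by app.py may differ.

```python
import pandas as pd


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Drop identifier columns, parse dates, and derive time_of_day and age."""
    out = df.copy()

    # Hypothetical identifier column that carries no predictive signal.
    out = out.drop(columns=["trans_num"], errors="ignore")

    # Convert date columns (assumed names) to datetime objects.
    out["trans_date_trans_time"] = pd.to_datetime(out["trans_date_trans_time"])
    out["dob"] = pd.to_datetime(out["dob"])

    # Bucket the transaction hour into a coarse time_of_day label.
    hour = out["trans_date_trans_time"].dt.hour
    out["time_of_day"] = pd.cut(
        hour,
        bins=[-1, 5, 11, 17, 23],
        labels=["night", "morning", "afternoon", "evening"],
    ).astype(str)

    # Age in whole years at the time of the transaction.
    tx = out["trans_date_trans_time"]
    dob = out["dob"]
    had_birthday = (tx.dt.month > dob.dt.month) | (
        (tx.dt.month == dob.dt.month) & (tx.dt.day >= dob.dt.day)
    )
    out["age"] = tx.dt.year - dob.dt.year - (~had_birthday).astype(int)
    return out


sample = pd.DataFrame(
    {
        "trans_num": ["a1", "b2"],
        "trans_date_trans_time": ["2020-06-21 02:14:25", "2020-06-21 14:30:00"],
        "dob": ["1990-06-22", "1985-01-15"],
        "amt": [12.5, 480.0],
    }
)
processed = preprocess(sample)
print(processed[["time_of_day", "age", "amt"]])
```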
Feature Engineering:
- Categorical features are encoded into numerical values. The data is reindexed to ensure all required columns are present.
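
One way to encode categoricals and align the columns is sketched below. One-hot encoding with `pd.get_dummies` is an assumption (the encoding scheme is not specified here), and `MODEL_COLUMNS` is a hypothetical feature list standing in for whatever the trained model expects.

```python
import pandas as pd

# Hypothetical feature list the trained model expects, in training order.
MODEL_COLUMNS = ["amt", "age", "category_grocery", "category_travel", "gender_F", "gender_M"]


def encode_features(df: pd.DataFrame) -> pd.DataFrame:
    """One-hot encode categoricals and align columns with the model's schema."""
    encoded = pd.get_dummies(df, columns=["category", "gender"], dtype=int)
    # reindex adds any missing expected column filled with 0 and drops extras.
    return encoded.reindex(columns=MODEL_COLUMNS, fill_value=0)


batch = pd.DataFrame(
    {
        "amt": [12.5, 480.0],
        "age": [29, 35],
        "category": ["grocery", "grocery"],
        "gender": ["F", "M"],
    }
)
features = encode_features(batch)
print(features)
```

Reindexing matters because a small upload may not contain every category seen during training; without it, the feature matrix would have fewer columns than the model was fitted on.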
Oversampling:
- The application uses the Synthetic Minority Over-sampling Technique (SMOTE) to balance the dataset by generating synthetic samples of the minority (fraudulent) class.
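
To illustrate the idea behind SMOTE, here is a simplified sketch that interpolates between a minority sample and one of its nearest minority neighbours. It is not the code used by the application; in practice the `SMOTE` class from the imbalanced-learn package (`fit_resample`) is the usual choice. The function name `smote_sketch` and the toy data are hypothetical.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors


def smote_sketch(X_min: np.ndarray, n_new: int, k: int = 3, seed: int = 0) -> np.ndarray:
    """Create n_new synthetic minority samples by interpolating between neighbours."""
    rng = np.random.default_rng(seed)
    # k + 1 because each point is returned as its own nearest neighbour.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, neighbours = nn.kneighbors(X_min)

    # Pick a random minority sample, then one of its k neighbours (column 0 is itself).
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, rng.integers(1, k + 1, size=n_new)]
    # Interpolation factor in [0, 1) places the new point on the connecting segment.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[picked] - X_min[base])


# Toy data: 20 legitimate rows (label 0) and 5 fraudulent rows (label 1).
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(3, 1, (5, 2))])
y = np.array([0] * 20 + [1] * 5)

X_min = X[y == 1]
n_needed = int((y == 0).sum() - (y == 1).sum())
synthetic = smote_sketch(X_min, n_new=n_needed)

X_bal = np.vstack([X, synthetic])
y_bal = np.concatenate([y, np.ones(len(synthetic), dtype=int)])
print(np.bincount(y_bal))
```

Oversampling is normally applied to training data only, so that evaluation still reflects the real class imbalance.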
Model Prediction:
- The preprocessed data is fed into a pre-trained Random Forest Classifier model. The model predicts whether a transaction is fraudulent based on the input features.
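
The prediction step can be sketched as follows. Since the real pre-trained model is not reproduced here, the example fits a small stand-in forest on toy data and round-trips it through `pickle` in memory; the feature values and hyperparameters are assumptions for illustration only.

```python
import pickle

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Stand-in for the pre-trained model: fit a small forest on toy data.
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0, 1, (50, 3)), rng.normal(4, 1, (50, 3))])
y_train = np.array([0] * 50 + [1] * 50)  # 0 = legitimate, 1 = fraudulent
trained = RandomForestClassifier(n_estimators=25, random_state=0).fit(X_train, y_train)

# Serialize and restore the model, as an app would do with a pickle file on disk.
blob = pickle.dumps(trained)
model = pickle.loads(blob)

# Score new transactions: a class label plus the estimated fraud probability.
X_new = np.array([[0.1, -0.2, 0.3], [4.2, 3.9, 4.1]])
labels = model.predict(X_new)
fraud_prob = model.predict_proba(X_new)[:, 1]
print(labels, fraud_prob.round(2))
```

Only unpickle model files from a trusted source, since loading a pickle can execute arbitrary code.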
Visualization:
- The application provides various visualizations such as histograms, bar charts, and correlation heatmaps to help users understand the data.

Features

Upload CSV: Users can upload a CSV file containing transaction data.
Data Preview: Displays a preview of the uploaded data.
Basic Statistics: Shows basic statistics of the dataset.
Data Types: Displays the data types of each column.
Missing Values: Shows the count of missing values in each column.
Distribution of Numerical Columns: Visualizes the distribution of numerical columns.
Counts of Categorical Columns: Visualizes the counts of categorical columns.
Correlation Heatmap: Displays a heatmap of the correlation between numerical features.
SMOTE Sampling: Balances the dataset using SMOTE sampling.
Fraud Prediction: Predicts whether a transaction is fraudulent based on user input.
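
The data overview features listed above map closely onto standard pandas calls. The sketch below uses an in-memory CSV as a stand-in for an uploaded file (in Streamlit, `st.file_uploader` returns a file-like object that `pd.read_csv` can read); the columns are invented for the example.

```python
import io

import pandas as pd

# Stand-in for an uploaded CSV file.
uploaded = io.StringIO("amt,age,category\n12.5,29,grocery\n480.0,,travel\n33.1,52,\n")
df = pd.read_csv(uploaded)

preview = df.head()          # Data Preview
stats = df.describe()        # Basic Statistics (numerical columns)
dtypes = df.dtypes           # Data Types
missing = df.isnull().sum()  # Missing Values per column

print(missing)
```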

Resources Used

Dataset: Credit Card Fraud Detection Dataset (Kaggle)
Sklearn Documentation: Random Forest Classifier
Streamlit Documentation: Streamlit
Plotly Documentation: Plotly Express
Seaborn Documentation: Seaborn
Pandas Documentation: Pandas
Python Documentation: Python
SMOTE Documentation: SMOTE

AntarMukhopadhyaya/Fraud-Warden