Data Analytics Credit Prediction

Project Overview

This project predicts the account balance of bank users from their attributes. The dataset pairs each user's attributes with the corresponding balance, and the goal is to build a model that estimates that balance accurately. Model performance is measured with Mean Absolute Error (MAE).

Files

notebook.ipynb: The Jupyter Notebook containing the entire process of data analysis, preprocessing, model development, and evaluation.
credit_train.csv: The training dataset containing user attributes and the target balance.
credit_test.csv: The test dataset for which predictions need to be made. It contains user attributes without the target balance.
credit_test_sample.csv: A sample submission file showing the format in which the predictions should be submitted.
requirements.txt: A file listing all the Python packages required to run the notebook.

Requirements

To run the Jupyter Notebook, you'll need to install the necessary Python packages listed in requirements.txt. You can install them using pip:

pip install -r requirements.txt

How to Run the Project

Clone the repository:

git clone https://github.com/suleimanelkhoury/data-analytics-credit.git
cd data-analytics-credit

Install the required packages:
```
pip install -r requirements.txt
```
Run the Jupyter Notebook: Open the notebook.ipynb file in Jupyter Notebook or Jupyter Lab and run the cells step by step to:
- Load and explore the data.
- Preprocess the data (handling missing values, feature encoding, etc.).
- Train various machine learning models.
- Evaluate the models using MAE.
- Make predictions on the test set.
Generate Submission:
- After the notebook has run, the test-set predictions are saved in the same format as credit_test_sample.csv.
- Adjust the results as needed and save them as a .csv file for submission to the Kaggle competition.
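
The submission step above can be sketched with the standard library alone. This is a minimal illustration, not the notebook's actual code: the function name `write_submission` and the header `("Id", "Balance")` are assumptions, and the real column names should be copied from credit_test_sample.csv.

```python
import csv
import io


def write_submission(out_file, ids, predictions, header=("Id", "Balance")):
    """Write one (id, prediction) row per test user under the given header.

    The default header is a placeholder; use the header found in
    credit_test_sample.csv for a real submission.
    """
    if len(ids) != len(predictions):
        raise ValueError("ids and predictions must have the same length")
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(header)
    for row_id, prediction in zip(ids, predictions):
        writer.writerow([row_id, prediction])


# Demo on an in-memory buffer with made-up values.
buffer = io.StringIO()
write_submission(buffer, [1, 2, 3], [520.456, 0.0, 1310.1])
submission_text = buffer.getvalue()
print(submission_text)
```

To write an actual file, pass `open("submission.csv", "w", newline="")` instead of the in-memory buffer.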

Data

The data is provided in CSV format and consists of user attributes such as age and income, among others. The training dataset (credit_train.csv) also includes the balance, which is the target variable.
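
As a rough illustration of the preprocessing mentioned under How to Run the Project (handling missing values, feature encoding), the sketch below fills empty numeric cells with the column median and one-hot encodes a categorical column. The README does not say which techniques the notebook uses, so both choices are assumptions, as are the sample rows and the `student` column; only `age`, `income`, and the balance target are named in this document.

```python
import csv
import io
from statistics import median

# Hypothetical rows standing in for credit_train.csv; real column names may differ.
SAMPLE_CSV = """age,income,student,balance
34,52000,Yes,410
51,,No,0
,78000,No,960
"""


def load_rows(csv_file):
    """Read a CSV file object into a list of dicts keyed by column name."""
    return list(csv.DictReader(csv_file))


def impute_median(rows, column):
    """Convert a numeric column to float, filling empty cells with the median."""
    observed = [float(r[column]) for r in rows if r[column] != ""]
    fill = median(observed)
    for r in rows:
        r[column] = float(r[column]) if r[column] != "" else fill


def one_hot(rows, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    categories = sorted({r[column] for r in rows})
    for r in rows:
        value = r.pop(column)
        for category in categories:
            r[f"{column}_{category}"] = 1 if value == category else 0


rows = load_rows(io.StringIO(SAMPLE_CSV))
for numeric_column in ("age", "income"):
    impute_median(rows, numeric_column)
one_hot(rows, "student")

# Separate the target from the features.
targets = [float(r.pop("balance")) for r in rows]
print(rows[1])
print(targets)
```

In the notebook itself the same steps would more likely be done with a data-frame library; the plain-Python version is used here only to keep the example self-contained.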

Model Evaluation

The performance of the models is evaluated using the Mean Absolute Error (MAE). MAE measures the average magnitude of the errors in a set of predictions, without considering their direction.
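
Concretely, MAE is the mean of the absolute differences between actual and predicted balances: MAE = (1/n) * sum of |y_i - y_hat_i| over all n users. The sketch below computes it in plain Python on made-up numbers; the function is written out here for illustration and is not necessarily how the notebook computes the metric.

```python
def mean_absolute_error(y_true, y_pred):
    """Return the average of |actual - predicted| over all users."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one value is required")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up balances: the errors are -10, +10 and -10, and the sign is ignored.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 310.0]
mae = mean_absolute_error(actual, predicted)
print(mae)  # 10.0
```

Because over- and under-predictions count equally, a lower MAE means the predicted balances are closer to the true ones on average, in the same units as the balance. scikit-learn provides an equivalent `sklearn.metrics.mean_absolute_error(y_true, y_pred)`.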
