Exploratory Data Analysis (EDA) is used to extract insights from data.
The dataset provided is a synthesised transaction dataset containing 3 months’ worth of transactions for 100 hypothetical customers. It contains purchases, recurring transactions, and salary transactions.
The dataset is designed to simulate realistic transaction behaviours observed in ANZ’s real transaction data, so many of the insights gathered from it should reflect genuine patterns.
status : status of the transaction (posted or authorized).
card_present_flag : whether the customer's card was present during the transaction (1.0 = Yes, 0.0 = No).
bpay_biller_code : unique code of the BPay Transaction done by the customer.
account : account number of the customer who made the transaction.
currency : currency in which the transaction was made (AUD).
long_lat : Longitude and Latitude location of the customer.
txn_description : the type of transaction the customer made.
merchant_id : id of the merchant with whom the customer made the transaction.
merchant_code : unique merchant code for each customer.
first_name : first name of the customer.
balance : balance the customer had at the time of the transaction, within the 3-month period.
date : date when the transaction took place.
gender : gender of the customer (Male or Female).
age : age of the customer.
merchant_suburb : the district or city where the merchant is located.
merchant_state : the state where the merchant is located.
extraction : timestamp at which the transaction data was extracted.
amount : the amount transacted by the customer.
transaction_id : unique transaction id given by the merchant when the customer makes a transaction.
country : country where the customers are located (Australia).
customer_id : unique id that identifies each customer.
merchant_long_lat : the latitude and longitude location of the merchant.
movement : mode of transaction (credit or debit).
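
A few of the fields above need parsing before they can be analysed. The snippet below is a minimal sketch of that step using only the Python standard library. The sample rows are hypothetical, and it assumes that `long_lat` holds longitude and latitude separated by a space, that `date` is in ISO format, and that an empty `card_present_flag` means the value is missing. The helper name `parse_row` is made up for this illustration.

```python
import csv
import io
from datetime import date

# Hypothetical sample rows; the values are made up for illustration only.
SAMPLE_CSV = """status,card_present_flag,long_lat,txn_description,amount,date,movement
authorized,1.0,153.41 -27.95,POS,16.25,2018-08-01,debit
posted,,151.23 -33.94,PAY/SALARY,3903.95,2018-08-01,credit
"""


def parse_row(row):
    """Convert the string fields of one transaction row into typed values."""
    # Assumption: long_lat is "<longitude> <latitude>" separated by a space.
    lon, lat = (float(part) for part in row["long_lat"].split())
    # An empty card_present_flag is kept as None rather than guessed.
    flag = row["card_present_flag"]
    return {
        "status": row["status"],
        "card_present": None if flag == "" else float(flag) == 1.0,
        "longitude": lon,
        "latitude": lat,
        "txn_description": row["txn_description"],
        "amount": float(row["amount"]),
        # Assumption: dates are ISO formatted (YYYY-MM-DD).
        "date": date.fromisoformat(row["date"]),
        "movement": row["movement"],
    }


rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
```

With the real file, the same function could be applied to a `csv.DictReader` opened on the dataset, provided the column formats match the assumptions above.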
# Task 1: Exploratory Data Analysis
The main purpose of EDA is to detect errors and outliers and to understand patterns in the data. It allows analysts to understand the data better before making any assumptions.
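
As an illustration of outlier detection, here is a minimal sketch of the common interquartile range (IQR) rule. It is one possible approach, not necessarily the one used in this project. The `amounts` list is hypothetical, and the function name `iqr_outliers` and the multiplier `k=1.5` are assumptions chosen for the example.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical transaction amounts: mostly small purchases plus one salary-sized value.
amounts = [12.5, 15.0, 18.2, 20.0, 22.4, 25.0, 27.9, 30.0, 3900.0]
outliers = iqr_outliers(amounts)
```

A flagged value is not automatically an error. In a dataset that mixes purchases with salary payments, large amounts may simply belong to a different transaction type.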
Goal: Execute data segmentation and visualization. Update: Visualization of data points done with Matplotlib, Seaborn, Plotly and WordCloud.
Goal: Build a regression and decision-tree prediction model. Update: Built, covering both supervised and unsupervised models.
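
The segmentation goal above can be sketched as grouping transactions by a column and summarising the amounts in each group. This is a minimal standard-library sketch under stated assumptions: the records are hypothetical, and `segment_summary` is a made-up helper name, not code from this project.

```python
from collections import defaultdict

# Hypothetical transactions; the values are made up for illustration only.
transactions = [
    {"txn_description": "POS", "movement": "debit", "amount": 16.25},
    {"txn_description": "POS", "movement": "debit", "amount": 43.75},
    {"txn_description": "PAYMENT", "movement": "debit", "amount": 100.0},
    {"txn_description": "PAY/SALARY", "movement": "credit", "amount": 2000.0},
]


def segment_summary(records, key):
    """Group records by `key` and report count, total and mean amount per group."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec["amount"])
    return {
        name: {
            "count": len(vals),
            "total": sum(vals),
            "mean": sum(vals) / len(vals),
        }
        for name, vals in groups.items()
    }


by_type = segment_summary(transactions, "txn_description")
by_movement = segment_summary(transactions, "movement")
```

The resulting dictionaries could then be passed to a plotting library such as Matplotlib to produce bar charts per segment.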
My objective for this "virtual internship" project is to build career skills, gain experience in data analysis, and improve my insight-gathering abilities.