This project addresses customer management in the Marketing Analytics Department of a large retail bank. Its aim is to identify which marketing activities effectively retain customers. Data on individual customer profitability (Customer Lifetime Value, CLV) is available, together with the results of a customer survey. The expected deliverable is a research model that explains and predicts individual customer profitability, along with a theoretical rationale for the underlying hypotheses and tests of those hypotheses. Multiple independent variables were tried in order to reach meaningful conclusions.
This report summarises the findings of the Customer Management project within the Marketing Analytics Department of our bank, which holds a prominent position in retail banking. The major objective of the project is to find out which marketing activities retain customers with the bank and generate high profitability. To this end, a wide set of data was collected and analysed, including:
(i) Personal details about customers, such as gender, profession, income and education.
(ii) Customer-bank relationship details, such as the percentage of loans and savings held with our bank, branch number and relationship duration.
(iii) Results of a specially drafted survey with questions relating to customer satisfaction, loyalty and value.
(iv) Customer Lifetime Value for a set of customers.
This data supported a descriptive analysis to find out which customer attributes are linked to high profitability over a lifetime. In addition, a predictive analysis estimates, with a certain level of confidence, the overall profitability a customer would bring in based on the other factors. Together these help in zeroing in on the features that drive CLV the most, and the model that predicts CLV assists in planning our interactions with customers.
The report describes the data analysis models used and the reasons they were chosen, compares the results obtained from them, and finally makes recommendations based on the best-suited model.
The models are built using a bottom-up approach: starting from a single variable and then adding complexity step by step, so as to enhance predictive accuracy while keeping the included variables statistically significant.
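The bottom-up idea can be illustrated with a minimal sketch of greedy forward selection. This is not the procedure used in the project: it assumes a plain linear regression, runs on synthetic data, and uses a minimum gain in R² (the hypothetical `min_gain` parameter) as a simple stand-in for the significance check described above. The variable names `income`, `tenure` and `noise_var` are illustrative only.

```python
import random

def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(A[i][c]))
        A[c], A[p] = A[p], A[c]
        for i in range(k):
            if i != c:
                f = A[i][c] / A[c][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def r_squared(X, y, beta):
    """Share of the variance in y explained by the fitted model."""
    mean_y = sum(y) / len(y)
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    ss_res = sum((t - (beta[0] + sum(b * v for b, v in zip(beta[1:], r)))) ** 2
                 for r, t in zip(X, y))
    return 1 - ss_res / ss_tot

def forward_select(columns, y, min_gain=0.01):
    """Start from no variables; repeatedly add the one that raises R^2 the most.

    Stops when the best candidate improves R^2 by less than min_gain
    (an assumed stand-in for a significance test).
    """
    selected, best_r2 = [], 0.0
    remaining = list(columns)
    while remaining:
        scores = {}
        for name in remaining:
            trial = selected + [name]
            X = [[columns[n][i] for n in trial] for i in range(len(y))]
            scores[name] = r_squared(X, y, fit_ols(X, y))
        best = max(scores, key=scores.get)
        if scores[best] - best_r2 < min_gain:
            break
        selected.append(best)
        remaining.remove(best)
        best_r2 = scores[best]
    return selected, best_r2

# Synthetic data: CLV depends on income and tenure, not on noise_var.
rng = random.Random(0)
n = 200
income = [rng.gauss(50, 10) for _ in range(n)]
tenure = [rng.gauss(8, 3) for _ in range(n)]
noise_var = [rng.gauss(0, 1) for _ in range(n)]
clv = [2.0 * a + 5.0 * b + rng.gauss(0, 5) for a, b in zip(income, tenure)]

selected, r2 = forward_select(
    {"income": income, "tenure": tenure, "noise_var": noise_var}, clv)
print(selected, round(r2, 3))
```

In practice a statistics package that reports p-values would replace the R² threshold, so that each added variable can be checked for significance directly.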
In addition, the models are fitted on approximately 70% of the data points and validated on the remaining 30%. This lets us check the robustness of each model and its applicability to new, unseen data. The 70/30 proportions follow common convention and were chosen empirically.
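A minimal sketch of such a 70/30 split is shown here, assuming a simple random shuffle with a fixed seed. The function name `split_train_validation`, the seed and the integer customer ids are hypothetical; the report does not state how the split was actually drawn.

```python
import random

def split_train_validation(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of rows and cut it into a training and a validation part."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

customers = list(range(1000))  # hypothetical customer ids
train, validation = split_train_validation(customers)
print(len(train), len(validation))
```

Shuffling before cutting avoids a split that merely follows the order in which the records were stored, for example by branch number or by relationship duration.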