Statistics for data science - exploratory data analysis for univariate, bivariate, and multivariate datasets
Low financial inclusion remains one of the main obstacles to economic and human development in Africa. For example, across Kenya, Rwanda, Tanzania, and Uganda, only 9.1 million adults (or 13.9% of the adult population) have access to or use a commercial bank account.
Traditionally, access to bank accounts has been regarded as an indicator of financial inclusion. Despite the proliferation of mobile money in Africa and the growth of innovative fintech solutions, banks still play a pivotal role in facilitating access to financial services. Access to bank accounts enables households to save and facilitate payments while also helping businesses build up their credit-worthiness and improve their access to other financial services. Therefore, access to bank accounts is an essential contributor to long-term economic growth.
The research problem is to figure out how we can predict which individuals are most likely to have or use a bank account. Your solution will help provide an indication of the state of financial inclusion in Kenya, Rwanda, Tanzania, and Uganda, while providing insights into some of the key demographic factors that might drive individuals’ financial outcomes.
In order to work on the above problem, you need to do the following:
Define the question, the metric for success, the context, the experimental design, and the appropriateness of the available data for answering the question.
Find and deal with outliers, anomalies, and missing data within the dataset.
Perform univariate, bivariate, and multivariate analysis, recording your observations.
Implement the solution by performing the respective analyses, i.e. factor analysis, principal component analysis, and discriminant analysis.
Challenge your solution by providing insights on how you can make improvements.
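The cleaning and analysis steps above can be sketched roughly as follows. This is a minimal illustration, not a prescribed solution: it runs on a small synthetic table rather than the real survey data, the column names (age, household_size, has_bank_account) are hypothetical stand-ins, and the 1.5 * IQR rule and row-dropping are only one of several reasonable choices for handling outliers and missing values. Principal component analysis is done here with a plain NumPy SVD; factor analysis and discriminant analysis are not covered.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the survey data; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(16, 90, size=n).astype(float),
    "household_size": rng.integers(1, 10, size=n).astype(float),
    "has_bank_account": rng.choice(["Yes", "No"], size=n, p=[0.15, 0.85]),
})
df.loc[[3, 17], "age"] = np.nan       # inject two missing values
df.loc[5, "household_size"] = 40.0    # inject one obvious outlier

# 1. Missing data: count per column, then drop incomplete rows
#    (imputation would be an alternative; dropping is an assumption here).
missing_counts = df.isna().sum()
clean = df.dropna().copy()

# 2. Outliers: flag values more than 1.5 * IQR beyond the quartiles.
def iqr_outliers(series: pd.Series) -> pd.Series:
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (series < q1 - 1.5 * iqr) | (series > q3 + 1.5 * iqr)

outlier_mask = iqr_outliers(clean["household_size"])
clean = clean[~outlier_mask]

# 3. Univariate analysis: summary statistics for a single variable.
age_summary = clean["age"].describe()

# 4. Bivariate analysis: mean age by bank-account status.
age_by_account = clean.groupby("has_bank_account")["age"].mean()

# 5. Multivariate analysis: PCA via SVD of the standardized numeric columns.
numeric = clean[["age", "household_size"]].to_numpy()
standardized = (numeric - numeric.mean(axis=0)) / numeric.std(axis=0)
_, singular_values, components = np.linalg.svd(standardized, full_matrices=False)
explained_variance_ratio = singular_values**2 / np.sum(singular_values**2)

print(missing_counts)
print(age_summary)
print(age_by_account)
print(explained_variance_ratio)
```

On the real dataset, the same steps would be applied to the actual column names listed in the variable definitions, and each printed summary would be accompanied by written observations.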
NB: Remember to go through the rubric [https://moringaschool.instructure.com/courses/274/assignments/2068] to get an understanding of how you will be graded.
The main dataset contains demographic information and what financial services are used by individuals across East Africa. This data was extracted from various Finscope surveys ranging from 2016 to 2018, and more information about these surveys can be found here:
FinAccess Kenya 2018. [https://fsdkenya.org/publication/finaccess2019/]
Finscope Rwanda 2016. [http://www.statistics.gov.rw/publication/finscope-rwanda-2016]
Finscope Tanzania 2017. [http://www.fsdt.or.tz/finscope/]
Finscope Uganda 2018. [http://fsduganda.or.ug/finscope-2018-survey-report/]
Variable Definitions: http://bit.ly/VariableDefinitions
Dataset: http://bit.ly/FinancialDataset