This repository hosts a collection of Jupyter Notebooks, in both English and Spanish, that perform data quality analysis using the R programming language. The notebooks follow a detailed analysis structure that supports thorough inspection and a better understanding of an example dataset. Note that the dataset is for illustrative purposes only and has limited practical relevance; it is included solely to demonstrate data analysis methodologies and techniques. The repository also contains HTML and PDF versions of the notebooks, along with the images of the resulting plots.
- Data Treatment
  - Importing Libraries
  - Loading Dataset
- Analysis
  - Data Overview
  - Detection of Duplicate Records
  - Detection of Missing Values
  - Detection of Atypical Values
  - Some Statistical Calculations...
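
The detection steps listed above can be sketched in base R roughly as follows. This is a minimal illustration, not the notebooks' actual code: the data frame `df`, its columns `id` and `value`, and the choice of the 1.5 × IQR rule for atypical values are all assumptions made for the example.

```r
# Hypothetical example data; the repository's dataset is not reproduced here.
df <- data.frame(
  id    = c(1, 2, 2, 3, 4, 5),
  value = c(10, 12, 12, NA, 11, 95)
)

# Data overview: structure and per-column summary statistics.
str(df)
summary(df)

# Duplicate records: duplicated() flags rows identical to an earlier row.
n_duplicates <- sum(duplicated(df))
df_unique <- df[!duplicated(df), ]

# Missing values: count of NA entries in each column.
missing_per_column <- colSums(is.na(df_unique))

# Atypical values (assumed rule): outside 1.5 * IQR from the quartiles.
q <- quantile(df_unique$value, probs = c(0.25, 0.75), na.rm = TRUE)
iqr <- q[[2]] - q[[1]]
lower <- q[[1]] - 1.5 * iqr
upper <- q[[2]] + 1.5 * iqr
is_outlier <- !is.na(df_unique$value) &
  (df_unique$value < lower | df_unique$value > upper)
outliers <- df_unique[is_outlier, ]
```

With this toy data, one row is a duplicate, one `value` is missing, and the row with `value` 95 falls outside the upper fence. Other outlier criteria, such as z-scores, may suit a given dataset better.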
The Jupyter Notebooks in this repository can be viewed directly on GitHub, which allows for easy review of the analysis and outcomes without the need for local execution. For an interactive experience or to modify the analysis, it is recommended to clone the repository and work with the notebooks locally.
If you wish to execute or edit the notebooks on your own machine, ensure you have an R distribution installed, along with the packages mentioned in the notebooks. Jupyter users will also need to install IRkernel to enable the execution of R within this environment.
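
As a rough sketch of the kernel setup, the commands below can be run from an R console. They assume that Jupyter is already installed and available on the system path, and that CRAN is reachable; the packages required by the notebooks themselves are not listed here and would need to be installed separately.

```r
# Install the IRkernel package from CRAN (requires network access).
install.packages("IRkernel")

# Register the R kernel with Jupyter for the current user.
IRkernel::installspec()
```

After this, R should appear as a kernel option when opening or creating a notebook in Jupyter.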
If you would like to contribute to the project, please fork