The dataset used in this repository is available on Kaggle and can be found here. I performed preprocessing and data cleaning on the dataset.
The code also includes two versions of the K-nearest neighbors algorithm: one written from scratch and one using the sklearn library. Both are then evaluated and compared using a method named "Average Error".
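To give a rough idea of what the from-scratch part involves, here is a minimal sketch of K-nearest neighbors in plain Python. It is not the repository's actual code. The function names (`euclidean_distance`, `knn_predict`, `average_error`), the toy data, and the choice of a classification task with majority voting are all assumptions made for illustration. In particular, "Average Error" is interpreted here as the mean of per-sample 0/1 errors (the fraction of wrong predictions); the repository may define it differently.

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest neighbors."""
    # Sort training points by distance to the query and keep the k closest.
    neighbors = sorted(
        zip(train_X, train_y),
        key=lambda pair: euclidean_distance(pair[0], query),
    )[:k]
    # Count the labels of those neighbors and return the most frequent one.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def average_error(y_true, y_pred):
    """ASSUMED definition: mean of 0/1 errors, i.e. the share of wrong predictions."""
    errors = [0 if t == p else 1 for t, p in zip(y_true, y_pred)]
    return sum(errors) / len(errors)


# Hypothetical toy data: two well-separated clusters labeled "a" and "b".
train_X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5), (6.5, 6.0)]
train_y = ["a", "a", "a", "b", "b", "b"]
test_X = [(1.2, 1.1), (6.8, 6.9)]
test_y = ["a", "b"]

predictions = [knn_predict(train_X, train_y, q, k=3) for q in test_X]
print(predictions, average_error(test_y, predictions))
```

For the comparison, the same training and test data could be passed to sklearn's `KNeighborsClassifier` with the same `n_neighbors`, and both sets of predictions scored with the same error function.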