This repository contains a portfolio of data science and data analysis projects that I completed (sometimes with my team) for academic, self-learning, and hobby purposes. The projects are presented as IPython notebooks.
-
Data Analysis and Visualization [Kaggle.com]
This dataset covers 721 Pokemon and lists each one's number, name, first and second type, and basic stats: HP, Attack, Defense, Special Attack, Special Defense, and Speed. It has been of great use when teaching statistics to kids.
Each record in the dataset is a single ramen product review. Review numbers are contiguous: more recently reviewed ramen varieties have higher numbers. Brand, Variety (the product name), Country, and Style (Cup? Bowl? Tray?) are pretty self-explanatory. Stars indicate the ramen quality, as assessed by the reviewer, on a 5-point scale.
This dataset consists of the marks secured by students in various subjects. The goal is to understand how parents' background, test preparation, and other factors influence students' performance.
-
Data Mining
This data covers two hotels: a resort hotel (H1) and a city hotel (H2). Both datasets share the same structure, with 31 variables describing the 40,060 observations of H1 and the 79,330 observations of H2. Each observation represents a hotel booking. The tasks are to predict booking cancellations and to analyze the factors that lead to them.
-
Machine Learning
The task is to classify entities based on their characteristics. Several ensemble models were trained on the dataset, and their predictions were combined using a Voting Classifier.
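As a rough illustration of combining ensemble models with a voting classifier, here is a minimal sketch using scikit-learn's `VotingClassifier`. The synthetic data from `make_classification`, the three base models, and all hyperparameters are assumptions chosen for the example; they are not taken from the project notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): the real project uses its own dataset.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Three ensemble models (assumed choices) combined by majority ("hard") vote.
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("et", ExtraTreesClassifier(n_estimators=100, random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
    ],
    voting="hard",
)
ensemble.fit(X_train, y_train)

# Mean accuracy of the combined prediction on the held-out split.
accuracy = ensemble.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

With `voting="hard"` each base model casts one vote per sample and the majority class wins. Switching to `voting="soft"` would average the predicted class probabilities instead, which can help when the base models produce well-calibrated probabilities.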
The dataset includes information about the reviews submitted by customers and the venue rating.