Data wrangling project based on Twitter data
- Gathering data from different sources: flat file, URL, API
- Handling data quality and tidiness issues
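The three gathering routes listed above could be sketched roughly as follows, using only the Python standard library. This is a minimal illustration, not the project's actual code: the function names are hypothetical, the tab delimiter for the URL-hosted file is an assumption, and the API step is reduced to parsing tweets that have already been saved as one JSON object per line (the authenticated API call itself is left out because it needs credentials).

```python
import csv
import io
import json
import urllib.request


def read_flat_file(path, delimiter=","):
    """Load a locally downloaded delimited file into a list of dict rows."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))


def read_delimited_url(url, delimiter="\t"):
    """Download a delimited text file from a URL and parse it into dict rows."""
    with urllib.request.urlopen(url) as response:
        text = response.read().decode("utf-8")
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def parse_tweet_json_lines(lines):
    """Keep the engagement fields from tweets stored as one JSON object per line."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        tweet = json.loads(line)
        rows.append({
            "tweet_id": tweet["id"],
            "retweet_count": tweet["retweet_count"],
            "favorite_count": tweet["favorite_count"],
        })
    return rows
```

Each helper returns plain dictionaries keyed by column name, so the three sources can later be joined on a shared tweet identifier.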
This project focuses mainly on how I, as a data analyst, obtain the right data. I collected datasets from a Udacity URL, from directly downloaded flat files, and through the Twitter API. In the second part, I wrote a short report that answers the following questions using the collected datasets:
- The popularity of each dog "stage" (i.e. doggo, floofer, pupper, and puppo)
- The methods used to access Twitter
- Retweet and favorite counts, as a measure of tweet popularity
- The relationship between retweet_count and favorite_count
- The proportion of image predictions whose first prediction is a dog
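Several of the questions above reduce to simple aggregations. The sketch below shows one way they might be computed on a cleaned table. Everything except `retweet_count` and `favorite_count` is an assumption: the `stage` and `p1_dog` column names are hypothetical, the helper names are invented for illustration, and the rows are made-up toy values, not results from this project.

```python
from collections import Counter
from math import sqrt

# Made-up rows for illustration only; they are not results from the project.
tweets = [
    {"stage": "pupper", "retweet_count": 10, "favorite_count": 40, "p1_dog": True},
    {"stage": "doggo", "retweet_count": 20, "favorite_count": 70, "p1_dog": True},
    {"stage": "pupper", "retweet_count": 30, "favorite_count": 100, "p1_dog": False},
    {"stage": "puppo", "retweet_count": 40, "favorite_count": 130, "p1_dog": True},
]


def stage_counts(rows):
    """Count tweets per dog stage, most common first."""
    return Counter(row["stage"] for row in rows if row["stage"]).most_common()


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def dog_share(rows):
    """Fraction of rows whose first image prediction is flagged as a dog."""
    return sum(1 for row in rows if row["p1_dog"]) / len(rows)


retweets = [row["retweet_count"] for row in tweets]
favorites = [row["favorite_count"] for row in tweets]
print(stage_counts(tweets))
print(pearson(retweets, favorites))
print(dog_share(tweets))
```

A coefficient close to 1 from `pearson` would suggest that tweets with many retweets also tend to collect many favorites; the toy rows here are deliberately linear, so they say nothing about the real data.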
Getting Twitter Data in Python
Accessing the Twitter API with Python
Learn Python by analyzing Donald Trump’s tweets