This repository contains the assignments done for the course CS 613: Natural Language Processing, offered at IIT Gandhinagar during Semester 1, 2021-22.
In this assignment, data was scraped from Twitter using twint. Tweets related to India that discuss the topics of Pollution, Climate Change, Eco Friendly and Flood were scraped.
A word cloud was produced from the data for each topic (i.e. Pollution, Climate Change, Eco Friendly and Flood). The word cloud for Pollution is shown.
This part carries out a statistical analysis of the data: computing the frequency distribution of words, validating the language annotation assigned by Twitter, and fitting the data to Heaps' law.
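The frequency distribution and the Heaps' law fit can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code used in the assignment. It assumes the usual form of Heaps' law, V = K * N^beta, where N is the number of tokens seen and V is the number of distinct words. The helper names (tokenize, word_frequencies, vocabulary_growth, fit_heaps) and the simple regex tokenizer are assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return runs of letters, digits, and apostrophes.

    This tokenizer is an assumption for illustration; real tweets would
    likely need handling for URLs, mentions, hashtags, and emoji.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def word_frequencies(tweets):
    """Count how often each token occurs across all tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(tokenize(tweet))
    return counts


def vocabulary_growth(tweets):
    """Return (N, V) pairs: tokens seen so far and distinct tokens seen so far."""
    seen = set()
    n_tokens = 0
    points = []
    for tweet in tweets:
        for token in tokenize(tweet):
            n_tokens += 1
            seen.add(token)
            points.append((n_tokens, len(seen)))
    return points


def fit_heaps(points):
    """Fit V = K * N**beta by least squares on log V = log K + beta * log N.

    Needs at least two points with different N. Returns (K, beta).
    """
    xs = [math.log(n) for n, _ in points]
    ys = [math.log(v) for _, v in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    k = math.exp(mean_y - beta * mean_x)
    return k, beta


if __name__ == "__main__":
    # Hypothetical sample tweets, not taken from the scraped data.
    sample = ["Air pollution in Delhi", "pollution and floods"]
    print(word_frequencies(sample).most_common(3))
    print(fit_heaps(vocabulary_growth(sample)))
```

Fitting in log-log space keeps the example dependency-free. A nonlinear fit on the raw (N, V) pairs weights the points differently and may give somewhat different values of K and beta.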
According to Heaps' law, the size of vocabulary