Twitter Toolbox

Welcome to the Twitter Toolbox, a suite of tools for acquiring, preprocessing, and analyzing data from Twitter. The project was built in response to the recent changes in Twitter's API and front end. Several existing libraries are no longer maintained or updated, so this toolbox aims to provide a working data extraction process for data analysts, researchers, marketers, and developers.

Features

The Twitter Toolbox offers a broad spectrum of functionalities, including:

Data Acquisition: Extract a variety of data from Twitter by streaming or scraping real-time data, making API calls, and hydrating or dehydrating tweets.

Preprocessing: Our tools offer data cleaning, language filtering, data labeling, and group generation features to refine your dataset for accurate and reliable analyses.

Natural Language Processing (NLP): The toolbox is equipped with sentiment analysis, emotion analysis, topic analysis, and named entity recognition to provide you with meaningful insights from the content of tweets.

Each of these capabilities is designed to help you make the most out of Twitter data, whether you're exploring public sentiment, detecting emotional trends, identifying key themes, or recognizing named entities such as organizations or individuals.
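As an illustration of the data cleaning step listed under Preprocessing, here is a minimal sketch that strips URLs and @mentions from a tweet and collapses whitespace. The function name `clean_tweet` and the regular expressions are assumptions made for this example; they are not taken from the toolbox's source, whose cleaning rules may differ.

```python
import re

# Hypothetical patterns for this sketch; the toolbox's own rules may differ.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text: str) -> str:
    """Remove URLs and @mentions, then collapse runs of whitespace."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return WHITESPACE_RE.sub(" ", text).strip()
```

Cleaning of this kind is usually applied before language filtering or NLP, so that links and handles do not skew the results.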

Articles

I have written a series of articles to explain how to use the Twitter Toolbox. You can find them here:

Data Acquisition

Collect data from Twitter using scraping, streaming, and the Twitter API.

Learn more about the data collection here.

Preprocessing

In progress...

NLP

In progress...

Future Developments

The Twitter Toolbox is an evolving project. We plan to continue adding new features as they are developed. Stay tuned for regular updates and improvements!

Contributions and Feedback

This toolbox is designed to grow with the contributions and feedback from the community. You are welcome to suggest new features, report any issues, or even submit pull requests. Let's collaborate to create the most valuable Twitter Toolbox possible!

Disclaimer

Please note that any use of the Twitter API, and of data retrieved through this toolbox, must comply with the Twitter Terms of Service, Developer Agreement, and Developer Policy, including Twitter's privacy policy. To support this, the project includes a dehydration script that reduces a dataset to tweet_id values only, so that it can be shared in line with Twitter's terms. Always de-identify the information and respect user privacy when sharing or publishing data.
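The idea behind dehydration can be sketched in a few lines: keep only the tweet identifier of each record and drop everything else. This is not the project's dehydration script; the function name `dehydrate` and the assumption that each record stores its identifier in a column called `tweet_id` are illustrative only.

```python
import csv
from typing import Iterable, Mapping, TextIO


def dehydrate(rows: Iterable[Mapping[str, str]], out: TextIO,
              id_column: str = "tweet_id") -> int:
    """Write only the ID column of each row to `out`; return the row count.

    `id_column` is an assumed column name, not confirmed by the project.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([id_column])  # header line
    count = 0
    for row in rows:
        writer.writerow([row[id_column]])  # every other field is dropped
        count += 1
    return count
```

With files on disk, the rows could come from `csv.DictReader(open("tweets.csv", newline=""))`, where `tweets.csv` is a placeholder file name. Anyone receiving the resulting ID list can rehydrate it through the Twitter API, subject to the same terms.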

Structure

The project is structured as follows:

├── data (Data is not stored in the repository)
├── src
│   ├── dataAcquisition
│   ├── preprocessing
│   └── nlp
├── docs
└──

Data is stored in the following structure:

├── data
│   ├── <scraping> (Scrape from user, hashtag or keyword)
│   │   ├── <user>
│   │   │   ├── <user>_<start>_<end>.csv
│   │   │   ├── <user>_<start>_<end>.csv
│   │   │   └── ...
│   │   ├── <user>
│   │   │   ├── <user>_<start>_<end>.csv
│   │   │   ├── <user>_<start>_<end>.csv
│   │   │   └── ...
│   │   └── ...
│   ├── <sample-stream> (Stream 1% of tweets)
│   │   ├── <date>.csv
│   │   ├── <date>.csv
│   │   └── ...
│   ├── <covid-github> (Scrape from Github and rehydrate)
│   │   ├── <date>.csv
│   │   ├── <date>.csv
│   │   └── ...
│   └──
└──
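The naming scheme for scraped data shown above can be expressed as a small path helper. This is a sketch under assumptions: the function name `scraping_csv_path` is hypothetical, the top-level folder is taken to be literally named `scraping`, and the start and end values are passed as preformatted strings because the layout does not specify a date format.

```python
from pathlib import Path


def scraping_csv_path(data_root: str, user: str, start: str, end: str) -> Path:
    """Build <data_root>/scraping/<user>/<user>_<start>_<end>.csv.

    The folder name "scraping" and the string format of `start` and `end`
    are assumptions for this sketch.
    """
    return Path(data_root) / "scraping" / user / f"{user}_{start}_{end}.csv"
```

For example, `scraping_csv_path("data", "nasa", "2023-01-01", "2023-01-31")` yields a path ending in `nasa/nasa_2023-01-01_2023-01-31.csv`, where `nasa` and the dates are placeholder values.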


sferez/Twitter_Toolbox

Folders and files

Latest commit

History

Repository files navigation

Twitter Toolbox

Table of Contents

Features

Articles

Data Acquisition

Preprocessing

NLP

Future Developments

Contributions and Feedback

Disclaimer

Structure

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages