This repository contains code and data related to various experiments with Kabum Digital data, including the generation of JSON files and a podcast. Data is retrieved using the unofficial-kabum-digital-api, which must be running on your local machine.
- entity_rank.json: JSON file related to entity ranking.
- kabumPostsData.json: The main dataset used in these experiments, retrieved using the unofficial Kabum Digital API.
- podcast/: Contains audio files generated using the "tts.py" script.
- tts.py: Python script for text-to-speech (TTS) functionality, used to generate audio for the podcast.
- index.py: Generates all the JSON files.
- most_used_words_and_phrases.json: JSON file with data about the most used words and phrases.
- requirements.txt: Lists the required dependencies for the project.
- view_rank.py: Python script for viewing ranking information.
To get started with this project, you need to have the unofficial-kabum-digital-api running on your local machine. Make sure you have the necessary dependencies installed as specified in the requirements.txt file.
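Because the scripts depend on the API being up, it can help to confirm that something is listening locally before running them. The following is a minimal sketch that only tests whether a TCP port accepts connections. The helper name `is_port_open` is hypothetical, and the port number is a placeholder, since this README does not state which port the API uses; check the unofficial-kabum-digital-api documentation for the real value.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical placeholder: replace with the port your API instance reports.
    API_PORT = 3000
    if is_port_open("127.0.0.1", API_PORT):
        print("Something is listening on port", API_PORT)
    else:
        print("Nothing is listening on port", API_PORT, "- is the API running?")
```

An open port only shows that some process is listening there; it does not prove that the process is the Kabum API.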
To avoid interfering with your system-wide Python installation, it is good practice to create a virtual environment, which keeps this project's libraries isolated from your other Python libraries.
- Install virtualenv:
    pip install virtualenv
- Usage:
    python<version> -m venv <virtualenv-environment-name>
  Example:
    python3 -m venv kabum-experiments-env
- Activate the venv:
    source <virtualenv-environment-name>/bin/activate
- Deactivate the venv:
    deactivate

Install the dependencies:
    pip install -r requirements.txt

Here are some key usage instructions:
- Use index.py to generate all the necessary data (JSON files).
- Explore the data and insights in the kabumPostsData.json file, which is retrieved using the unofficial Kabum Digital API.
- The "podcast" folder contains audio files generated from the text data using the tts.py script.
- If you need to interact with Kabum's digital API, refer to unofficial-kabum-digital-api for more details.
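As a starting point for exploring the generated data, the sketch below loads a JSON file and reports its top-level shape. The schema of these files is not documented here, so the code makes no assumptions about field names; `summarize_json` is a hypothetical helper, not part of this repository.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Return a small description of the top-level structure of a JSON file."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)

    summary = {"type": type(data).__name__}
    if isinstance(data, dict):
        summary["size"] = len(data)
        # JSON object keys are always strings, so sorting them is safe
        summary["sample_keys"] = sorted(data)[:5]
    elif isinstance(data, list):
        summary["size"] = len(data)
        if data:
            summary["first_item_type"] = type(data[0]).__name__
    return summary


# Summarize whichever of the repository's JSON files exist in the current directory.
for name in (
    "kabumPostsData.json",
    "entity_rank.json",
    "most_used_words_and_phrases.json",
):
    if Path(name).exists():
        print(name, summarize_json(name))
```

Run it from the repository root. Once the top-level type and keys are known, you can drill into individual entries interactively.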