- In this project I created a Python script to scrape technology news from Trybe's blog.
Development: Python, Docker, pymongo, beautifulsoup4 and MongoDB.
How to run the application with Docker (you need to have docker-compose already installed on your machine)
Clone the repository
git clone git@github.com:Rafaqfg/web-scraping-project-Python.git
Enter the project folder
cd web-scraping-project-Python
Create and activate the virtual environment for the project
python3 -m venv .venv && source .venv/bin/activate
Install the dependencies
python3 -m pip install -r dev-requirements.txt
📌 Note: If you see a red error message during the installation, repeat the previous step until the error no longer appears.
Start the Docker containers using the compose file (port 27017 must be available)
docker-compose up -d
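If the containers fail to start, one possible cause is that port 27017 is already in use on your machine. The snippet below is a minimal sketch for checking that, using only the Python standard library. The helper name `port_is_free` and the default host `127.0.0.1` are assumptions made for this example; they are not part of the project.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing accepts connections on host:port.

    Hypothetical helper for illustration; it is not part of this project.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds,
        # which means some process is already listening on the port.
        return sock.connect_ex((host, port)) != 0


if __name__ == "__main__":
    print("port 27017 free:", port_is_free(27017))
```

Run it before starting the containers. Once they are up, the same check should report the port as in use, assuming the compose file publishes MongoDB on 27017.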
Run the menu.py file
python3 tech_news/menu.py
📌 Note: The scraped website is entirely in Portuguese, so you need to write your searches in Portuguese.
Description | Finished |
---|---|
Create the fetch function | ✔️ |
Create the function scrape_novidades | ✔️ |
Create the scrape_next_page_link function | ✔️ |
Create the scrape_noticia function | ✔️ |
Create the get_tech_news function to get the news! | ✔️ |
Create the function search_by_title | ✔️ |
Create the function search_by_date | ✔️ |
Create the function search_by_tag | ✔️ |
Create the function search_by_category | ✔️ |
Create the function top_5_news | ✔️ |
Create the function top_5_categories | ✔️ |
Create the menu function | ✔️ |
Implement the menu features | ✔️ |
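
As an illustration of how a search such as `search_by_title` could work on top of MongoDB, here is a minimal sketch that builds a case-insensitive filter using MongoDB's `$regex` operator. It is not the project's actual implementation: the helper names `build_title_query` and `matches`, the `title` and `url` field names, and the sample documents are all assumptions made for this example.

```python
import re


def build_title_query(title: str) -> dict:
    """Build a MongoDB filter for documents whose `title` contains the text.

    The field name `title` is an assumption for this sketch.
    """
    # re.escape keeps user input from being interpreted as a regex pattern;
    # the "i" option makes the match case-insensitive.
    return {"title": {"$regex": re.escape(title), "$options": "i"}}


def matches(query: dict, document: dict) -> bool:
    """Local stand-in for MongoDB's matching, for illustration only."""
    condition = query["title"]
    flags = re.IGNORECASE if "i" in condition.get("$options", "") else 0
    return re.search(condition["$regex"], document.get("title", ""), flags) is not None


# Hypothetical sample documents, written in Portuguese like the scraped site.
sample_news = [
    {"title": "Algoritmos e lógica de programação", "url": "https://example.com/a"},
    {"title": "Carreira em tecnologia", "url": "https://example.com/b"},
]

query = build_title_query("ALGORITMOS")
found = [(doc["title"], doc["url"]) for doc in sample_news if matches(query, doc)]
```

With pymongo, the same filter could be passed to `collection.find(build_title_query("algoritmos"))`, where `collection` is whichever collection holds the scraped news.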