Web Scraping Project

Developed by

Description

  • In this project I created a Python script to scrape technology news from Trybe's blog.

Stack

Development: Python, Docker, pymongo, beautifulsoup4 and MongoDB.

How to run the application with Docker (you need to have docker-compose already installed on your machine)

Clone the repository

  git clone git@github.com:Rafaqfg/web-scraping-project-Python.git

Enter the project folder

  cd web-scraping-project-Python

Create and activate the virtual environment for the project

  python3 -m venv .venv && source .venv/bin/activate

Install the dependencies

  python3 -m pip install -r dev-requirements.txt

📌 Note: If you get a red error message during the installation, repeat the previous step until the error message is gone.

Start the Docker containers using the compose file (port 27017 must be available)

  docker-compose up -d
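If the containers fail to start, the port requirement above is a likely cause. The following is a minimal sketch, not part of the project, that checks whether something is already listening on port 27017. The helper name `port_in_use` and the default host `127.0.0.1` are assumptions made for illustration.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds,
        # which means another process already holds the port.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    if port_in_use(27017):
        print("Port 27017 is already in use; stop the other service first.")
    else:
        print("Port 27017 looks free.")
```

Run it before `docker-compose up -d`; if the port is taken, a local MongoDB instance is a common culprit.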

Run the menu.py file

  python3 tech_news/menu.py

Enjoy scraping xD


📌 Note: The scraped website is entirely in Portuguese, so you need to write your searches in Portuguese.
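To illustrate what a Portuguese search could look like against MongoDB, here is a small sketch that builds a case-insensitive filter for a title search. It is an assumption about how such a query might be written, not the project's actual code: the helper name `build_title_query` and the `title` field name are hypothetical.

```python
import re


def build_title_query(title):
    """Build a case-insensitive MongoDB filter for a news title.

    Assumption: stored documents have a "title" field; the real schema may differ.
    """
    # re.escape keeps characters such as "?" or "+" from being read as regex syntax.
    return {"title": {"$regex": re.escape(title), "$options": "i"}}


# Example with a Portuguese search term, since the scraped site is in Portuguese.
query = build_title_query("Algoritmos")
print(query)  # {'title': {'$regex': 'Algoritmos', '$options': 'i'}}
```

A filter like this can be passed to pymongo's `Collection.find`, so "algoritmos" and "Algoritmos" would match the same documents.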

Steps of development

description finished
Create the fetch function ✔️
Create the function scrape_novidades ✔️
Create the scrape_next_page_link function ✔️
Create the scrape_noticia function ✔️
Create the get_tech_news function to get the news! ✔️
Create the function search_by_title ✔️
Create the function search_by_date ✔️
Create the function search_by_tag ✔️
Create the function search_by_category ✔️
Create the function top_5_news ✔️
Create the function top_5_categories ✔️
Create the menu function ✔️
Implement the menu features ✔️
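As a rough illustration of the ranking step in the list above, the sketch below picks the most frequent categories from a list of news items in plain Python. It is not the project's implementation, which may well rely on MongoDB instead. The helper name `rank_top_categories`, the `category` key, and the alphabetical tie-break are all assumptions.

```python
from collections import Counter


def rank_top_categories(news, limit=5):
    """Return up to `limit` category names, most frequent first.

    Assumptions: each item is a dict with a "category" key,
    and ties are broken alphabetically.
    """
    counts = Counter(item["category"] for item in news)
    # Sort by descending count, then by name so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [category for category, _ in ranked[:limit]]


sample = [
    {"title": "a", "category": "Carreira"},
    {"title": "b", "category": "Tecnologia"},
    {"title": "c", "category": "Tecnologia"},
]
print(rank_top_categories(sample))  # ['Tecnologia', 'Carreira']
```

Sorting on a `(-count, name)` key keeps the output stable when two categories have the same number of news items.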

GIF of the application