Web scraping robot API using Playwright with Flask (Python)

horus_api

Web scraping robot API using Playwright with Flask (Python)


About the project | Technologies | How to execute it | Images

A web scraping challenge that uses the Playwright library to collect laptop data from the website https://webscraper.io/test-sites/e-commerce/allinone/computers/laptops, save it to a database, and return it through an API.

Features of this project:

* Web scraping with Playwright;
* Data insertion into MongoDB;
* Route to save data to MongoDB;
* Route to get data from MongoDB.
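
The scraping step listed above could look roughly like the sketch below. It is not the code from this repository: the `Laptop` dataclass, the helper names, and the CSS selectors (`.thumbnail`, `.title`, `.price`, `.description`) are assumptions about the test site's markup and should be checked against the live page.

```python
from dataclasses import dataclass, asdict

LAPTOPS_URL = "https://webscraper.io/test-sites/e-commerce/allinone/computers/laptops"


@dataclass
class Laptop:
    """Hypothetical record shape; the real project may store other fields."""
    title: str
    price: float
    description: str


def parse_price(text: str) -> float:
    """Turn a price label such as '$1,099.00' into a float."""
    return float(text.strip().lstrip("$").replace(",", ""))


def scrape_laptops(url: str = LAPTOPS_URL) -> list[dict]:
    """Collect one dict per product card found on the page."""
    # Imported lazily so the helpers above work without Playwright installed.
    from playwright.sync_api import sync_playwright

    laptops = []
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)
        # Assumed selectors: verify them against the page before relying on them.
        for card in page.locator(".thumbnail").all():
            title_link = card.locator(".title")
            laptop = Laptop(
                title=title_link.get_attribute("title") or title_link.inner_text(),
                price=parse_price(card.locator(".price").inner_text()),
                description=card.locator(".description").inner_text(),
            )
            laptops.append(asdict(laptop))
        browser.close()
    return laptops
```

Returning plain dicts keeps the result ready for insertion into MongoDB. Playwright also needs its browser binaries, which are normally installed with `playwright install`.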

🟩 PROJECT STATUS: FINISHED


  • Python
  • Flask
  • MongoDB
  • Playwright

- Clone the repo with the following command: git clone https://github.com/renatamoon/horus_api.git


On Windows:

- Create your virtual environment: python -m venv venv
- Activate your virtual environment: . venv\Scripts\Activate.ps1
Note: If activation fails with an error, run the following command in PowerShell: Set-ExecutionPolicy -Scope CurrentUser -ExecutionPolicy RemoteSigned
- Install the requirements with the command: pip install -r requirements.txt


On Linux:

- Create your virtual environment: python -m venv venv
- Activate your virtual environment: source venv/bin/activate
- Install the requirements with the command: pip install -r requirements.txt


Create a .env file in the project root and set the connection values to match your own MongoDB instance:

  • Create a database on your local MongoDB: laptop_database
  • Create a collection on your local MongoDB: laptop_collection
MONGO_CONNECTION_URL="mongodb+srv://user:password@clustername.xxxxx.mongodb.net/?retryWrites=true&w=majority"
MONGODB_DATABASE_NAME="laptop_database"
MONGODB_COLLECTION="laptop_collection"
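
As an illustration of how these three values might be consumed, here is a minimal sketch that reads the .env file with the standard library only and opens the configured collection. The function names are hypothetical, and the project itself may load its settings differently (for example with a dotenv-style package); only the variable names come from the file above.

```python
import os
from pathlib import Path

REQUIRED_KEYS = ("MONGO_CONNECTION_URL", "MONGODB_DATABASE_NAME", "MONGODB_COLLECTION")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY="value" lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only: the connection URL contains more of them.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(env_path: str = ".env") -> dict[str, str]:
    """Merge the .env file with real environment variables (which take priority)."""
    path = Path(env_path)
    values = parse_env_text(path.read_text()) if path.exists() else {}
    values.update({k: os.environ[k] for k in REQUIRED_KEYS if k in os.environ})
    missing = [k for k in REQUIRED_KEYS if not values.get(k)]
    if missing:
        raise RuntimeError(f"Missing settings: {', '.join(missing)}")
    return values


def get_collection(settings: dict[str, str]):
    """Open the configured collection; pymongo is imported lazily."""
    from pymongo import MongoClient

    client = MongoClient(settings["MONGO_CONNECTION_URL"])
    return client[settings["MONGODB_DATABASE_NAME"]][settings["MONGODB_COLLECTION"]]
```

With this in place, `get_collection(load_settings())` would return a collection object on which `insert_many` and `find` can be called. Failing early on a missing key makes a misconfigured .env file easier to spot than a connection error later on.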

  • To run the application, use the command: uvicorn main:app --reload

  • First, use the following route to save the data to your database: http://{your-host}/put/save_laptops
  • Then use the route http://{your-host}/get_all_laptops to get all the data from the database.
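
The two routes can also be called from a script. The sketch below uses only the standard library; the helper names are hypothetical, and the HTTP method for each route is an assumption, since this README does not state which methods the routes accept.

```python
from urllib.request import Request, urlopen


def build_url(host: str, route: str) -> str:
    """Join a host such as '127.0.0.1:8000' with a route path."""
    return f"http://{host.strip('/')}/{route.lstrip('/')}"


def call_route(host: str, route: str, method: str = "GET", timeout: float = 120.0) -> str:
    """Send a request to the API and return the raw response body as text."""
    request = Request(build_url(host, route), method=method)
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def save_then_list(host: str, save_method: str = "GET") -> str:
    """Run the scrape-and-save route first, then fetch everything back."""
    # save_method is a guess: switch it to "PUT" or "POST" if the server
    # answers with 405 Method Not Allowed.
    call_route(host, "/put/save_laptops", method=save_method)
    return call_route(host, "/get_all_laptops")
```

For a local run, `save_then_list("127.0.0.1:8000")` would be a reasonable first try, assuming the server listens on uvicorn's default address. The generous timeout is there because the save route has to scrape the site before it can respond.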

  • Expected return of the route /put/save_laptops :

img.png

  • Expected return of the route /get_all_laptops :

img_1.png
