GitHub - hamed-elfayome/websites-Data-Extractor: collecting data from web pages and books using web scraping techniques.

This repository contains scripts for collecting data from web pages and books using web scraping techniques.

Web Scraping

Scraping Medical Data from a Website - pharmacy website

The collect_med.py script iterates over a range of item IDs, requests the webpage for each ID, and extracts attributes such as name, price, and company. The collected rows are written to a CSV file named items.csv.
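The ID-iteration approach described above could look roughly like the sketch below. It is not the code in collect_med.py: the URL pattern, the `class="name"` style markup, and the function names (`fetch_page`, `parse_item`, `collect`, `save_csv`) are all assumptions made for illustration, since this README does not give the real site or its HTML.

```python
import csv
import re
import urllib.request

# Hypothetical URL pattern; the real pharmacy site is not named in this README.
BASE_URL = "https://pharmacy.example.com/item/{id}"
FIELDS = ("name", "price", "company")


def fetch_page(item_id):
    """Download one item page; return None if the request fails (e.g. unused ID)."""
    try:
        with urllib.request.urlopen(BASE_URL.format(id=item_id), timeout=10) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except OSError:  # URLError and HTTPError are both subclasses of OSError
        return None


def parse_item(html):
    """Extract the fields, assuming markup like <span class="name">...</span>."""
    row = {}
    for field in FIELDS:
        match = re.search(rf'class="{field}"[^>]*>([^<]*)<', html)
        if match is None:
            return None  # page does not look like an item page
        row[field] = match.group(1).strip()
    return row


def collect(ids, fetch=fetch_page):
    """Iterate over IDs, skipping pages that are missing or cannot be parsed."""
    rows = []
    for item_id in ids:
        html = fetch(item_id)
        row = parse_item(html) if html else None
        if row is not None:
            rows.append({"id": item_id, **row})
    return rows


def save_csv(rows, path="items.csv"):
    """Write the collected rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=("id", *FIELDS))
        writer.writeheader()
        writer.writerows(rows)
```

Calling `save_csv(collect(range(1, 101)))` would try IDs 1 to 100 and write whatever was found. Passing `fetch` as a parameter keeps the parsing logic testable without network access. For a real site, an HTML parser is usually more robust than a regular expression.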

Scraping Book Metadata from Project Gutenberg - books website

The getbook.py script extracts metadata from books available on Project Gutenberg. It reads the plain-text file of each book and pulls out fields such as the title and language. The collected metadata is saved into a CSV file named books-dectionary.csv, and the full text of each book is saved as a separate text file in the Books folder.
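As a rough illustration of that flow, the sketch below reads the `Title:` and `Language:` header lines that Project Gutenberg plain-text files carry near the top. It is not the code in getbook.py: the URL pattern, the per-book file naming, and the function names (`fetch_text`, `parse_metadata`, `collect_books`) are assumptions.

```python
import csv
import urllib.request
from pathlib import Path

# Assumed URL pattern for plain-text editions; getbook.py may use a different one.
TEXT_URL = "https://www.gutenberg.org/cache/epub/{id}/pg{id}.txt"


def fetch_text(book_id):
    """Download the plain-text file of one book; return None on a network error."""
    try:
        with urllib.request.urlopen(TEXT_URL.format(id=book_id), timeout=30) as resp:
            # utf-8-sig drops a leading byte-order mark if the file has one
            return resp.read().decode("utf-8-sig", errors="replace")
    except OSError:
        return None


def parse_metadata(text, max_lines=100):
    """Read 'Title:' and 'Language:' from the header lines at the top of the file."""
    meta = {"title": "", "language": ""}
    for line in text.splitlines()[:max_lines]:
        key, sep, value = line.partition(":")
        key = key.strip().lower()
        if sep and key in meta and not meta[key]:
            meta[key] = value.strip()  # keep only the first match per field
    return meta


def collect_books(ids, fetch=fetch_text, books_dir="Books",
                  csv_path="books-dectionary.csv"):
    """Save each book's text under books_dir and its metadata to the CSV file."""
    out = Path(books_dir)
    out.mkdir(exist_ok=True)
    rows = []
    for book_id in ids:
        text = fetch(book_id)
        if text is None:
            continue  # skip IDs that could not be downloaded
        (out / f"{book_id}.txt").write_text(text, encoding="utf-8")
        rows.append({"id": book_id, **parse_metadata(text)})
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=("id", "title", "language"))
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

This sketch captures only the first line of a title, so titles that wrap onto a second header line are truncated. When downloading many books, adding a delay between requests is advisable.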

