Objective:
- This Python script demonstrates web scraping using Selenium from an e-commerce website, specifically "https://webscraper.io/test-sites/e-commerce/static/" or "https://www.saucedemo.com/inventory.html".
- It scrapes laptop, clothes and others e-commerce product data from multiple pages based on user input.
Tools Used:
- Selenium WebDriver: Automates web browser interaction.
- Python Libraries:
pandas: For data manipulation and DataFrame creation.
time: For adding delays between actions.
Steps:
WebDriver Setup:
- Imports necessary modules and sets up the WebDriver with Chrome. Navigation:
- Navigates to the target webpage. Data Scraping:
- Finds and clicks on elements to access laptop listings.
- Iterates over the specified number of pages.
- Extracts laptop data such as name, price, description, and rating. Data Processing:
- Stores scraped data in a list of dictionaries.
- Creates a pandas DataFrame from the list.
- Outputs the DataFrame to an Excel file named "selenium_scraping_data.xlsx".
Insights:
- Data Quantity: The script allows for scraping laptop data from multiple pages, providing a potentially large dataset.
- Data Quality: By extracting attributes like name, price, description, and rating, detailed information about each laptop can be collected.
- Automation: Utilizing Selenium WebDriver streamlines the scraping process, making it efficient and suitable for larger datasets.
- Data Export: The scraped data is saved in an Excel file, facilitating easy access and further analysis.
This script serves as a practical example of web scraping with Selenium, showcasing its capabilities for extracting structured data from dynamic web pages.