
shishirdhakal123/Retail-Web-Scraper


Retail Web Scraper 🛒

This repository contains a sample web scraping script developed as part of my academic capstone project at Deakin University, Australia.

The project aimed to explore browser automation, data extraction techniques, and ethical scraping practices in a controlled, academic environment. This repository is for demonstration purposes only and is not intended for reuse, deployment, or execution.


🔍 Project Summary

  • Designed to simulate human browsing behavior to access publicly visible product data
  • Focused on techniques like session rotation, CAPTCHA handling, and API parsing
  • Data storage via MongoDB and export to structured formats
  • All scraping flows were tested within the bounds of ethical research and responsible automation
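
As an illustration of the "export to structured formats" step only, here is a minimal sketch that turns a list of already-stored product records into CSV and JSON text. It is not taken from `scraper_coles.py`: the function names, the field names (`name`, `price`), and the sample records are hypothetical, and it uses only the Python standard library rather than a live MongoDB connection.

```python
import csv
import io
import json


def records_to_csv(records, fieldnames):
    """Serialize a list of dict records to CSV text, keeping only `fieldnames`."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Missing fields become empty cells; extra fields (e.g. `_id`) are dropped.
        writer.writerow({key: record.get(key, "") for key in fieldnames})
    return buf.getvalue()


def records_to_json(records, fieldnames):
    """Serialize the same records to JSON text, keeping only `fieldnames`."""
    # Selecting fields explicitly avoids non-JSON-serializable values such as
    # a MongoDB ObjectId stored under `_id`.
    rows = [{key: record.get(key) for key in fieldnames} for record in records]
    return json.dumps(rows, indent=2)


# Hypothetical sample records, shaped like documents read back from a database.
sample_records = [
    {"_id": 1, "name": "Milk 2L", "price": 3.1},
    {"_id": 2, "name": "Bread"},
]

csv_text = records_to_csv(sample_records, ["name", "price"])
json_text = records_to_json(sample_records, ["name", "price"])
```

Passing an explicit field list keeps the exported columns stable even when individual records are missing values or carry extra database-only keys.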

⚙️ Technologies Used

  • Python
  • Selenium Wire & Undetected Chromedriver
  • Smartproxy (residential, rotating)
  • MongoDB
  • Chrome WebDriver

⚠️ Disclaimer and Legal Notice

This repository is made publicly visible only as a portfolio showcase of my technical and academic experience.

It must not be cloned, executed, or reused for scraping any real-world websites.

By viewing this repository, you agree to the following:

  • The author does not condone or promote illegal scraping or violation of any website's terms of service
  • This code is not provided as a tool or framework for others to scrape websites
  • The repository excludes configuration files, credentials, and execution dependencies
  • The author is not liable for any misuse or unauthorized use of the code

📂 Repository Structure

Retail-Web-Scraper/
├── scraper_coles.py   # Main academic scraper script
├── .gitignore         # Excludes sensitive/confidential files

👤 Author

Shishir Dhakal
📍 Melbourne, Australia
🎓 Postgraduate Student – IT Management
🌐 shishirdhakal.com
🔗 LinkedIn

For a full walkthrough and explanation, see the blog:
https://shishirdhakal.com/coles-web-scraper-project/
