LoadMore Web Scraper 🚀

A Selenium & BeautifulSoup-based web scraper that automates dynamic content extraction from websites with "Load More" buttons. This script continuously clicks the button, scrapes article details, and stores them in an Excel file, ensuring incremental backups to prevent data loss.

📌 Features

✅ Automated Scrolling & Clicking – Handles infinite scrolling & dynamic "Load More" buttons.
✅ Customizable Target Website – Modify TARGET_URL to scrape different blogs or news sites.
✅ Data Backup & Export – Saves data incrementally in CSV & outputs the final data in Excel.
✅ Headless Mode – Runs in the background without opening a browser.
✅ Efficient & Scalable – Prevents infinite loops with a click limit (MAX_CLICKS).

📂 Project Structure

📦 Loadmore-Web-Scraper
│── 📄 scraper.py          # Main Python script for scraping
│── 📄 requirements.txt    # List of dependencies
│── 📄 README.md           # Project documentation
│── 📄 scraped_articles.xlsx  # Final extracted data (Generated after running the script)

🔧 Setup & Installation

1️⃣ Clone the Repository

git clone https://github.com/DataDiggerJay/Loadmore-Web-scrapper.git
cd Loadmore-Web-scrapper

2️⃣ Install Dependencies

Ensure you have Python 3.7+ installed. Then, run:

pip install -r requirements.txt

3️⃣ Run the Scraper

Simply execute the script:

python scraper.py

It will launch a headless browser, scrape articles, and store data in an Excel file (scraped_articles.xlsx).

🛠 Customization Guide

🔹 Change Target Website: Modify the TARGET_URL variable in scraper.py.
🔹 Scrape Different Data: Adjust the BeautifulSoup selectors to capture other elements.
🔹 Adjust Load More Clicks: Change MAX_CLICKS to control how much data is scraped.
🔹 Enable Browser View: Remove the --headless option in Selenium settings.

📊 Example Output (Excel File)

Title	href	Original Link
Example Blog Post 1	https://example.com/post1	https://example.com/post1
Example Blog Post 2	https://example.com/post2	https://example.com/post2

📜 License

This project is MIT Licensed – feel free to modify and use it for your own projects! 🎯

🤝 Contributing

Pull requests and improvements are welcome! Feel free to:
✅ Report Issues
✅ Add Features
✅ Improve Documentation

🌟 Show Your Support

If you found this project useful, please ⭐ Star this repository and Share it with others! 🚀

Happy Scraping! 🕵️‍♂️💻

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

LoadMore Web Scraper 🚀

📌 Features

📂 Project Structure

🔧 Setup & Installation

1️⃣ Clone the Repository

2️⃣ Install Dependencies

3️⃣ Run the Scraper

🛠 Customization Guide

📊 Example Output (Excel File)

📜 License

🤝 Contributing

🌟 Show Your Support

Files

README.md

Latest commit

History

README.md

File metadata and controls

LoadMore Web Scraper 🚀

📌 Features

📂 Project Structure

🔧 Setup & Installation

1️⃣ Clone the Repository

2️⃣ Install Dependencies

3️⃣ Run the Scraper

🛠 Customization Guide

📊 Example Output (Excel File)

📜 License

🤝 Contributing

🌟 Show Your Support