📚 Book Scraper

Scrape books like a boss! 🚀

Extract book data from books.toscrape.com with style

🎯 What's This?

A web scraper that grabs book info from books.toscrape.com, a demo bookstore built for scraping practice. Think of it as your personal digital librarian that works 24/7! 📖

✨ What You Get

📚 Book titles, prices, ratings
📊 Clean, organized data
💾 Export to CSV, Excel, JSON
🎯 Error handling & retries
⚡ Fast & respectful scraping
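
To give a feel for what "error handling & retries" with a polite delay can look like, here is a minimal sketch. It is not the project's actual code: `fetch_with_retries`, its parameters, and the linear backoff are assumptions made for illustration, and `fetch` stands in for whatever function downloads a page.

```python
import time


def fetch_with_retries(fetch, url, retries=3, delay=1.0, sleep=time.sleep):
    """Call fetch(url); on failure, pause and try again up to `retries` times.

    Hypothetical helper: the name, defaults, and backoff rule are assumptions,
    not taken from book_scraper.py.
    """
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return fetch(url)
        except Exception as exc:  # a real scraper would catch narrower errors
            last_error = exc
            if attempt < retries:
                # Linear backoff: wait delay, then 2 * delay, and so on.
                sleep(delay * attempt)
    raise RuntimeError(f"giving up on {url} after {retries} attempts") from last_error
```

Passing `sleep` as a parameter keeps the helper easy to test without actually waiting.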

🚀 Quick Start

# 1. Clone it
git clone <your-repo-url>
cd Book-scraping

# 2. Install stuff
pip install -r requirements.txt

# 3. Run it!
jupyter notebook book_scraper_optimized.ipynb

That's it! 🎉

🎮 How to Use

Option 1: Jupyter Notebook (Recommended)

jupyter notebook book_scraper_optimized.ipynb

Perfect for learning and experimenting

Option 2: Command Line

python book_scraper.py --max-pages 5   # on Windows, the py launcher works too

For the terminal warriors

Option 3: As a Module

from book_scraper import BookScraper
scraper = BookScraper()
data = scraper.scrape_all_books(max_pages=10)

For the code ninjas
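
Once you have the scraped data, exporting it needs nothing beyond the standard library. The sketch below assumes the data is a list of dicts with `title`, `price`, and `rating` keys; that shape, the helper names, and the sample rows are assumptions for illustration, not the documented return value of `scrape_all_books`.

```python
import csv
import io
import json


def to_csv_text(books):
    """Serialize a list of book dicts to CSV text (header taken from the first row)."""
    if not books:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(books[0].keys()))
    writer.writeheader()
    writer.writerows(books)
    return buffer.getvalue()


def to_json_text(books):
    """Serialize the same list to pretty-printed JSON, keeping symbols like £ readable."""
    return json.dumps(books, ensure_ascii=False, indent=2)


# Made-up rows, only to show the assumed shape of the data.
sample = [
    {"title": "Example Book", "price": 12.5, "rating": "Three"},
    {"title": "Another, Longer Title", "price": 20.0, "rating": "Five"},
]

if __name__ == "__main__":
    print(to_csv_text(sample))
    print(to_json_text(sample))
```

To write files instead of strings, pass the text to `open(path, "w", newline="", encoding="utf-8")` and call `write`.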

📊 Sample Output

📚 Total books: 1000
💰 Price range: £10.00 - £59.99
⭐ Rating Distribution:
  Three: 250 books
  Four: 200 books
  Five: 180 books
📦 Availability: 950 in stock, 50 out of stock
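
A summary like the one above can be computed in a few lines. This is a rough sketch, assuming each book is a dict with `price`, `rating`, and `in_stock` keys; those key names and the `summarize` helper are hypothetical and may not match what the scraper actually produces.

```python
from collections import Counter


def summarize(books):
    """Build summary statistics from a list of book dicts (assumed keys)."""
    prices = [book["price"] for book in books]
    in_stock = sum(1 for book in books if book["in_stock"])
    return {
        "total": len(books),
        "min_price": min(prices, default=None),  # None when the list is empty
        "max_price": max(prices, default=None),
        "ratings": dict(Counter(book["rating"] for book in books)),
        "in_stock": in_stock,
        "out_of_stock": len(books) - in_stock,
    }
```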

🛠️ What's Inside

Book-scraping/
├── 📄 book_scraper_optimized.ipynb  # Main notebook
├── 🐍 book_scraper.py               # CLI script
├── ⚙️ config.py                     # Settings
├── 🔧 utils.py                      # Helpers
├── 🧪 test_scraper.py               # Tests
└── 📋 requirements.txt              # Dependencies

🐛 Troubleshooting

Problem: ModuleNotFoundError: No module named 'bs4'
Solution: pip install -r requirements.txt

Problem: Website not responding
Solution: Check your internet connection

Problem: Rate limited
Solution: Increase delay: --delay 2

🤝 Contributing

1. Fork it 🍴
2. Create a branch 🌿
3. Make changes ✏️
4. Submit PR 🚀

Ideas welcome! 💡

⚠️ Disclaimer

For educational purposes only! Always be respectful when scraping websites. Use delays and follow robots.txt! 🤖
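
Checking robots.txt can be done with Python's built-in `urllib.robotparser`. The sketch below parses rules from a string so it runs offline; the `is_allowed` helper, the `BookScraper` user agent string, the example rules, and the example.com URLs are assumptions for illustration, not part of this project.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules, only to demonstrate the check.
EXAMPLE_RULES = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for path in ("/catalogue/page-1.html", "/private/page.html"):
        url = "https://example.com" + path
        print(url, is_allowed(EXAMPLE_RULES, "BookScraper", url))
```

Against a live site, `RobotFileParser.set_url()` followed by `read()` downloads the rules instead of parsing a string.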

🌟 Star the Repository

If you find this project helpful, please give it a ⭐ on GitHub!

📞 Connect & Support

Made with ❤️ and ☕ by Jonathan Thota

Scraping the web, one book at a time! 📖
