This repository contains Python scrapers for Carsandbids.com search results and product pages. The scrapers use the Crawlbase Crawling API to handle JavaScript rendering, CAPTCHA challenges, and anti-bot protections. The returned HTML is parsed with BeautifulSoup, and the extracted data is saved as JSON.
➡ Read the full blog here to learn more.
The Carsandbids.com Search Results Scraper (carsandbids_serp_scraper.py) extracts:
- Product Name
- Subtitle
- Auction Location
- Thumbnail
- Product Page Link
It also follows pagination automatically, so results beyond the first page are collected, and saves the extracted data to a JSON file.
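The pagination-and-save flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in carsandbids_serp_scraper.py: `scrape_all_pages`, `save_to_json`, the `fetch_page` callable, and the "stop at the first empty page" rule are all assumptions made for the example. How the real script moves from one page to the next is not covered here.

```python
import json
from typing import Callable, Dict, List


def scrape_all_pages(
    fetch_page: Callable[[int], List[Dict[str, str]]],
    max_pages: int = 5,
) -> List[Dict[str, str]]:
    """Collect listings page by page until a page comes back empty.

    `fetch_page` is a hypothetical callable that takes a 1-based page
    number and returns the listings parsed from that page.
    """
    listings: List[Dict[str, str]] = []
    for page in range(1, max_pages + 1):
        page_items = fetch_page(page)
        if not page_items:  # assumption: an empty page marks the end of results
            break
        listings.extend(page_items)
    return listings


def save_to_json(records: List[Dict[str, str]], path: str) -> None:
    """Write the collected records to a UTF-8 JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)
```

Passing the page fetcher in as a parameter keeps the loop independent of the network, so it can be exercised with canned data before a real token is used.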
The Carsandbids.com Product Page Scraper (carsandbids_product_page_scraper.py) extracts detailed car information, including:
- Auction Title
- Vehicle Description
- Image Gallery
- Current Bid
- Bid History
- Seller Information
It saves the extracted data in a JSON file.
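As a rough idea of how such fields could be pulled out with BeautifulSoup, here is a small sketch. The CSS selectors and the `parse_product_page` helper are hypothetical placeholders chosen for illustration; they are not taken from the live Carsandbids.com markup or from carsandbids_product_page_scraper.py, and would need to be replaced after inspecting the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical selectors: placeholders only, not the site's real markup.
SELECTORS = {
    "auction_title": "h1.auction-title",
    "current_bid": "span.bid-value",
    "seller": "div.seller a.user",
}


def parse_product_page(html: str) -> dict:
    """Extract a few fields from product-page HTML.

    Fields whose selector matches nothing are returned as None, so a
    layout change shows up as missing data rather than an exception.
    """
    soup = BeautifulSoup(html, "html.parser")
    data = {}
    for field, selector in SELECTORS.items():
        node = soup.select_one(selector)
        data[field] = node.get_text(strip=True) if node else None
    # Collect every gallery image URL (selector is again an assumption).
    data["images"] = [img["src"] for img in soup.select("div.gallery img[src]")]
    return data
```

The resulting dictionary can be passed directly to `json.dump` to produce the JSON output mentioned above.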
Ensure that Python is installed on your system. Check the version using:
# Use python3 if you're on Linux with Python 3 installed
python --version
Next, install the required dependencies:
pip install crawlbase beautifulsoup4
- Crawlbase – Handles JavaScript rendering and bypasses bot protections.
- BeautifulSoup – Parses and extracts structured data from HTML.
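To show where the Crawlbase library fits, the sketch below wraps a single page request. It assumes the response layout used in Crawlbase's published Python examples (a dictionary with a `headers` entry containing `pc_status` and a `body` entry holding the page bytes); `fetch_html` and `make_api` are hypothetical helper names, and the `ajax_wait` / `page_wait` values in the usage note are illustrative rather than the scripts' actual settings.

```python
from typing import Any, Optional


def fetch_html(api: Any, url: str, options: Optional[dict] = None) -> Optional[str]:
    """Request `url` through a Crawlbase client and return the HTML.

    Returns None when the reported page status is not "200".
    Assumes the response is a dict with "headers" and "body" keys.
    """
    response = api.get(url, options or {})
    if response["headers"].get("pc_status") != "200":
        return None
    body = response["body"]
    # The body is decoded if it arrives as bytes; a str is passed through.
    return body.decode("utf-8") if isinstance(body, bytes) else body


def make_api(token: str) -> Any:
    """Build a Crawlbase client (imported lazily so this file loads without it)."""
    from crawlbase import CrawlingAPI

    return CrawlingAPI({"token": token})
```

A call might look like `fetch_html(make_api(token), url, {"ajax_wait": "true", "page_wait": 5000})`, with the returned HTML then handed to BeautifulSoup for parsing.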
- Get Your Crawlbase Access Token
  - Sign up for Crawlbase here to get an API token.
  - Use the JS token for Carsandbids.com scraping, as the site uses JavaScript-rendered content.
- Update the Scraper with Your Token
  - Replace "CRAWLBASE_JS_TOKEN" in the script with your Crawlbase JS Token.
- Run the Scraper
# Use python3 if required (for Linux/macOS)
python SCRAPER_FILE_NAME.py
Replace "SCRAPER_FILE_NAME.py" with the actual script name (carsandbids_serp_scraper.py or carsandbids_product_page_scraper.py).
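For the token step above, an alternative to editing the script by hand is to read the token from an environment variable, which keeps it out of version control. This is only a sketch of that idea: the scripts as described expect the placeholder to be replaced in the source, and the `load_token` helper and the choice of `CRAWLBASE_JS_TOKEN` as the variable name are assumptions made for this example.

```python
import os

# The placeholder string that the setup steps say must be replaced.
PLACEHOLDER = "CRAWLBASE_JS_TOKEN"


def load_token(env_var: str = "CRAWLBASE_JS_TOKEN") -> str:
    """Return the Crawlbase JS token from the environment.

    Raises RuntimeError if the variable is unset, empty, or still
    equal to the unreplaced placeholder text.
    """
    token = os.environ.get(env_var, "").strip()
    if not token or token == PLACEHOLDER:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Crawlbase JS token."
        )
    return token
```

With this in place, the token could be exported in the shell once and the scraper run without modifying the file.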
- Expand scrapers to extract additional product details.
- Optimize data storage and export formats (e.g., JSON, database integration).
- Enhance scraper efficiency and speed.
- Bypasses anti-bot protections with Crawlbase.
- Handles JavaScript-rendered content seamlessly.
- Extracts accurate and structured product data efficiently.