- Scrape the restaurant name, URL, dish name, and price of each menu item listed on Foodpanda & DeliverEat for a given location, and visualize the data using Metabase
- Please read this article before using this
- See example JSON output here
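To make the scraped fields concrete, here is a minimal sketch of one record as a plain Python structure. The class name `MenuItem`, the field names, and the sample values are all hypothetical; the project's actual Scrapy item and JSON keys may differ.

```python
import json
from typing import NamedTuple


class MenuItem(NamedTuple):
    """One scraped menu entry (hypothetical field names)."""
    restaurant_name: str
    restaurant_url: str
    dish_name: str
    price: float


# Placeholder values for illustration only, not real scraped data.
item = MenuItem(
    restaurant_name="Demo Kitchen",
    restaurant_url="https://example.com/restaurant/demo-kitchen",
    dish_name="Nasi Lemak",
    price=8.5,
)

# Serialize the record the way a JSON feed export might.
print(json.dumps(item._asdict()))
```

`typing.NamedTuple` is used here instead of `dataclasses` so the sketch stays compatible with Python 3.6.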
- Python 3.6+
- Install all the dependencies using pipenv
- Download and install Splash on your machine; it is used to scrape dynamically rendered content
Scraping every restaurant takes a while. For tryout, debugging, or development, prefix your command with DEBUG=True, e.g.:
# JSON output
DEBUG=True scrapy crawl foodpanda -o food_delivery_scrapy/output/foodpanda.json
# SQL output
DEBUG=True scrapy crawl foodpanda
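As a rough illustration of how a `DEBUG=True` prefix could shorten a crawl, the sketch below reads the environment variable and trims a list of restaurant URLs. The helper names `is_debug` and `limit_restaurants` and the `debug_limit` parameter are assumptions made for this example; the project's spiders may handle the flag differently.

```python
import os


def is_debug(environ=os.environ):
    """Return True when DEBUG is set to a truthy string such as 'True' or '1'."""
    return environ.get("DEBUG", "").strip().lower() in {"1", "true", "yes"}


def limit_restaurants(urls, environ=os.environ, debug_limit=5):
    """In debug mode keep only the first few restaurant URLs (hypothetical limit)."""
    urls = list(urls)
    return urls[:debug_limit] if is_debug(environ) else urls
```

Passing the environment as an argument keeps the helpers easy to test without touching the real process environment.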
Set PROXY_POOL_ENABLED = True in settings.py to use a proxy pool.
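For reference, the fragment below sketches what the relevant part of `settings.py` might look like. It assumes the project relies on the `scrapy-proxy-pool` package, which is not confirmed by this README; the middleware paths and priorities follow that package's documented setup and should be checked against the actual `settings.py`.

```python
# settings.py (fragment): assumes the scrapy-proxy-pool package is installed.
PROXY_POOL_ENABLED = True

DOWNLOADER_MIDDLEWARES = {
    # Picks a proxy from the pool for each request.
    "scrapy_proxy_pool.middlewares.ProxyPoolMiddleware": 610,
    # Detects banned proxies so they can be rotated out.
    "scrapy_proxy_pool.middlewares.BanDetectionMiddleware": 620,
}
```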
scrapy crawl foodpanda -o food_delivery_scrapy/output/foodpanda.json
# Get the URLs of all the available restaurants
scrapy crawl get_delivereat_restaurants
# Get the final data
scrapy crawl delivereat -o food_delivery_scrapy/output/delivereat.json
- Make sure PostgreSQL is running, then create the database:
createdb food_delivery_scrapy
- Make sure Splash is running
scrapy crawl get_delivereat_restaurants # You only need to run this once
scrapy crawl delivereat
scrapy crawl foodpanda
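Before running the crawl commands above, it can help to confirm that both services are reachable. The sketch below only checks that something accepts TCP connections on each port; it assumes PostgreSQL and Splash run on `localhost` with their default ports (5432 and 8050), which may not match your setup. The names `is_port_open`, `check_services`, and `SERVICES` are introduced here for illustration.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Assumed defaults: PostgreSQL on 5432, Splash on 8050, both on localhost.
SERVICES = {
    "postgresql": ("localhost", 5432),
    "splash": ("localhost", 8050),
}


def check_services(services=SERVICES):
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, (host, port) in services.items()}
```

Calling `check_services()` returns a dict such as `{"postgresql": ..., "splash": ...}` with a boolean per service. An open port does not prove the service is healthy, only that something is listening there.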