- Web scraping via API requests using the Scrapy framework
- Extracts mobile phone prices and details from Lazada Malaysia
- Scraping framework: Scrapy
- Scrapes through the API and extracts the data from the JSON responses
- See the generated output in scraped data
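The JSON-extraction step can be sketched roughly as follows. This is a minimal illustration, not the code in main.py: the payload layout (`mods` → `listItems`, with `name` and `price` fields) and the helper name `extract_phones` are assumptions made for the example, and the real API response may be shaped differently.

```python
import json


def extract_phones(payload):
    """Yield one flat record per product found in a listing payload.

    Assumed (hypothetical) layout:
    {"mods": {"listItems": [{"name": ..., "price": ...}, ...]}}
    """
    items = payload.get("mods", {}).get("listItems", [])
    for item in items:
        # .get() keeps the spider running if a field is missing.
        yield {
            "name": item.get("name"),
            "price": item.get("price"),
        }


# Stand-in for the body of an API response (made-up sample data).
sample = json.loads(
    '{"mods": {"listItems": [{"name": "Phone A", "price": "499.00"}]}}'
)
records = list(extract_phones(sample))
print(records)
```

Inside a Scrapy callback, the parsed payload would typically come from `response.json()`, and each record would be yielded as an item.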
- Scrapy 2.7.1
- Python 3.7 or above
- Any environment manager that can install the required packages, such as conda or pyenv.
- The working directory is Lazada-Web-Scraping-Handphone-Price-List/lazada_phone_listing/lazada_phone_listing/spiders
- Activate the installed working environment
- Run main.py from the working directory.
- Run 'scrapy runspider main.py' in a terminal from the working directory.
- Add -O lazada_mobilephone_list.csv on the command line to produce the CSV file, e.g. 'scrapy runspider main.py -O lazada_mobilephone_list.csv'
- Clone the repository
git clone https://github.com/allifizzuddin89/Lazada-Web-Scraping-Handphone-Price-List.git
- Create a working environment (skip this step if you already have one)
conda create --name scraping_env -c conda-forge python=3.9.13 scrapy=2.7.1
- Activate the working environment
conda activate scraping_env
- Run the spider
scrapy runspider Lazada-Web-Scraping-Handphone-Price-List/lazada_phone_listing/lazada_phone_listing/spiders/main.py -O lazada_mobilephone_list.csv
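Once the spider has finished, a quick sanity check of the exported file can look like the sketch below. It only uses the standard library; the helper name `summarize_csv` is hypothetical, and the column names depend on whatever fields the spider yields.

```python
import csv
import os


def summarize_csv(path):
    """Return (column names, number of data rows) for a CSV file."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        return list(reader.fieldnames or []), len(rows)


if __name__ == "__main__":
    # File name taken from the run command above.
    output = "lazada_mobilephone_list.csv"
    if os.path.exists(output):
        columns, count = summarize_csv(output)
        print(f"{count} rows, columns: {columns}")
    else:
        print(f"{output} not found; run the spider first")
```

An empty file or a row count of zero usually means the request was rejected or the response layout has changed.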
- Errors may occur because the cookies have expired, the server rejected the request, or the administrator changed the URL.
- Bear in mind that the site owner can change the website's code at any time, so this scraping code may stop working.
- Solution:
- Refresh the cookies (if any), OR
- Use a proxy (refer to main.py), OR
- Replace the URL with the new one
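The three fixes above all come down to changing what is passed to the request. A minimal sketch, assuming the spider builds its requests with `scrapy.Request`: the helper name `build_request_kwargs`, the example URL, the cookie, and the proxy address are all placeholders for illustration, not values taken from main.py.

```python
def build_request_kwargs(url, cookies=None, proxy=None):
    """Collect keyword arguments for scrapy.Request in one place.

    url     -- the (possibly updated) API endpoint
    cookies -- fresh cookies copied from a browser session, if needed
    proxy   -- proxy address; Scrapy's HttpProxyMiddleware reads it
               from request.meta["proxy"]
    """
    kwargs = {"url": url}
    if cookies:
        kwargs["cookies"] = dict(cookies)
    if proxy:
        kwargs["meta"] = {"proxy": proxy}
    return kwargs


# Placeholder values only; substitute the real URL, cookies, and proxy.
kwargs = build_request_kwargs(
    "https://example.com/api/listing?page=1",
    cookies={"session_id": "replace-me"},
    proxy="http://127.0.0.1:8080",
)
print(kwargs)
```

In a spider this would be used as `yield scrapy.Request(callback=self.parse, **kwargs)`, so that a new URL, refreshed cookies, or a different proxy only needs to be changed in one spot.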
- This work is meant for educational, research, and proof-of-work purposes only.
- I will not be responsible for any illegal activities.
- Every action you take is your own responsibility.