A distributed Sina Weibo Search spider based on Scrapy, Redis and MongoDB. From each crawled page it extracts user info, repost (forward) info, pictures and more.
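The sketch below illustrates the kind of item such a spider could yield per post; the field names here are assumptions for illustration, not the project's actual items.py.

```python
# Hypothetical item sketch; field names are assumptions, not the real items.py.
import scrapy

class WeiboPostItem(scrapy.Item):
    keyword = scrapy.Field()        # search keyword that matched this post
    user_id = scrapy.Field()        # author info
    user_name = scrapy.Field()
    content = scrapy.Field()        # post text
    repost_count = scrapy.Field()   # forward (repost) info
    image_urls = scrapy.Field()     # picture URLs found in the post
    published_at = scrapy.Field()
```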
## Reference: scrapy-redis
$ sudo apt-get install mongodb
$ sudo apt-get install redis-server
$ sudo pip install pymongo
$ sudo pip install -r requirements.txt
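For orientation, this is roughly how Scrapy, Redis and MongoDB are usually wired together in settings.py. Only the `scrapy_redis` settings are standard options from that library; the pipeline class name and the Mongo settings are assumptions, not necessarily what this repository uses.

```python
# Minimal sketch of a scrapy-redis + MongoDB configuration (settings.py).
SCHEDULER = "scrapy_redis.scheduler.Scheduler"               # queue requests in Redis
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"   # share dedup state via Redis
SCHEDULER_PERSIST = True                                     # keep the queue between runs

REDIS_HOST = "localhost"
REDIS_PORT = 6379

ITEM_PIPELINES = {
    "weibosearch.pipelines.MongoPipeline": 300,              # assumed pipeline name
}
MONGO_URI = "mongodb://localhost:27017"                      # assumed setting names
MONGO_DATABASE = "weibosearch"
```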
- Put your keywords in items.txt (this is only a simple test setup); you can also read keywords from MySQL, as in the sketch after this list.
$ scrapy crawl weibosearch -a username=your_weibo_account -a password=your_weibo_password
- You can test the parsing logic locally; see weibosearch/spiders/tests.py for details (a minimal example follows this list).
- To scale out, add another spider instance with `scrapy crawl weibosearch -a username=another_weibo_account -a password=another_weibo_password`.
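Reading keywords from MySQL could look roughly like the following; the table and column names, connection parameters, and the pymysql dependency are assumptions for illustration.

```python
# Hedged sketch: load search keywords from MySQL instead of items.txt.
import pymysql

def load_keywords(host="localhost", user="root", password="", database="weibo"):
    conn = pymysql.connect(host=host, user=user, password=password, database=database)
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT keyword FROM keywords")  # assumed table/column
            return [row[0] for row in cur.fetchall()]
    finally:
        conn.close()
```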
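And a local parsing test, in the spirit of weibosearch/spiders/tests.py, might be structured like this; the spider module path, class name, constructor arguments and fixture path are all assumptions.

```python
# Hedged sketch: exercise the parse logic against a saved page, no login or network needed.
from scrapy.http import HtmlResponse
from weibosearch.spiders.weibosearch import WeibosearchSpider  # assumed module/class

def test_parse_search_page():
    with open("fixtures/search_result.html", "rb") as f:       # a locally saved result page
        body = f.read()
    response = HtmlResponse(url="http://s.weibo.com/weibo/test",
                            body=body, encoding="utf-8")
    spider = WeibosearchSpider(username="test", password="test")  # assumed constructor args
    items = list(spider.parse(response))
    assert items, "the parser should yield at least one item from the fixture"
```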