Project Screwpie

This is a toy web crawler that targets Douban groups and Zhihu.

#### The roadmap of this toy project may include the tasks below:

  1. Rewrite the core tag-searching implementation using bs4. (done)

  2. Crawl a step deeper: reveal the social graph of authors and their followers.

  3. Connect the crawler to a SQL database instead of the current CSV file. (done)

  4. Provide a progress bar. (done)

  5. Add multi-threading support.

  6. Add a front end to visualize the scraped data using D3.js (or another library).

  7. Decouple the original single file into classes. (done)
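
As a rough illustration of how roadmap items 3 and 5 might fit together, here is a minimal sketch that runs a fetch function on a thread pool and writes the results to SQLite. It is not the project's actual code: `crawl_all`, `save_posts`, `fake_fetch`, and the `posts` table are all hypothetical names, and `fake_fetch` stands in for a real HTTP request plus bs4 parsing.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def crawl_all(urls, fetch, max_workers=4):
    """Run fetch(url) for every URL on a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


def save_posts(conn, posts):
    """Insert (url, title) pairs into a hypothetical `posts` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts (url TEXT PRIMARY KEY, title TEXT)"
    )
    # INSERT OR REPLACE keeps a re-crawl from failing on duplicate URLs.
    conn.executemany("INSERT OR REPLACE INTO posts VALUES (?, ?)", posts)
    conn.commit()


def fake_fetch(url):
    # Assumption: placeholder for a real request and bs4 tag search.
    return (url, "title of " + url)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    urls = ["https://example.com/a", "https://example.com/b"]
    save_posts(conn, crawl_all(urls, fake_fetch))
    print(conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0])
```

Passing the fetch function in as an argument keeps the threading and storage code testable without network access. All database writes happen on the main thread after the pool finishes, which avoids sharing one SQLite connection across threads.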