Simple framework: Dispatcher -> URL Manager -> URL Downloader -> Web Resolver (BeautifulSoup) -> result.html
- (Spider Adapter) spider_main.py
- (URL manager) url_manager.py
- (Downloader) html_downloader.py
- (Resolver) html_parser.py
- (Output result) html_outputer.py
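
As a rough illustration of the URL manager stage listed above, here is a minimal sketch. It is not the actual content of url_manager.py: the class name `UrlManager` and its method names are assumptions made for this example. The idea is to keep one set of URLs still to crawl and one set of URLs already crawled, so no page is fetched twice.

```python
class UrlManager:
    """Hypothetical URL manager: tracks pending and already-crawled URLs."""

    def __init__(self):
        self.new_urls = set()  # URLs waiting to be crawled
        self.old_urls = set()  # URLs that have already been handed out

    def add_new_url(self, url):
        # Ignore empty values and anything already known.
        if not url:
            return
        if url not in self.new_urls and url not in self.old_urls:
            self.new_urls.add(url)

    def add_new_urls(self, urls):
        # Accepts any iterable of URLs; None is treated as "no URLs".
        for url in urls or ():
            self.add_new_url(url)

    def has_new_url(self):
        return bool(self.new_urls)

    def get_new_url(self):
        # Move one URL from the pending set to the crawled set.
        url = self.new_urls.pop()
        self.old_urls.add(url)
        return url
```

Because sets are unordered, this sketch does not guarantee any crawl order; a queue could be used instead if order matters.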
Just run spider_main.py. The output is written to "result.html" in the current directory.
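
For orientation, the loop that the dispatcher drives might look roughly like the sketch below. This is an assumption about the overall flow, not the actual code in spider_main.py: the names `crawl`, `render_html`, `download`, `parse`, and `max_pages` are hypothetical. The downloader and resolver are passed in as plain functions so the sketch stays self-contained and needs no network access or BeautifulSoup.

```python
from collections import deque
from html import escape


def crawl(root_url, download, parse, max_pages=10):
    """Hypothetical dispatcher loop.

    download(url) -> page content, or None on failure
    parse(url, content) -> (iterable of new URLs, extracted data or None)
    """
    pending = deque([root_url])  # URLs still to visit, in discovery order
    seen = {root_url}            # every URL ever queued, to avoid repeats
    results = []
    while pending and len(results) < max_pages:
        url = pending.popleft()
        content = download(url)
        if content is None:
            continue  # skip pages that failed to download
        new_urls, data = parse(url, content)
        for new_url in new_urls:
            if new_url not in seen:
                seen.add(new_url)
                pending.append(new_url)
        if data is not None:
            results.append(data)
    return results


def render_html(results):
    """Render collected items as a simple HTML list (escaped)."""
    items = "\n".join("<li>%s</li>" % escape(str(r)) for r in results)
    return "<html><body><ul>\n%s\n</ul></body></html>" % items
```

In a real run, the string returned by `render_html` would be written to result.html, and `download` and `parse` would wrap an HTTP client and a BeautifulSoup-based resolver respectively.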