These methods in particular need serious refactoring because they have become monsters and have serious inconsistencies:
crawl(), the main method that starts the crawling/extraction process.
downloadURL() in the commandImpl class in the commandShell module (separating commandImpl from commandShell is already an issue, see previous postings). This has become a monster in terms of LOC.
render() method in htmlRendering. This method 1) has obsolete parameters, 2) receives its parameters in different ways (some via instance variables, some via arguments), and 3) is too large.
apply() method in xRules.py
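
To make the render() complaint concrete, here is a minimal sketch of one possible direction, not the actual htmlRendering code. It assumes all settings can be passed through a single options object instead of a mix of instance variables and arguments, and that the large method can be split into small helpers. Every name in it (RenderOptions, HtmlRenderer, title, indent, escape) is hypothetical and chosen only for illustration.

```python
import html
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderOptions:
    """Hypothetical bundle of everything render() needs; field names are illustrative."""
    title: str = ""
    indent: int = 2
    escape: bool = True


class HtmlRenderer:
    """Illustrative stand-in, not the project's htmlRendering class."""

    def render(self, items, options=RenderOptions()):
        # All settings arrive through one argument; nothing is read
        # from instance variables, so there is a single way to pass them.
        lines = [self._render_header(options)]
        lines += [self._render_item(item, options) for item in items]
        return "\n".join(lines)

    def _render_header(self, options):
        # Small helper: one responsibility, easy to test on its own.
        return "<h1>%s</h1>" % self._text(options.title, options)

    def _render_item(self, item, options):
        return " " * options.indent + "<p>%s</p>" % self._text(item, options)

    @staticmethod
    def _text(value, options):
        # Escape HTML special characters unless the caller opted out.
        return html.escape(value) if options.escape else value
```

With this shape, an obsolete parameter is simply a field removed from RenderOptions, and callers that never set it are unaffected. The same split into small helpers could plausibly be applied to crawl(), downloadURL() and apply(), but that would need to be checked against the real code.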