Feature: Deep Blacklisting #145

nautbot · 2018-03-31T03:00:41Z

Develop a deep blacklisting job/script to consume and process the XML/JSON feed created in #144. The technique is still to be determined.

See issue #20 for the original /u/nautbot functionality.

nautbot · 2018-03-31T03:08:16Z

Suggested solution per @jpleger:

it might actually be something that can be done in scrapy and splash
which isn't too much effort to set up either
https://scrapy.org/
https://github.com/scrapy-plugins/scrapy-splash
I haven't used either project in the last couple of years, but scrapy was pretty easy to work with, and splash adds js support
https://doc.scrapy.org/en/latest/topics/link-extractors.html#link-extractors
can use the link extractors to find all links on a website, and then write a simple middleware that logs all redirects after the frontier crawl
(first page that is)
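
A rough sketch of the idea above (collect every link on the first page, then record where each link redirects to). This is not Scrapy code: it uses only the Python standard library so it can run without extra setup, and in the suggested approach Scrapy's link extractors and a downloader middleware would take over these roles. The names `extract_links`, `redirect_chain`, `deep_scan`, and the `fetch` callable are all hypothetical, as is the `(status, location)` return shape assumed for `fetch`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

REDIRECT_CODES = (301, 302, 303, 307, 308)


class _LinkCollector(HTMLParser):
    """Collect raw href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute http(s) links from one page, in order, without duplicates."""
    collector = _LinkCollector()
    collector.feed(html)
    seen, links = set(), []
    for href in collector.hrefs:
        url = urljoin(base_url, href)
        if urlparse(url).scheme in ("http", "https") and url not in seen:
            seen.add(url)
            links.append(url)
    return links


def redirect_chain(url, fetch, max_hops=10):
    """Follow redirects starting at url and return every URL visited.

    fetch(url) is an assumed callable returning (status, location), where
    location is the Location header value or None. A real version would
    wrap an HTTP client with automatic redirects turned off.
    """
    chain = [url]
    for _ in range(max_hops):
        status, location = fetch(chain[-1])
        if status not in REDIRECT_CODES or not location:
            break
        next_url = urljoin(chain[-1], location)
        if next_url in chain:  # stop on a redirect loop
            break
        chain.append(next_url)
    return chain


def deep_scan(page_url, page_html, fetch):
    """Map each link on the first page to the chain of URLs it redirects through."""
    return {
        link: redirect_chain(link, fetch)
        for link in extract_links(page_html, page_url)
    }
```

Each chain could then be checked against the blacklist, so a shortened or redirecting link is judged by its final destination as well as by the URL that appears on the page. Pages that build links with JavaScript would still need something like splash, which this sketch does not cover.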

psineur · 2018-04-05T06:20:06Z

This needs re-triage to v1, v1.1, MVP+, or v3.
v2 was tech-only, stack/platform agnostic.

nautbot added the enhancement (New feature or request) and BIG (Bigger features/tasks that will take some time to implement) labels Mar 31, 2018

nautbot added this to the v.2 - Independent Release milestone Mar 31, 2018

nautbot modified the milestones: v.2 - Independent Release, v.3 - Convenience & UI Apr 5, 2018
