Commit

don't index javascript URIs and URL fragments
s0md3v authored May 1, 2019
1 parent 6a29f2c commit c960849
Showing 1 changed file with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions core/utils.py
@@ -38,6 +38,8 @@ def is_link(url, processed, files):
         bool If `url` should be crawled
     """
     if url not in processed:
+        if url.startswith('#') or url.startswith('javascript:'):
+            return False
         is_file = url.endswith(BAD_TYPES)
         if is_file:
             files.add(url)
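
To illustrate the effect of this change, here is a minimal, self-contained sketch of how `is_link` might behave after the commit. The hunk only shows part of the function, so several things here are assumptions rather than the project's actual code: the contents of `BAD_TYPES` are a hypothetical stand-in, and the `return` statements after `files.add(url)` are guessed from the docstring ("If `url` should be crawled").

```python
# Hypothetical stand-in for the project's BAD_TYPES constant; the real
# tuple is defined elsewhere and is not shown in this commit.
BAD_TYPES = ('.png', '.jpg', '.pdf', '.zip')


def is_link(url, processed, files):
    """Return True if `url` should be crawled (sketch, not the original)."""
    if url not in processed:
        # Added by this commit: skip in-page fragments and javascript: URIs.
        if url.startswith('#') or url.startswith('javascript:'):
            return False
        is_file = url.endswith(BAD_TYPES)
        if is_file:
            files.add(url)
            return False  # assumed: files are recorded, not crawled
        return True       # assumed: anything else is crawlable
    return False          # assumed: already-processed URLs are skipped


processed = {'https://example.com/seen'}
files = set()
candidates = [
    '#top',
    'javascript:void(0)',
    'https://example.com/report.pdf',
    'https://example.com/seen',
    'https://example.com/about',
]
crawlable = [u for u in candidates if is_link(u, processed, files)]
print(crawlable)
print(files)
```

Note that `str.startswith` is case-sensitive, so under this sketch a link written as `JavaScript:...` would not be filtered by the new check.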
