URL Toolbox Update #15
Open
Adds update mechanism for #4. When executed, the ut_parse_extended lookup will attempt to update the Mozilla and IANA lists if the files have not been updated in 30 days. This likely also addresses #5. A custom lookup could be added that uses native WILDCARD lookup capabilities to extract the Mozilla TLD.
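The 30-day staleness check described above could look roughly like the sketch below. It is not the PR's actual code: the function name `needs_update`, the constant `MAX_AGE_DAYS`, and the choice to judge age by file modification time are all assumptions made for illustration.

```python
import os
import time

# Threshold taken from the PR description; the constant name is hypothetical.
MAX_AGE_DAYS = 30


def needs_update(path, max_age_days=MAX_AGE_DAYS, now=None):
    """Return True if the list file is missing or older than max_age_days.

    Age is measured from the file's modification time, which is an
    assumption about how "not updated in 30 days" is determined.
    """
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        # A missing or unreadable file is treated as needing a download.
        return True
    return (now - mtime) > max_age_days * 86400
```

A lookup script could call `needs_update()` once per list file before parsing and only attempt a download when it returns True, so most lookups need no network access.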
This PR changes the use of list="*" for the extended lookup: the list defaults to "iana" unless a value of "iana", "icann", "mozilla", or "custom" is given. The custom lookup can be modified to provide similar capabilities to list="*".
For #7, the use of the publicsuffixlist package fixes that issue, but the output is not the same as what @dbranger listed. If the TLD is not in the selected list, "None" is returned (accept_unknown=False). This could be modified to accept unknown TLDs, or to reject unknown TLDs only for the ut_tld field.
The ut_bayesian export is now a JSON object, so the values per ngram can be distinguished.
I recommend a version change to 1.10.0 or higher because of the changes above.
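The fallback for the list argument described above can be sketched as follows. This is an illustration only: `normalize_list_name`, `ALLOWED_LISTS`, and `DEFAULT_LIST` are hypothetical names, and the case-insensitive matching is an assumption that the PR description does not confirm.

```python
# Supported values and default, as listed in the PR description.
ALLOWED_LISTS = ("iana", "icann", "mozilla", "custom")
DEFAULT_LIST = "iana"


def normalize_list_name(value):
    """Return a supported list name, falling back to the default.

    Any other value, including the former list="*", maps to "iana".
    Trimming and lowercasing the input is an assumption of this sketch.
    """
    if isinstance(value, str):
        candidate = value.strip().lower()
        if candidate in ALLOWED_LISTS:
            return candidate
    return DEFAULT_LIST
```

With this shape, existing searches that still pass list="*" keep working but get the IANA list rather than every list.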