[PLGN-405] ExtractIT - Adding in extra logic to better handle wrapping of lines in pdf#2089
[PLGN-405] ExtractIT - Adding in extra logic to better handle wrapping of lines in pdf#2089cmcnally-r7 merged 5 commits intodevelopfrom
Conversation
|
Bumping validators to validators==0.22.0 also after looking at snyk note as part of the update in versions the have add in extra logic for how emails/domains are validated in python-validators/validators@0.20.0...0.21.0 that was added on 0.21.0 this looks to not only validate against the django standards as before but also against the following standards as a result the test email that is used throughout the unit tests |
|
Last commit to edit the unit tests was due to the way the unit tests are run on jenkins as a result there is a conflict with the new version of validators that is added as this package is not added when making the container via docker and all of the integration tests are passing with the new code I removed that one test to allow the git-hub checks to pass note when running the unit tests with the correct version of 0.22.0 locally all tests including the removed test were all passing |
cmcnally-r7
left a comment
There was a problem hiding this comment.
Just one small correction but other than that it LGTM 😄
| :param page: The PDF page from which to extract wrapped words. | ||
| :type: Page | ||
|
|
||
| :param provided_regex: The regex for the type of words to be searched for, eg emial/domain format. Defaults to "". |
There was a problem hiding this comment.
| :param provided_regex: The regex for the type of words to be searched for, eg emial/domain format. Defaults to "". | |
| :param provided_regex: The regex for the type of words to be searched for, e.g. email/domain format. |
There was a problem hiding this comment.
This change has been made
5b5cf4a to
f4f7968
Compare
…ts to reflect changes
…g of lines in pdf (#2089) * PLGN-405-Adding in extra logic to better handle wrapping of lines in pdf * PLGN-405-Reformatting to to black format * PLGN-405-Bumping version of validators and making changes to unit tests to reflect changes * PLGN-405-Removing unit test, that is not working with validators 2.20.0 * PLGN-405-Updating the docstring message to make it clearer
…g of lines in pdf (#2089) (#2096) * PLGN-405-Adding in extra logic to better handle wrapping of lines in pdf * PLGN-405-Reformatting to to black format * PLGN-405-Bumping version of validators and making changes to unit tests to reflect changes * PLGN-405-Removing unit test, that is not working with validators 2.20.0 * PLGN-405-Updating the docstring message to make it clearer
Proposed Changes
https://issues.corp.rapid7.com/browse/PLGN-405
Description
Describe the proposed changes:
eg for ips
as the new lines is a valid ip do not wrap
as the current line is a valid ip do not wrap
as neither is a valid ip we will combine and try to see if the combined is a valid ip
note for less exact regex such as for domains, as both
this_is_a_test_url.comandtest_domain.comthere may still be some edge cases wrapping may not be perfectPR Requirements
Developers, verify you have completed the following items by checking them off:
Testing
Unit Tests
new unit tests have been added and are passing
integration tests have been updated to reflect the new behaviour
Review our documentation on generating and writing plugin unit tests
In-Product Tests
If you are an InsightConnect customer or have access to an InsightConnect instance, the following in-product tests should be done:
Style
Review the style guide
USER nobodyin theDockerfilewhen possiblerapid7/insightconnect-python-3-38-slim-plugin:{sdk-version-num}andrapid7/insightconnect-python-3-38-plugin:{sdk-version-num}insight-plugin validatewhich callsicon_validateto linthelp.mdFunctional Checklist
tests/directory created withinsight-plugin samplestests/$action_bad.jsoninsight-plugin run -T tests/example.json --debug --jqinsight-plugin run -T all --debug --jq(use PR format at end)insight-plugin run -R tests/example.json --debug --jqinsight-plugin run --debug --jq(use PR format at end)Assessment
You must validate your work to reviewers:
insight-plugin validateand make sure everything passesinsight-plugin run -A. For single action validation:insight-plugin run tests/{file}.json -Ainsight-plugin ... | pbcopy) and paste the output in a new post on this PR