There is an unresolved issue when parsing for URLs: a URL can bleed into the regular text that follows it (often because of rich text features such as tables).
For example,
https://www.example.com/index.html.Beginning_of_following_paragraph
which could be resolved by accepting only one period after the URL, except that
https://www.example.com/index.htmlBeginning_of_following_paragraph
would still not be resolved.
I think an easier solution might be to offer some optional cleaning functions for the dataframes that archivr produces, but there could be other ideas.
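As a rough illustration of what such a cleaning function could look like, here is a minimal sketch in Python. It is not part of archivr and does not use its API. The function name `trim_bled_url` and the extension list `KNOWN_EXTENSIONS` are hypothetical. The heuristic is an assumption: if the path contains a known file extension followed by anything other than `/`, `?`, or `#`, the rest is treated as bled-in text and cut off. That would cover both examples above, but it will miss URLs that have no file extension.

```python
import re

# Hypothetical list of file extensions to trust as "end of URL" markers.
# This is an assumption for illustration, not something archivr defines.
KNOWN_EXTENSIONS = ("html", "htm", "pdf", "php", "aspx")

# Longest extensions first so ".html" is preferred over ".htm".
_EXT = re.compile(
    r"\.(?:%s)" % "|".join(sorted(KNOWN_EXTENSIONS, key=len, reverse=True)),
    re.IGNORECASE,
)


def trim_bled_url(url):
    """Cut off text that has run into the end of a URL (heuristic)."""
    scheme_end = url.find("://")
    if scheme_end == -1:
        return url  # not something we recognise as a URL; leave it alone
    # Only look at the part after the host, so a host such as
    # "www.php.net" is never mistaken for a file extension.
    path_start = url.find("/", scheme_end + 3)
    if path_start == -1:
        return url
    match = _EXT.search(url, path_start)
    if match is None:
        return url
    rest = url[match.end():]
    # Nothing after the extension, or a legitimate continuation
    # (sub-path, query string, fragment): keep the URL unchanged.
    if rest == "" or rest[0] in "/?#":
        return url
    return url[:match.end()]


bled = [
    "https://www.example.com/index.html.Beginning_of_following_paragraph",
    "https://www.example.com/index.htmlBeginning_of_following_paragraph",
    "https://www.example.com/index.html?page=2",
]
cleaned = [trim_bled_url(u) for u in bled]
```

Because it only looks at the first known extension in the path and leaves everything else untouched, it seems safer as an opt-in step applied to the URL column after extraction than as a change to the parser itself.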