Bugfix for FlairTagger when using certain input formats. #62

Merged
merged 8 commits into from
Apr 9, 2024

raykyn
Contributor

@raykyn raykyn commented Apr 4, 2024

What's in the PR

  • In the old version of the script, I worked around the whitespace problem in a rather hacky way. This commit cleans that up, which fixes a bug that occurred for me with certain input formats where the hack failed to work properly.
  • The old implementation also restricted the flair tagger to tagging only tokens as defined by whitespace; this is now fixed as well.

How to test manually

  • E.g. import a text as CAS XMI in INCEpTION and observe that punctuation is not part of the span tagged by the recommender.

Automatic testing

  • PR includes unit tests
  • see my last PR about not being able to get the testing environment running

Documentation

  • PR updates documentation

…h flair 0.13.1 (both can use more-itertools 0.8.14 now)
… with certain input formats. It would also limit the flair tagger to only tag around sentence boundaries. This commit replaces the hack with a conversion from the CAS tokens to flair token objects, which is much cleaner. Multiple whitespaces between two tokens can now also be processed properly.
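
The idea behind the conversion can be illustrated with a minimal, library-free sketch. It is not the code from this PR and deliberately uses neither flair nor cassis: `SimpleToken`, `tokens_from_offsets` and `span_to_offsets` are hypothetical names, and the `(begin, end)` offsets stand in for the token annotations a CAS would provide. The sketch assumes that the gap between two consecutive tokens consists only of whitespace.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SimpleToken:
    """Hypothetical stand-in for a tagger token; not a flair or cassis class."""
    text: str
    start: int             # character offset in the original document text
    whitespace_after: int  # number of characters between this token and the next


def tokens_from_offsets(text: str, spans: List[Tuple[int, int]]) -> List[SimpleToken]:
    """Build tokens directly from (begin, end) offsets instead of splitting on whitespace."""
    tokens = []
    for i, (begin, end) in enumerate(spans):
        # The gap to the next token is taken from the offsets, so runs of
        # several whitespace characters are preserved rather than collapsed.
        next_begin = spans[i + 1][0] if i + 1 < len(spans) else end
        tokens.append(SimpleToken(text[begin:end], begin, next_begin - end))
    return tokens


def span_to_offsets(tokens: List[SimpleToken], first: int, last: int) -> Tuple[int, int]:
    """Map an inclusive token-index span back to character offsets in the document."""
    return tokens[first].start, tokens[last].start + len(tokens[last].text)


text = "Hello   world, again."
# Assumed token offsets, as a tokenizer annotation layer might supply them.
spans = [(0, 5), (8, 13), (13, 14), (15, 20), (20, 21)]
tokens = tokens_from_offsets(text, spans)
begin, end = span_to_offsets(tokens, 1, 1)
```

Because "world" and "," are separate tokens here, a span predicted for token 1 maps back to the offsets of "world" alone, without the trailing comma, and the three spaces after "Hello" do not shift any offsets.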

codecov bot commented Apr 4, 2024

Codecov Report

Attention: Patch coverage is 0%; 28 lines in your changes are missing coverage. Please review.

Project coverage is 51.62%. Comparing base (9cca892) to head (6219a19).

Files                       Patch %   Lines
ariadne/contrib/flair.py    0.00%     28 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #62      +/-   ##
==========================================
- Coverage   51.96%   51.62%   -0.35%     
==========================================
  Files          24       24              
  Lines         889      895       +6     
==========================================
  Hits          462      462              
- Misses        427      433       +6     


@reckart
Member

reckart commented Apr 9, 2024

> see my last PR about not being able to get the testing environment running

Do you still have that problem?

@reckart reckart added the 🐛Bug Something isn't working label Apr 9, 2024
@reckart reckart added this to the 0.1.0 milestone Apr 9, 2024
@reckart reckart merged commit 02a7f7e into inception-project:main Apr 9, 2024
4 of 6 checks passed
@reckart
Member

reckart commented Apr 9, 2024

@raykyn When you open a PR, please:

  • create an issue
  • make sure you have the latest version of our main branch
  • create a new branch e.g. bugfix/XXX-name-of-issue from our main branch
  • when you commit, please use the following format for your commit messages:
#XXX - Name of issue

- change 1
- change 2
- ...

That would greatly facilitate the merge process for me.
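
For contributors who want to check a message before committing, the format above could be verified with a small script along these lines. This is only a sketch under stated assumptions, not a tool from this repository: `follows_convention` is a hypothetical helper, `XXX` is assumed to be a numeric issue number, and a message consisting of the header line alone is assumed to be acceptable.

```python
import re

# Header line: "#<issue number> - <name of issue>" (numeric issue number is an assumption).
HEADER = re.compile(r"^#\d+ - \S.*$")


def follows_convention(message: str) -> bool:
    """Return True if the message matches the '#XXX - Name of issue' layout."""
    lines = message.strip("\n").split("\n")
    if not HEADER.match(lines[0]):
        return False
    if len(lines) == 1:
        # Header only; treated as valid here (assumption).
        return True
    if lines[1].strip() != "":
        # A blank line must separate the header from the list of changes.
        return False
    body = [line for line in lines[2:] if line.strip()]
    # Every non-empty body line must be a "- change" bullet.
    return bool(body) and all(line.startswith("- ") for line in body)
```

For example, `follows_convention("#123 - Name of issue\n\n- change 1\n- change 2")` returns `True`, while a message without the leading issue reference is rejected.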
