OPENNLP-1782: Add tagging examples to verify French POS model #863
base: main
Thanks @meriam2303 for the PR. Let's see if it passes the tests.
@meriam2303 It seems there is a syntax error:
Could you check, correct it and push a fix to the same branch? Please also add a new static constant to that test:
See the other constants close to the class definition (POLISH, GERMAN, ENGLISH...).
Hi, there are still some indentation errors in the French data.
Fixed checkstyle but currently fails due to missing model (?)
Error: Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.277 s <<< FAILURE! -- in opennlp.tools.postag.POSTaggerMEIT
Error: opennlp.tools.postag.POSTaggerMEIT.testPOSTagger(String, int, String, String[])[8] -- Time elapsed: 0.005 s <<< ERROR!
java.lang.NullPointerException: Cannot invoke "opennlp.tools.tokenize.Tokenizer.tokenize(String)" because the return value of "java.util.Map.get(Object)" is null
at opennlp.tools.postag.POSTaggerMEIT.testPOSTagger(POSTaggerMEIT.java:66)
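The NullPointerException above says that a `Map.get(Object)` call returned null before `tokenize(String)` was invoked on the result, which is what happens when no entry exists for the requested key. A minimal sketch of that failure mode and a more descriptive guard follows. Everything in it is hypothetical: the class `TokenizerLookupSketch`, the language keys, and the whitespace splitter standing in for a tokenizer are illustrations only, not the actual `POSTaggerMEIT` setup.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a test fixture that keeps one tokenizer per language.
class TokenizerLookupSketch {

  // Assumed language keys, for illustration only.
  static final String ENGLISH = "en";
  static final String FRENCH = "fr";

  // A plain function replaces the real tokenizer type to keep the sketch self-contained.
  static final Map<String, Function<String, String[]>> TOKENIZERS = new HashMap<>();

  static {
    // Only English is registered; French is deliberately missing.
    TOKENIZERS.put(ENGLISH, text -> text.split("\\s+"));
  }

  static String[] tokenize(String language, String text) {
    Function<String, String[]> tokenizer = TOKENIZERS.get(language);
    if (tokenizer == null) {
      // Without this check, calling tokenizer.apply(...) would throw a bare NullPointerException.
      throw new IllegalStateException("No tokenizer registered for language: " + language);
    }
    return tokenizer.apply(text);
  }
}
```

With a guard like this, a language that was added to the test parameters but not to the fixture map fails with a message naming the missing key rather than a generic NullPointerException.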
Fixed the test setup. Now we have
which should be AUX according to the provided reference. I think this is an edge case: faisait is the imperfect of “faire”, and here it functions as a semi-auxiliary in faisait souffrir. Tagging it as AUX is acceptable in Universal Dependencies because “faire” + infinitive is considered an auxiliary construction (I am not a native French speaker, though). I tried some other (online) taggers, which will label
French Model draft to test + Arabic & Maghrebi commented out
For documentation related changes: