The build scripts for Jitendex assume that if all of the kanji forms associated with a reading are tagged as rare, outdated, or irregular, then the reading should be treated as its own surface form.
There are some corner cases in which this assumption is incorrect. For example, "にんげんドッグ" is not a surface form.
This error in particular can be avoided programmatically by paying closer attention to the metadata tags. Since 人間ドッグ and にんげんドッグ are both irregular due to the kana usage, I can detect that "にんげんドッグ" should not be treated as a surface form in this case.
This will be fixed in the next version of Jitendex.
I am no longer treating readings as discrete surface forms if the readings are also rare, outdated, or irregular.
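The revised rule can be illustrated with a minimal sketch. This is not the actual Jitendex build code: the `Form` class, the `MARKED_TAGS` set (loosely modelled on JMdict-style tag codes), and the function name `reading_is_surface_form` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical tag codes meaning "rare", "outdated", or "irregular".
# The real build scripts may use different names or a different data model.
MARKED_TAGS = {"rK", "oK", "iK", "rk", "ok", "ik"}


@dataclass
class Form:
    """A kanji form or a reading, with its metadata tags (assumed shape)."""
    text: str
    tags: set = field(default_factory=set)


def is_marked(form: Form) -> bool:
    """True if the form carries any rare/outdated/irregular tag."""
    return bool(form.tags & MARKED_TAGS)


def reading_is_surface_form(reading: Form, kanji_forms: list) -> bool:
    """Decide whether a reading should be treated as its own surface form.

    Original assumption: every associated kanji form is marked.
    Added check: the reading itself must not be marked.
    """
    all_kanji_marked = all(is_marked(k) for k in kanji_forms)
    return all_kanji_marked and not is_marked(reading)


# The case from this issue: both forms are irregular due to kana usage.
kanji = [Form("人間ドッグ", {"ik"})]
reading = Form("にんげんドッグ", {"ik"})
print(reading_is_surface_form(reading, kanji))
```

With the extra check on the reading's own tags, "にんげんドッグ" is rejected, while an unmarked reading whose kanji forms are all marked would still be accepted.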