Backtranslating capitals AA's #1608

Open
jrbowden opened this issue Aug 8, 2024 · 1 comment
Labels
back-translation Anything related to backward translation

Comments

@jrbowden
Contributor

jrbowden commented Aug 8, 2024

In UEB, a fully capitalized word is terminated by a non-alphabetic sign; see Rules of Unified English Braille, 8.4.2.

Thus:
AA's should be written in UEB as ⠠⠠⠁⠁⠄⠎
AA'S should be written in UEB as ⠠⠠⠁⠁⠄⠠⠎
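
To make the expected behaviour concrete, here is a minimal sketch of the rule in Python. It is a toy model, not Liblouis code: the function name `back_translate` is made up for illustration, and it assumes ASCII braille input where `,` stands for dot 6 and `'` for dot 3, so ⠠⠠⠁⠁⠄⠎ is written `,,aa's`.

```python
CAPS_WORD = ",,"   # dots 6, 6: capitals word indicator (ASCII braille)
CAPS_LETTER = ","  # dot 6: capital letter indicator (ASCII braille)


def back_translate(braille: str) -> str:
    """Toy back-translation that handles only capitals indicators."""
    out = []
    caps_word = False
    i = 0
    while i < len(braille):
        if braille.startswith(CAPS_WORD, i):
            # Start of a capitals word: uppercase letters until terminated.
            caps_word = True
            i += 2
        elif (braille[i] == CAPS_LETTER and i + 1 < len(braille)
              and braille[i + 1].isalpha()):
            # Single capital letter indicator followed by a letter.
            out.append(braille[i + 1].upper())
            i += 2
        elif braille[i].isalpha():
            out.append(braille[i].upper() if caps_word else braille[i])
            i += 1
        else:
            # Any non-alphabetic sign terminates the capitals word.
            caps_word = False
            out.append(braille[i])
            i += 1
    return "".join(out)


print(back_translate(",,aa's"))   # AA's
print(back_translate(",,aa',s"))  # AA'S
```

Under these assumptions the apostrophe ends capitals mode, so the trailing s stays lowercase unless it carries its own capital indicator.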

Currently, when backtranslating, Liblouis outputs:
⠠⠠⠁⠁⠄⠎ as AA'S (incorrect)

Test:
- [AA's, ⠠⠠⠁⠁⠄⠎, xfail: backtranslation fails]

There are many examples in en-ueb-g2-dictionary_harness.yaml

How can a table specify that the capitals word is terminated after a non-alphabetic sign?

@bertfrees bertfrees added the back-translation Anything related to backward translation label Aug 19, 2024
@tibbsa
Contributor

tibbsa commented Nov 11, 2024

Not a solution yet, but some more info on this.

This appears to happen specifically for 's endings, because of this line in ueb-en-charsdefs.uti:

endword 's 3-234

Note that the same thing happens with endings 'd, 'll, 'm, 'n, 're, s', 't, and 've, for the same reason.

This isn't an engine problem per se: apart from those table rules, it does what you would expect. For other endings, albeit gibberish in English, it properly recognizes the end of caps. The endword rule, however, does not appear to be recognized as something that should terminate caps mode first, probably because the punctuation is part of the rule itself.

Consider that DOG'd fails the translation test, but DOG'y works fine (there being no endword rule for 'y):

$ tools/lou_trace -b en-ueb-g1.ctb

,,dog'd
DOG'D

,,dog'y
DOG'y
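
The suspected mechanism can be sketched with a toy model in Python. This is only an illustration of the hypothesis above, not how the Liblouis engine is actually implemented: `toy_back_translate` and the `ENDWORD_RULES` dictionary are hypothetical names, and the rule set is a small made-up subset keyed by ASCII braille.

```python
# Toy model only: NOT liblouis code. It mimics the behaviour seen in the
# trace to show how a multi-character endword rule could bypass the
# termination of capitals mode.

# Hypothetical subset of the endword rules, keyed by ASCII braille.
ENDWORD_RULES = {"'d": "'d", "'s": "'s", "'t": "'t"}


def toy_back_translate(word, endword_rules=ENDWORD_RULES):
    """Back-translate one ASCII-braille word (',,' = capitals word indicator)."""
    out = []
    caps_word = False
    i = 0
    while i < len(word):
        if word.startswith(",,", i):
            caps_word = True
            i += 2
            continue
        rest = word[i:]
        if rest in endword_rules:
            # The whole rule is emitted as one unit, so the apostrophe inside
            # it never gets a chance to switch capitals mode off.
            text = endword_rules[rest]
            out.append(text.upper() if caps_word else text)
            break
        ch = word[i]
        if ch.isalpha():
            out.append(ch.upper() if caps_word else ch)
        else:
            caps_word = False  # a non-alphabetic sign ends the capitals word
            out.append(ch)
        i += 1
    return "".join(out)


print(toy_back_translate(",,dog'd"))                    # DOG'D
print(toy_back_translate(",,dog'y"))                    # DOG'y
print(toy_back_translate(",,dog'd", endword_rules={}))  # DOG'd
```

In this model the third call, with no endword rules, gives the expected result, which matches the observation that the problem only shows up for endings that have such a rule.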
