Handling text that implements navigation of character sequences properly #4

Closed
ultrasound1372 opened this issue Jul 20, 2021 · 9 comments

Comments

@ultrasound1372

Some edit fields, like Notepad, do not handle combining characters as Unicode expects: one can navigate through each character in the sequence individually. This is especially obvious for emoji. This add-on assumes that all edit fields behave that way. So in a text edit or display that properly navigates character sequences that are canonically one character, one receives N/A when attempting to get info.
Some examples of such characters:

  • o͡l̪ (combining markings that cannot be precomposed)
  • ö (a character with a combining modifier that has a precomposed form [ö])
  • 👨‍🎤 (an emoji ZWJ sequence)
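To make the difference between the first two cases concrete, here is a minimal Python sketch using only the standard `unicodedata` module. It lists the code points of a sequence and checks whether NFC normalization can fold it into a single precomposed character. The helper name `code_points` is purely illustrative and is not part of the add-on.

```python
import unicodedata

def code_points(text):
    """Return (U+XXXX, Unicode name) pairs for every code point in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]

# "o" + COMBINING DIAERESIS: two code points that have a precomposed form.
decomposed = "o\u0308"
composed = unicodedata.normalize("NFC", decomposed)

# "o" + COMBINING DOUBLE INVERTED BREVE, "l" + COMBINING BRIDGE BELOW:
# neither pair has a precomposed form, so NFC leaves all four code points.
no_precomposed = "o\u0361l\u032A"
still_decomposed = unicodedata.normalize("NFC", no_precomposed)

print(code_points(decomposed))
print(code_points(composed))
print(len(still_decomposed))
```

Normalizing before lookup would therefore only help in the second case; the first and third still arrive as multi-code-point sequences.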

Please try and do something about this if you can. I'd think of comma delimiting such instances in the table, so the second example could be "LATIN SMALL LETTER O, COMBINING DIAERESIS". If this is hard to do or if you would prefer, you could also create multiple tables; one for each character.
As for the emoji I suppose presenting the query to the CLDR would return a valid name, so if you receive a sequence I guess you should try posing the entire sequence first when retrieving the CLDR name.
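A rough sketch of this lookup order, in Python: try the whole sequence against the CLDR names first, and fall back to comma-delimited Unicode names otherwise. The `CLDR_NAMES` dictionary and the `describe` function are hypothetical stand-ins; how the add-on actually loads and queries CLDR data is not shown in this thread.

```python
import unicodedata

# Hypothetical stand-in for CLDR annotation data; a real implementation
# would load this from the CLDR files rather than hard-code it.
CLDR_NAMES = {
    "\U0001F468\u200D\U0001F3A4": "man singer",
}

def describe(sequence, cldr_names=CLDR_NAMES):
    """Describe a character sequence.

    The whole sequence is looked up in the CLDR names first; if it is not
    found, the Unicode names of its code points are joined with commas.
    """
    if sequence in cldr_names:
        return cldr_names[sequence]
    return ", ".join(unicodedata.name(ch, f"U+{ord(ch):04X}") for ch in sequence)

print(describe("o\u0308"))
print(describe("\U0001F468\u200D\U0001F3A4"))
```

With an empty CLDR table the emoji falls back to the names of its three code points, which is roughly what per-character navigation exposes today.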
I have encountered such text within the Windows 10 app Unigram, both within the edit field and when using the review cursor on messages, so it is also a behavior of NVDA. As I said, the Unicode annex actually defines that to be the proper way to navigate such sequences.
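The annex in question is presumably UAX #29 (Unicode Text Segmentation), which defines grapheme clusters. The Python sketch below is a deliberately simplified approximation of that grouping, not the full rule set: it attaches combining marks to the preceding character and lets a ZERO WIDTH JOINER glue the next character onto the current cluster. The function name `simple_clusters` is an assumption made for illustration.

```python
import unicodedata

ZWJ = "\u200D"

def simple_clusters(text):
    """Group text into approximate user-perceived characters.

    Simplification: a cluster is a base character plus any following
    combining marks (categories Mn, Mc, Me), and a ZERO WIDTH JOINER glues
    the next character onto the current cluster. The full UAX #29 rules
    cover many more cases (Hangul, regional indicators, and so on).
    """
    clusters = []
    for ch in text:
        is_mark = unicodedata.category(ch) in ("Mn", "Mc", "Me")
        joins_previous = bool(clusters) and (
            is_mark or ch == ZWJ or clusters[-1].endswith(ZWJ)
        )
        if joins_previous:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

print(simple_clusters("o\u0361l\u032A"))
print(simple_clusters("a\U0001F468\u200D\U0001F3A4b"))
```

A control that navigates by cluster hands the add-on one of these multi-code-point strings at a time, which is the situation that produced N/A.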

@CyrilleB79
Owner

Having N/As is certainly an issue, and I will try to fix that first.

Afterwards, I will have to look into the issue in NVDA itself.
Have you opened an issue against NVDA regarding this topic? If yes, could you give the reference? If not, could you do it? Thanks.

@CyrilleB79
Owner

Hi @ultrasound1372

I guess the majority of the combined character issues have been addressed in c631f9f.

The only remaining problem is the emoji ZWJ sequence. However, it seems to be 3 characters (also visually). I'd prefer the issue fixed in NVDA if possible.
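The "3 characters" can be checked with a short Python snippet: the sequence is three code points (man, zero width joiner, microphone). The UTF-16 count is included only as an aside, on the assumption that some Windows controls step by UTF-16 code unit; nothing in this thread confirms how any particular control navigates.

```python
emoji = "\U0001F468\u200D\U0001F3A4"  # man, zero width joiner, microphone

# Number of Unicode code points in the sequence.
code_point_count = len(emoji)

# Number of UTF-16 code units: both emoji need a surrogate pair, the ZWJ one unit.
utf16_unit_count = len(emoji.encode("utf-16-le")) // 2

print([f"U+{ord(ch):04X}" for ch in emoji])
print(code_point_count, utf16_unit_count)
```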

Would you be able to test the add-on version in the master branch and confirm? I have not yet done a release.

@ultrasound1372
Author

I'm trying to build it with scons, but you require the use of scripts.generateDataFR in your build script, and I can't seem to get this to run. First I get module not found, and upon fudging an init file in scripts to create a package so the import works, I get module not found on globalPlugin. How does your build workflow work and can you make it easier to use?

@CyrilleB79
Owner

Thanks for trying.

It was working (and still works) on my setup by just launching the scons command. But there was something strange about the import, and I do not fully understand why it worked that way. Anyway, I have fixed the package issue in the master branch by adding the __init__.py and changing the imports in the sconstruct file.

Could you try again? Is there still an error with globalPlugin? If yes, could you copy here the traceback so that I can understand the cause and fix it? Thanks.
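For readers hitting a similar "module not found" error: the exact cause here is not established in this thread, but a common reason for a build-time import working on one machine and failing on another is that the directory containing the package is not on `sys.path`. The Python sketch below reproduces the general mechanism with a throwaway layout. Every name in it (`demo_scripts`, `generate_demo`, `import_from`) is hypothetical and not taken from the add-on's repository.

```python
import importlib
import sys
import tempfile
from pathlib import Path

def import_from(root, module_name):
    """Import module_name after making sure root is on sys.path."""
    root = str(root)
    if root not in sys.path:
        sys.path.insert(0, root)
    # Make sure the import system notices files created after startup.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)

# Build a throwaway layout that mimics a repo with a scripts-like package.
with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / "demo_scripts"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")
    (package_dir / "generate_demo.py").write_text("VALUE = 42\n")
    module = import_from(tmp, "demo_scripts.generate_demo")
    print(module.VALUE)
```

If the generator script in turn imports a module that lives in another directory, that directory needs the same treatment, which may be what the second error about globalPlugin was pointing at.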

@ultrasound1372
Author

Yes, the handling of multiple combining characters (such as a̋̇) also functions correctly. The only thing I'd say should be changed is in the second table when you list info from the symbols dictionary your sub-columns are not themselves marked as headers. So when navigating around I hear character 0 read multiple times but not replacement, preserve, and level. Just value. I think you should get rid of the value column altogether and just make it look something like this. Although keep your row headers as row headers, GFM just doesn't let me make row headers.

| Attribute | Replacement | Level | Preserve |
| --- | --- | --- | --- |
| Symbol description | space | character | never |
| Symbol description in user file (en_US) | [Not defined] | [Not defined] | [Not defined] |
| Symbol description in English file | space | character | [Not defined] |
| Symbol description in English CLDR file | [Not defined] | [Not defined] | [Not defined] |
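Assuming the add-on renders this information as HTML (an assumption, since the markup is not shown in this thread), the layout above with real column and row headers could be sketched in Python as follows. The function `render_table` and its input shape are hypothetical.

```python
from html import escape

COLUMNS = ("Replacement", "Level", "Preserve")

def render_table(rows):
    """Render rows as an HTML table with column and row headers.

    rows maps a row label to a dict of column name -> value; missing
    values are shown as "[Not defined]".
    """
    head = "".join(
        f'<th scope="col">{escape(c)}</th>' for c in ("Attribute",) + COLUMNS
    )
    lines = ["<table>", f"<tr>{head}</tr>"]
    for label, values in rows.items():
        cells = "".join(
            f"<td>{escape(values.get(c, '[Not defined]'))}</td>" for c in COLUMNS
        )
        # The first cell of each row is a row header, not a data cell.
        lines.append(f'<tr><th scope="row">{escape(label)}</th>{cells}</tr>')
    lines.append("</table>")
    return "\n".join(lines)

table_html = render_table({
    "Symbol description": {
        "Replacement": "space", "Level": "character", "Preserve": "never",
    },
    "Symbol description in user file (en_US)": {},
})
print(table_html)
```

Marking the cells with `th` and a `scope` attribute is what lets a screen reader announce "Replacement", "Level" and "Preserve" during table navigation instead of a generic "Value".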

@CyrilleB79
Owner

> Yes, the handling of multiple combining characters (such as a̋̇) also functions correctly. The only thing I'd say should be changed is in the second table when you list info from the symbols dictionary your sub-columns are not themselves marked as headers. So when navigating around I hear character 0 read multiple times but not replacement, preserve, and level. Just value. I think you should get rid of the value column altogether and just make it look something like this. Although keep your row headers as row headers, GFM just doesn't let me make row headers.
>
> | Attribute | Replacement | Level | Preserve |
> | --- | --- | --- | --- |
> | Symbol description | space | character | never |
> | Symbol description in user file (en_US) | [Not defined] | [Not defined] | [Not defined] |
> | Symbol description in English file | space | character | [Not defined] |
> | Symbol description in English CLDR file | [Not defined] | [Not defined] | [Not defined] |

Good catch for the headers reading.

I am not sure I understand your suggestion, though. For a multi-compound character such as a̋̇, do you think that the paragraph "Symbol description in NVDA" should contain 3 tables (one per character) instead of one big table? This makes sense, but again, I do not know if that was your suggestion.

As an additional question: did you succeed in running the scons command?

@ultrasound1372
Author

ultrasound1372 commented Jan 30, 2023

No, my suggestion was just to fix the reading of the headers, and possibly get rid of the Value column span and simply duplicate the replacement/level/preserve columns if necessary. I don't think they should be split into three separate tables; I like the way this is done as one big table. You might make it so that if nothing is defined in a source, such as the user symbols dictionary, you just don't include that row, but that would be all I'd suggest. I'm unsure if having a row for the description of the entire combined character would make sense; it would probably only come into play for emojis.
And yes, I was able to run scons just fine.
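The row-omission idea could look like the following Python sketch, where a source that defines nothing is dropped before the table is built. The data shape (a dict of row label to column values, with `None` meaning "not defined") and the name `defined_rows` are assumptions for illustration only.

```python
def defined_rows(rows):
    """Keep only the sources that define at least one value."""
    return {
        label: values
        for label, values in rows.items()
        if any(value is not None for value in values.values())
    }

sources = {
    "Symbol description in user file (en_US)": {
        "Replacement": None, "Level": None, "Preserve": None,
    },
    "Symbol description in English file": {
        "Replacement": "space", "Level": "character", "Preserve": None,
    },
}
kept = defined_rows(sources)
print(list(kept))
```

The trade-off is that a dropped row no longer tells the reader that the source was consulted at all.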

@CyrilleB79
Owner

Hi @ultrasound1372 ,

You can check the latest work in 46f5369 (latest master). I have simplified the symbol description table, and I have removed the double header row to reduce verbosity.

For the emoji ZWJ sequence such as 👨‍🎤:

  • It is seen by NVDA as one character in some situations, e.g. GitHub's new comment field in the Chrome browser, and as many characters in other situations (e.g. Notepad).
  • Have you opened an issue in NVDA for this character being seen as many? Could you link it here? If not yet, could you open it?

You also write:

> Might make it so that if nothing is defined in that source, such as the user symbols dictionary, you just don't include that row, but that would be all I'd think.

I'd prefer to keep these rows present so that the user knows where NVDA looks to get the symbol/character description. Also, it allows me to check that the add-on looks at the correct locations and that I have not forgotten any.
Anyway, this request is going a bit off-topic with respect to this issue. You may comment in #5 if suitable, or open a new issue if you want to discuss this topic further.

Also you write:

> I'm unsure if having a row for the description of the entire combined character would make sense, it would probably only come into play for emojis.

This should be discussed separately in a new issue when the multi-char problem of NVDA is solved.

@ultrasound1372
Author

The NVDA multi-character thing is likely an implementation detail of the different accessibility frameworks. But yes, that commit looks good, and in regards to this issue I think we're done.
