Propose new function, width(control_codes='ignore') #79
Comments
Just out of curiosity, are there any real-world examples where this enhancement would be beneficial?
I have struck through the suggestion about terminal sequences and will save it for another issue. As for the need for a "width" function, just about every downstream library has some issue with the POSIX wcwidth and wcswidth functions. This is mainly because both functions may return -1, and the return value must be checked, but it often is not. I think all downstream users wish for us to have a single function that makes a "best effort": if a zero width joined emoji sequence also contains a newline or other control character, it is best to just return our best estimate of the measurement rather than -1 as wcswidth() does.
As a workaround, I have suggested to use wcwidth() directly on each individual character. This provides the same function as wcswidth but provides a "best guess"; however, this method cannot handle coming changes to wcswidth to handle zero width joiner (ZWJ) sequences.
Although I am open to changing wcswidth() to never return -1 and make a "best effort", that would deviate from the original 2007 implementation and the POSIX specification. This is why I suggest an entirely new function name, and strongly suggest that the docstrings of wcswidth and wcwidth recommend it as the best alternative.
Thank you for the clarification!
I have created it in a development branch, but I will make a bugfix release first and then open a PR for this. Lines 262 to 277 in 1f1443b
I have revised this description and the related issue #92, and I do think they are closely related. control characters like
Problem
As for the need for a "width" function, just about every downstream library has some issue with the POSIX wcwidth() and wcswidth() functions, either in C or in this Python library.
This is mainly because both functions may return -1, and the return value must be checked, but it often is not.
Although using wcswidth() on a string is the most popular use case, it can return -1 by POSIX definition, and Markus Kuhn's 2007 implementation returns -1 for control characters.
The return value is often left unchecked where it is used with sum(), slice() or screen positioning functions, with surprising results.
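To make the failure mode concrete, here is a minimal sketch of how an unchecked -1 leaks into arithmetic. The names toy_wcwidth() and toy_wcswidth() are hypothetical stand-ins built on the standard unicodedata module so that the example runs without this library installed; they only approximate the real width tables.

```python
import unicodedata


def toy_wcwidth(ch: str) -> int:
    """Simplified stand-in for wcwidth(); NOT the library's real tables."""
    if ch == "\x00":
        return 0
    category = unicodedata.category(ch)
    if category == "Cc":                  # C0/C1 control characters
        return -1
    if category in ("Mn", "Me", "Cf"):    # combining marks, format chars such as ZWJ
        return 0
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1


def toy_wcswidth(text: str) -> int:
    """Stand-in for wcswidth(): -1 as soon as any character measures -1."""
    total = 0
    for ch in text:
        w = toy_wcwidth(ch)
        if w < 0:
            return -1
        total += w
    return total


line = "ok\n"

# Unchecked use with sum(): the newline silently subtracts one column.
unchecked_sum = sum(toy_wcwidth(ch) for ch in line)

# Unchecked use for padding to 10 columns: -1 yields one space too many.
padding = " " * (10 - toy_wcswidth(line))

print(unchecked_sum, toy_wcswidth(line), len(padding))
```

Neither call raises an error, which is why this kind of bug tends to go unnoticed until the layout is visibly off.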
Solution
Provide a new function, width(), that always returns a "best effort" measurement. It may ignore or measure control codes instead. If "catching unexpected control codes" is a desired feature, we can continue to provide it through an optional keyword argument that raises an exception rather than returning -1.
Maybe a new keyword argument, control_codes, with default value 'ignore', in similar spirit to 'errors' for https://docs.python.org/3/library/stdtypes.html#bytes.decode.
Workaround
As a workaround, I have suggested to use wcwidth() directly on each individual character and clip the possible -1 return value to 0, example: https://github.com/jquast/blessed/blob/a34c6b1869b4dd467c6d1ab6895872bb72db7e0f/blessed/sequences.py#L364
This provides the same function as wcswidth but provides a "best guess"; however, this method cannot handle coming changes to wcswidth to handle zero width joiner (ZWJ) sequences.
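The workaround and the proposed function can be sketched side by side. This is only an illustration under stated assumptions: toy_wcwidth() is a hypothetical stand-in for wcwidth() built on unicodedata, the 'strict' mode name is borrowed from bytes.decode and is not part of the proposal, and raising ValueError is one possible choice of exception.

```python
import unicodedata


def toy_wcwidth(ch: str) -> int:
    """Simplified stand-in for wcwidth(); NOT the library's real tables."""
    if ch == "\x00":
        return 0
    category = unicodedata.category(ch)
    if category == "Cc":                  # C0/C1 control characters
        return -1
    if category in ("Mn", "Me", "Cf"):    # combining marks, format chars such as ZWJ
        return 0
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1


def clipped_width(text: str) -> int:
    """Workaround: measure each character and clip -1 to 0."""
    return sum(max(0, toy_wcwidth(ch)) for ch in text)


def width(text: str, control_codes: str = "ignore") -> int:
    """Hypothetical best-effort width that never returns -1.

    control_codes='ignore' skips control characters;
    control_codes='strict' (an assumed mode name) raises ValueError.
    """
    if control_codes not in ("ignore", "strict"):
        raise ValueError(f"unknown control_codes value: {control_codes!r}")
    total = 0
    for ch in text:
        w = toy_wcwidth(ch)
        if w < 0:
            if control_codes == "strict":
                raise ValueError(f"unexpected control character {ch!r}")
            continue  # 'ignore': the control code contributes no width
        total += w
    return total


print(clipped_width("ok\n"))  # 2
print(width("ok\n"))          # 2
```

Both functions measure one character at a time, so like the workaround this sketch does nothing special for ZWJ sequences; that handling would have to live inside a real width() implementation.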