fix(js/string_util): u2b(): convert non-Big5 chars to A1BC (□) instead of FFFD (non-Big5) #126
Related issues: ptt/pttbbs#95
FFFD is a Unicode character (U+FFFD, the replacement character), not a valid Big5-UAO character. In addition, the byte sequence FF FD means IAC DO <lacking option id> in the Telnet protocol unless the FF is escaped as IAC IAC (0xFF 0xFF).

To fix the issue, non-Big5-UAO chars are now converted into A1BC (Big5 '□'). Also, UTF-16 high surrogates are now ignored to make the char count consistent.
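The behavior described above can be illustrated with a minimal sketch. This is not the actual u2b() implementation: `u2bSketch` and `SAMPLE_U2B_TABLE` are hypothetical names, the table holds a single entry purely for illustration, and the output is assumed to be a binary string with one character per Big5 byte.

```javascript
// Hypothetical stand-in for the real Unicode-to-Big5-UAO table (assumption:
// the real table is far larger; one entry is enough for this illustration).
const SAMPLE_U2B_TABLE = new Map([
  ['中', '\xA4\xA4'], // Big5 A4A4
]);

// Fallback for characters with no Big5-UAO mapping: A1BC (Big5 '□').
const FALLBACK = '\xA1\xBC';

// Sketch of the conversion rule: ASCII passes through, mapped characters use
// the table, unmapped characters become A1BC, and UTF-16 high surrogates are
// skipped so that a surrogate pair produces exactly one '□'.
function u2bSketch(text) {
  let out = '';
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x80) {
      out += text[i];
      continue;
    }
    if (code >= 0xD800 && code <= 0xDBFF) {
      continue; // high surrogate: ignored, the low surrogate yields the '□'
    }
    out += SAMPLE_U2B_TABLE.get(text[i]) ?? FALLBACK;
  }
  return out;
}
```

With this rule, a character outside the Basic Multilingual Plane (stored as a surrogate pair) and a BMP character with no mapping both yield the same two bytes, and the output never contains the FF FD sequence that Telnet would read as IAC DO.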
A1BC (Big5 '□') has been chosen for the following reasons:
This makes all the following cases convert to a single A1BC (□).
u2b()
※ The choice of A1BC (□) is subject to change.
Sample Text | previous u2b() | new u2b()