Support explicit encodings other than UTF-8 #4

Open
ocram opened this issue Mar 20, 2019 · 5 comments

@ocram
Contributor

ocram commented Mar 20, 2019

  • ISO-8859-1
    • Afrikaans
    • Albanian
    • Basque
    • Breton
    • Catalan
    • Cornish
    • Danish
    • Dutch
    • English
    • Estonian
    • Faroese
    • Finnish
    • French
    • Galician
    • German
    • Greenlandic
    • Icelandic
    • Indonesian
    • Irish
    • Italian
    • Malay
    • Manx
    • Norwegian
    • Occitan
    • Portuguese
    • Spanish
    • Swedish
    • Tagalog
    • Uzbek
    • Walloon
  • ISO-8859-2
    • Bosnian
    • Croatian
    • Czech
    • Hungarian
    • Polish
    • Romanian
    • Serbian
    • Slovak
    • Slovenian
  • ISO-8859-3
    • Maltese
  • ISO-8859-5
    • Macedonian
    • Serbian
  • ISO-8859-6
    • Arabic
  • ISO-8859-7
    • Greek
  • ISO-8859-8
    • Hebrew
  • ISO-8859-9
    • Turkish
  • ISO-8859-13
    • Latvian
    • Lithuanian
    • Maori
  • ISO-8859-14
    • Welsh
  • ISO-8859-15
    • Basque
    • Catalan
    • Dutch
    • English
    • Finnish
    • French
    • Galician
    • German
    • Irish
    • Italian
    • Portuguese
    • Spanish
    • Swedish
    • Walloon
  • KOI8-R
    • Russian
  • KOI8-U
    • Ukrainian
  • KOI8-T
    • Tajik
  • CP1251
    • Bulgarian
    • Belarusian
  • GB2312 / GBK / GB18030
    • Chinese (Simplified)
  • BIG5 / BIG5-HKSCS
    • Chinese (Traditional)
  • EUC-JP
    • Japanese
  • EUC-KR
    • Korean
  • TIS-620
    • Thai
  • GEORGIAN-PS
    • Georgian

Source: https://www.gnu.org/software/gettext/manual/html_node/Header-Entry.html

@LS05

LS05 commented Jun 11, 2019

Can I work on this issue?

@ocram
Contributor Author

ocram commented Jun 12, 2019

Do you know what needs to be done?

In general, we like to talk and discuss concepts and details before building the implementation, and also during the process, to avoid implementations that go in the wrong direction or miss critical details.

@LS05

LS05 commented Jun 17, 2019

Hey @ocram, I will first work out what needs to be done and then come up with a plan in the coming days!

@LS05

LS05 commented Jun 25, 2019

I searched the repository for "UTF8" and found that the shell script i18n.sh and the class I18n both have a hardcoded dependency on UTF-8.

Am I missing something else? Am I on the right track?

@ocram
Contributor Author

ocram commented Jun 26, 2019

You’re right: both of these files need some changes and generalization.

The shell script would need to accept an optional encoding supplied as a parameter, in a format that works for the Gettext utilities.
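As a rough illustration of the optional parameter, here is a minimal sketch in POSIX shell. The function name `resolve_encoding` and the validation rule are assumptions made for this example; nothing here is taken from the existing i18n.sh. It falls back to UTF-8 when no encoding is supplied and rejects values that cannot be a charset name.

```shell
#!/bin/sh
# Hypothetical helper, not part of i18n.sh as it exists today.
# Prints the encoding to use: the first argument if given, otherwise UTF-8.
resolve_encoding() {
    # Fall back to UTF-8 when the argument is missing or empty.
    encoding="${1:-UTF-8}"

    # Reject anything containing characters outside letters, digits,
    # '.', '_', ':' and '-' (assumed here to be enough for charset names).
    case "$encoding" in
        *[!A-Za-z0-9._:-]*)
            echo "Invalid encoding name: $encoding" >&2
            return 1
            ;;
    esac

    printf '%s\n' "$encoding"
}
```

The result could then be handed to a Gettext utility that takes an encoding name, for example `msgconv --to-code="$(resolve_encoding "$2")"`; which positional argument carries the encoding is likewise an assumption.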

In the PHP code, we don’t want to try all those possible encodings for every locale that is specified, so it should definitely first detect which encodings are relevant for each specific locale in question (see the list above). The detected set of relevant encodings should be tried only after UTF-8.
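The lookup order described above (UTF-8 first, then only the encodings relevant to the locale) might look roughly like the sketch below. It is written in shell to stay consistent with i18n.sh; the PHP version would follow the same order. The function name `candidate_encodings` is hypothetical, the table is only a small excerpt of the list at the top of this issue, and the two-letter language codes are my assumed mapping from the language names in that list.

```shell
#!/bin/sh
# Hypothetical lookup, not existing project code.
# Prints the encodings to try for a locale, always starting with UTF-8.
candidate_encodings() {
    # Keep only the language part: "sr_RS.UTF-8@latin" -> "sr".
    language="${1%%[_.@]*}"

    # Partial excerpt of the list above; language codes are assumed.
    case "$language" in
        de|fr|es) legacy="ISO-8859-1 ISO-8859-15" ;;
        pl|cs|hu) legacy="ISO-8859-2" ;;
        sr)       legacy="ISO-8859-2 ISO-8859-5" ;;
        ru)       legacy="KOI8-R" ;;
        ja)       legacy="EUC-JP" ;;
        *)        legacy="" ;;
    esac

    # UTF-8 first, followed by the locale-specific encodings, if any.
    printf '%s\n' "UTF-8${legacy:+ $legacy}"
}
```

For a locale with no entry in the table, only UTF-8 is printed, so unknown locales behave exactly as they do today.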

Finally, whatever encoding is used must be compatible with PHP’s current internal encoding (and the encoding of any output), whichever those may be.

By the way, I’m not sure how urgent all this is, because most applications should be using UTF-8 now, especially newer ones.
