|
| 1 | +--- |
| 2 | +title: Basemap Localization |
| 3 | +outline: deep |
| 4 | +--- |
| 5 | +<script setup> |
| 6 | + import MaplibreMap from '../components/MaplibreMap.vue' |
| 7 | +</script> |
| 8 | + |
| 9 | +# Basemap Localization |
| 10 | + |
| 11 | +Protomaps has several localization options for names used in text labels. |
| 12 | + |
| 13 | +<MaplibreMap/> |
| 14 | + |
| 15 | +## Default `name` value |
| 16 | + |
| 17 | +Protomaps follows OpenStreetMap's convention, in which a feature's primary `name` value is the most common name in the local language(s). |
| 18 | + |
| 19 | +In practice, this is most often a single name value like: |
| 20 | + |
| 21 | +- `London` the locality is represented as a simple key, value pair: `name` = `London` |
| 22 | + |
| 23 | +However, many places have more than one common local language, and Protomaps passes through OpenStreetMap's convention of concatenating multiple names with a `/` delimiter into a single name value, like: |
| 24 | + |
| 25 | +- `Switzerland` the country is represented as a complex key, value pair: `name` = `Schweiz/Suisse/Svizzera/Svizra` |
| 26 | + |
| 27 | +For transnational places involving many countries and languages, like `sea` features, the default name value can get quite long and unwieldy! |
| 28 | + |
| 29 | +We therefore recommend preferring localized names (see below) for map labels, and falling back to the default name only when a localized name isn't available. |
| 30 | + |
| 31 | +## Localized `name:*` values |
| 32 | + |
| 33 | +Protomaps structures localized names using the same `name:{language_code}` formatting as OpenStreetMap. |
| 34 | + |
| 35 | +More than 100 countries recognize 2 or more official languages – and some like Bolivia, India, and South Africa recognize more than 10 official languages each! |
| 36 | + |
| 37 | +A single official language is used in most remaining countries. There are a few countries where no official language has been designated – like in the United States. |
| 38 | + |
| 39 | +Going back to our London example, English is the predominant (unofficial) language in the United Kingdom: |
| 40 | + |
| 41 | +- `name:en` = `London` |
| 42 | + |
| 43 | +Extending our London example, many other languages include [exonym and endonym](https://simple.wikipedia.org/wiki/Exonym_and_endonym#:~:text=An%20exonym%20is%20a%20name,place%20and%20language%20call%20themselves.) values in both Latin script and non-Latin scripts: |
| 44 | + |
| 45 | +- `name:ar` = `لندن` |
| 46 | +- `name:de` = `London` |
| 47 | +- `name:es` = `Londres` |
| 48 | +- `name:fr` = `Londres` |
| 49 | +- `name:it` = `Londra` |
| 50 | +- `name:pt` = `Londres` |
| 51 | +- `name:zh-Hans` = `伦敦` |
| 52 | +- `name:zh-Hant` = `倫敦` |
| 53 | +- _... many other localized values..._ |
| 54 | + |
| 55 | +Going back to our Switzerland example, each of the local (often official) languages would have a specific language name value (in this case German `de`, French `fr`, Italian `it`, and Romansh `rm`), like: |
| 56 | + |
| 57 | +- `name:de` = `Schweiz` |
| 58 | +- `name:fr` = `Suisse` |
| 59 | +- `name:it` = `Svizzera` |
| 60 | +- `name:rm` = `Svizra` |
| 61 | +- _... many other localized values..._ |
| 62 | + |
| 63 | +Extending our Switzerland example with exonym and endonym from other languages: |
| 64 | + |
| 65 | +- `name:ar` = `سويسرا` |
| 66 | +- `name:en` = `Switzerland` |
| 67 | +- `name:es` = `Suiza` |
| 68 | +- `name:pt` = `Suíça` |
| 69 | +- `name:zh` = `瑞士` |
| 70 | +- `name:zh-Hans` = `瑞士` |
| 71 | +- `name:zh-Hant` = `瑞士` |
| 72 | +- _... many other localized values..._ |
| 73 | + |
| 74 | +_NOTE: The Chinese (`zh`) examples above demonstrate how a single language can have multiple writing systems, in this case both simplified Chinese (`zh-Hans`) used in mainland China and traditional Chinese (`zh-Hant`) used in Taiwan. The value stored in `zh` could be either of those._ |
| 75 | + |
| 76 | +## Script of default `name` value |
| 77 | + |
| 78 | +The default (or primary) `name` does not describe the writing system ("script") or character set (alphabetic, stroke-based, or otherwise) used to render its value. This complicates combining it with localized `name:*` values: a map cannot easily prefer falling back to another language in the same script family before falling back to characters in a different writing system that the reader may not be able to make sense of. |
| 79 | + |
| 80 | +To help solve this, Protomaps characterizes the script used in the default `name` value by adding a `pmap:script` tag. |
| 81 | + |
| 82 | +Values in `pmap:script` follow the [ISO 15924](https://unicode.org/iso15924/iso15924-codes.html) standard codes for the representation of names of scripts and are summarized in the table below. |
| 83 | + |
| 84 | +_NOTE: Some languages can be written in more than one script, e.g., Malay can be written in Latin, Arabic, and Thai._ |
| 85 | + |
| 86 | +## Common languages, their codes, and scripts |
| 87 | + |
| 88 | +This table summarizes 26 common languages, their ISO codes, and writing system scripts. |
| 89 | + |
| 90 | +| Language | Native name | `name:*` property | [ISO 639-2 code](https://en.wikipedia.org/wiki/List_of_ISO_639-2_codes) | [ISO 639-1 code](https://en.wikipedia.org/wiki/ISO_639-1) | [ISO 15924 script(s)](https://unicode.org/iso15924/iso15924-codes.html) | |
| 91 | +|--------|-----------------|-----------|-----|----|----| |
| 92 | +| Arabic | اَلْعَرَبِيَّةُ | `name:ar` | ara | ar | `Arabic` | |
| 93 | +| Bengali | বাংলা | `name:bn` | ben | bn | `Bengali` | |
| 94 | +| German | Deutsch | `name:de` | deu | de | `Latin` | |
| 95 | +| English | English | `name:en` | eng | en | `Latin` | |
| 96 | +| Spanish | español | `name:es` | spa | es | `Latin` | |
| 97 | +| Farsi | فارسی | `name:fa` | fas | fa | `Arabic` | |
| 98 | +| French | français | `name:fr` | fra | fr | `Latin` | |
| 99 | +| Greek | Νέα Ελληνικά | `name:el` | ell | el | `Greek` | |
| 100 | +| Hebrew | עברית | `name:he` | heb | he | `Hebrew` | |
| 101 | +| Hindi | हिन्दी | `name:hi` | hin | hi | `Devanagari` | |
| 102 | +| Hungarian | magyar | `name:hu` | hun | hu | `Latin` | |
| 103 | +| Indonesian | bahasa Indonesia | `name:id` | ind | id | `Latin` | |
| 104 | +| Italian | italiano | `name:it` | ita | it | `Latin` | |
| 105 | +| Japanese | 日本語 | `name:ja` | jpn | ja | `Han`, `Katakana`, `Hiragana` | |
| 106 | +| Korean | 한국어 | `name:ko` | kor | ko | `Hangul` | |
| 107 | +| Dutch | Nederlands | `name:nl` | nld | nl | `Latin` | |
| 108 | +| Polish | Język polski | `name:pl` | pol | pl | `Latin` | |
| 109 | +| Portuguese | português | `name:pt` | por | pt | `Latin` | |
| 110 | +| Russian | русский язык | `name:ru` | rus | ru | `Cyrillic` | |
| 111 | +| Swedish | svenska | `name:sv` | swe | sv | `Latin` | |
| 112 | +| Turkish | Türkçe | `name:tr` | tur | tr | `Latin` | |
| 113 | +| Ukrainian | Українська мова | `name:uk` | ukr | uk | `Cyrillic`, `Latin` | |
| 114 | +| Urdu | اُردُو | `name:ur` | urd | ur | `Arabic` | |
| 115 | +| Vietnamese | Tiếng Việt | `name:vi` | vie | vi | `Latin` | |
| 116 | +| Chinese simplified | 中文 汉语 | `name:zh-Hans` | zho | zh | `Han` | |
| 117 | +| Chinese traditional | 中文 漢語 | `name:zh-Hant` | zho | zh | `Han` | |
| 118 | + |
| 119 | +A full 2-character language code decoder ring is |
| 120 | +[available](https://en.wikipedia.org/wiki/List_of_ISO_639-2_codes). |
| 121 | + |
| 122 | +_NOTE: Some languages require codes with 3 or more characters._ |
| 123 | + |
| 124 | +## Common languages by country |
| 125 | + |
| 126 | +The following table lists the common language for a selection of countries, along with the matching localized `name:*` value and recommended fallback pairings: |
| 127 | + |
| 128 | +| Country | Native name | Common language | Localized `name:*` value | Recommended `name:*` pairing | |
| 129 | +|---------|-------------|------------------|--------------------------|--------------------------| |
| 130 | +| Argentina | Argentina | Spanish | `name:es` | `name:it`, `name:fr`, `name:en`, `name:de` | |
| 131 | +| Bangladesh | বাংলাদেশ | Bengali | `name:bn` | _n/a_ | |
| 132 | +| Brazil | Brasil | Portuguese | `name:pt` | `name:es`, `name:it`, `name:fr`, `name:en`, `name:de` | |
| 133 | +| China | 中国 | Chinese | `name:zh-Hans` | `name:zh`, `name:zh-Hant` | |
| 134 | +| Egypt | مصر | Arabic | `name:ar` | `name:fr`, `name:en`, `name:de` | |
| 135 | +| France | France | French | `name:fr` | `name:es`, `name:it`, `name:pt`, `name:en`, `name:de` | |
| 136 | +| Germany | Deutschland | German | `name:de` | `name:en`, `name:fr`, `name:es`, `name:it` | |
| 137 | +| Greece | Ελλάς | Greek | `name:el` | _n/a_ | |
| 138 | +| India | भारत | Hindi and many others | `name:hi`, +++ | `name:en` | |
| 139 | +| Indonesia | Indonesia | Indonesian | `name:id` | | |
| 140 | +| Israel | ישראל | Hebrew | `name:he` | _n/a_ | |
| 141 | +| Italy | Italia | Italian | `name:it` | `name:es`, `name:fr`, `name:pt`, `name:en`, `name:de` | |
| 142 | +| Japan | 日本 | Japanese | `name:ja` | _n/a_ | |
| 143 | +| Morocco | المغرب | Arabic | `name:ar` | `name:fr`, `name:en`, `name:de` | |
| 144 | +| Nepal | नेपाल | Nepali | `name:ne` | `name:en` | |
| 145 | +| Netherlands | Nederland | Dutch | `name:nl` | `name:en`, `name:de`, `name:fr`, `name:es`, `name:it` | |
| 146 | +| Pakistan | پاکستان | Urdu | `name:ur` | _n/a_ | |
| 147 | +| Palestine | فلسطين | Arabic | `name:ar` | _n/a_ | |
| 148 | +| Poland | Polska | Polish | `name:pl` | `name:de`, `name:en` | |
| 149 | +| Portugal | Portugal | Portuguese | `name:pt` | `name:es`, `name:it`, `name:fr`, `name:en`, `name:de` | |
| 150 | +| Russia | Россия | Russian | `name:ru` | _n/a_ | |
| 151 | +| Saudi Arabia | المملكة العربية السعودية | Arabic | `name:ar` | _n/a_ | |
| 152 | +| South Korea | 한국 | Korean | `name:ko` | _n/a_ | |
| 153 | +| Spain | España | Spanish | `name:es` | `name:pt`, `name:it`, `name:fr`, `name:en`, `name:de` | |
| 154 | +| Sweden | Sverige | Swedish | `name:sv` | `name:en` | |
| 155 | +| Taiwan | 中華民國 | Traditional Chinese | `name:zh-Hant` | `name:zh-Hans`, `name:zh`| |
| 156 | +| Turkey | Türkiye | Turkish | `name:tr` | `name:fr`, `name:en`, `name:de` | |
| 157 | +| Ukraine | Україна | Ukrainian | `name:uk` | `name:ru` | |
| 158 | +| United Kingdom | United Kingdom | English, Welsh, Scottish, Irish, others | `name:en` | `name:es`, `name:fr`, `name:en`, `name:de` | |
| 159 | +| United States | United States | English, Spanish, French, others | `name:en` | `name:es`, `name:fr`, `name:en`, `name:de` | |
| 160 | +| Vietnam | Việt Nam | Vietnamese | `name:vi` | `name:fr`, `name:en`, `name:es`, `name:de` | |
| 161 | + |
| 162 | +## Positioned glyph font `pmap:pgf:name:*` values |
| 163 | + |
| 164 | +Protomaps adds additional names for a small set of language scripts, currently just the `Devanagari` script used for Hindi (`name:hi` and `pmap:pgf:name:hi`) and related languages. |
| 165 | + |
| 166 | +Rendering text in web browsers works for almost all languages and scripts and feels like magic. However, specialized map renderers like MapLibre have to reimplement text rendering and text layout which is complicated when text needs to be curved along linear map features instead of placed only horizontally or vertically. MapLibre normally assumes a one-to-one mapping between glyphs and Unicode codepoints that also exist in MapLibre font files (aka "font stacks") to accomplish the layout for a large but limited number of scripts. Plugins have been developed to extend MapLibre for **right-to-left** scripts like Arabic and Hebrew, and MapLibre has built-in support for **CJK scripts** like Chinese, Japanese, and Korean. |
| 167 | + |
| 168 | +To support scripts that MapLibre cannot lay out on its own (like the Devanagari script used by the Hindi language), Protomaps exports names with "positioned glyphs" so MapLibre can use codepoints as indices of positioned glyphs in an additional custom "font stack". While the raw `pmap:pgf:name:*` values look like gibberish when inspected, they render correctly in MapLibre for the end user. |
| 169 | + |
| 170 | +See more: |
| 171 | + |
| 172 | +- [Traditional MapLibre Text Rendering](https://oliverwipfli.ch/about-text-rendering-in-maplibre-2023-10-17/) |
| 173 | +- [Devanagari Positioned Glyph Fonts](https://oliverwipfli.ch/devanagari-in-the-protomaps-basemap-with-a-positioned-glyph-font-for-maplibre-2024-06-30/) |
| 174 | + |
| 175 | +## Styling localized name |
| 176 | + |
| 177 | +Labeling a map is typically localized for a specific language audience by preferring a specific name tag, then falling back to similar languages (in the same writing system "script", see above), and finally falling back to the feature's default name (which could be in any script, in any language). |
| 178 | + |
| 179 | +### MapLibre |
| 180 | + |
| 181 | +#### MapLibre styling basic example |
| 182 | + |
| 183 | +TK TK TK |
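As a minimal sketch of the basic case, the layer below labels each feature with its default `name` using MapLibre's `get` expression. The layer `id`, the source name `protomaps`, the `places` source layer, and the `Noto Sans Regular` font stack are illustrative assumptions, not values confirmed by this page; substitute the names used in your own style.

```typescript
// A basic symbol layer that shows the default (primary) `name` value.
// ASSUMPTIONS: the id, source, source-layer, and font stack names below are
// placeholders for illustration only.
const basicLabelLayer = {
  id: "places-label",              // hypothetical layer id
  type: "symbol",
  source: "protomaps",             // assumed source name
  "source-layer": "places",        // assumed source layer
  layout: {
    // `get` reads a property from the feature; here, the default name.
    "text-field": ["get", "name"],
    "text-font": ["Noto Sans Regular"], // assumed font stack
    "text-size": 12,
  },
};
```

An object shaped like this can be placed in the `layers` array of a MapLibre style document.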
| 184 | + |
| 185 | +#### MapLibre styling localized name with fallback example |
| 186 | + |
| 187 | +TK TK TK |
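One way to express "prefer a localized name, then fall back to the default name" is MapLibre's `coalesce` expression, which returns the first non-null argument. The helper below is a sketch only: `localizedName` and the `Expr` type are hypothetical names introduced for illustration, and the function just builds the expression array without depending on MapLibre itself.

```typescript
// A MapLibre style expression is a nested JSON array.
type Expr = string | number | boolean | null | Expr[];

// Build a `text-field` expression that tries each `name:{language_code}` in
// order and finally falls back to the default `name`.
// `localizedName` is a hypothetical helper, not part of any library.
function localizedName(languages: string[]): Expr {
  const localized = languages.map((lang): Expr => ["get", `name:${lang}`]);
  return ["coalesce", ...localized, ["get", "name"]];
}

// Example: a Spanish-language map that prefers Portuguese before the default.
const spanishLabels = localizedName(["es", "pt"]);
```

The result can be used as the `text-field` value of a symbol layer. The language order could follow the "Recommended `name:*` pairing" column above, for example `localizedName(["zh-Hans", "zh", "zh-Hant"])` for a simplified Chinese audience.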
| 188 | + |
| 189 | +#### MapLibre styling localized name with script-based fallback example |
| 190 | + |
| 191 | +TK TK TK |
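The script-based variant can be sketched with a `case` expression on `pmap:script`: if the default `name` is already in the reader's script, show it; otherwise try other languages in the same script before the default name. This is an untested sketch under two assumptions: that `pmap:script` holds script names spelled as in the table above (for example `Latin`), and that it is present on every named feature. Check both against the tiles you are using. `scriptAwareName` and `StyleExpr` are hypothetical names.

```typescript
// A MapLibre style expression is a nested JSON array.
type StyleExpr = string | number | boolean | null | StyleExpr[];

const get = (key: string): StyleExpr => ["get", key];

// Build a `text-field` expression for one audience:
//   1. the preferred language's localized name;
//   2. the default name, but only if it is written in `script`;
//   3. otherwise other languages in the same script, then the default name.
// ASSUMPTION: `pmap:script` values match the `script` string passed in.
function scriptAwareName(
  preferred: string,
  sameScriptFallbacks: string[],
  script: string,
): StyleExpr {
  const otherwise: StyleExpr =
    sameScriptFallbacks.length > 0
      ? ["coalesce", ...sameScriptFallbacks.map((l) => get(`name:${l}`)), get("name")]
      : get("name");
  return [
    "coalesce",
    get(`name:${preferred}`),
    ["case", ["==", get("pmap:script"), script], get("name"), otherwise],
  ];
}

// Example: German labels, falling back to English for non-Latin default names.
const germanLabels = scriptAwareName("de", ["en"], "Latin");
```

Putting the script check before the same-script fallbacks keeps local Latin-script names (such as `Londres` or `Svizzera`) visible instead of replacing them with an English exonym.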
| 192 | + |
| 193 | +#### MapLibre styling positioned glyph font with script-based example |
| 194 | + |
| 195 | +TK TK TK |
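For positioned glyph fonts, one possible arrangement is a pair of layers: one that renders `pmap:pgf:name:hi` with a PGF font stack for features that carry that property, and one that renders regular names for everything else. The sketch below assumes a source named `protomaps` and a `places` source layer, and uses placeholder font stack names; the real PGF font stack name should be taken from the basemaps-assets repository rather than from this example.

```typescript
// ASSUMPTION: both font stack names are placeholders, not real asset names.
const PGF_FONT_STACK = "Example Devanagari PGF Font";
const REGULAR_FONT_STACK = "Example Regular Font";

// Layer for features that have a positioned-glyph Hindi name.
const hindiPgfLayer = {
  id: "places-label-hi-pgf",       // hypothetical layer id
  type: "symbol",
  source: "protomaps",             // assumed source name
  "source-layer": "places",        // assumed source layer
  filter: ["has", "pmap:pgf:name:hi"],
  layout: {
    // The raw value looks like gibberish; it only renders correctly when
    // paired with the matching positioned glyph font.
    "text-field": ["get", "pmap:pgf:name:hi"],
    "text-font": [PGF_FONT_STACK],
  },
};

// Companion layer for features without a positioned-glyph name.
const regularLayer = {
  id: "places-label-regular",      // hypothetical layer id
  type: "symbol",
  source: "protomaps",
  "source-layer": "places",
  filter: ["!", ["has", "pmap:pgf:name:hi"]],
  layout: {
    "text-field": ["coalesce", ["get", "name:hi"], ["get", "name"]],
    "text-font": [REGULAR_FONT_STACK],
  },
};
```

Splitting into two layers keeps each `text-font` fixed per layer, so the PGF-encoded strings are never drawn with a regular font.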
| 196 | + |
| 197 | +#### MapLibre supported scripts and languages |
| 198 | + |
| 199 | +| Script | Languages | |
| 200 | +| ------- | ---------| |
| 201 | +| `Latin` | AFRIKAANS, ALBANIAN, AZERBAIJANI (also `Cyrillic`, `Arabic`), BASQUE, BOSNIAN (also `Cyrillic`), CATALAN, CROATIAN, CZECH, DANISH, DUTCH, ENGLISH, ENGLISH (AUSTRALIAN), ENGLISH (GREAT BRITAIN), ESTONIAN, FINNISH, FILIPINO, FRENCH, FRENCH (CANADA), GALICIAN, GERMAN, HUNGARIAN, ICELANDIC, INDONESIAN, ITALIAN, KAZAKH (also `Arabic`, `Cyrillic`), LATVIAN, LITHUANIAN, MALAY (also `Arabic`, `Thai`), NORWEGIAN, POLISH, PORTUGUESE, PORTUGUESE (BRAZIL), PORTUGUESE (PORTUGAL), ROMANIAN, SERBIAN (also `Cyrillic`), SLOVAK (also `Cyrillic`), SLOVENIAN, SPANISH, SPANISH (LATIN AMERICA), SWAHILI, SWEDISH, TURKISH, UZBEK (also `Cyrillic`, `Arabic`), VIETNAMESE, ZULU | |
| 202 | +| `Arabic` | ARABIC, FARSI, URDU, KAZAKH (also `Cyrillic`, `Latin`), KYRGYZ (also `Cyrillic`) | |
| 203 | +| `Cyrillic` | BELARUSIAN, BULGARIAN (also `Latin`), KAZAKH (also `Latin`, `Arabic`), KYRGYZ (also `Arabic`), MACEDONIAN, MONGOLIAN, RUSSIAN, SERBIAN (also `Latin`), UKRAINIAN | |
| 204 | +| `Han` | CHINESE, CHINESE (SIMPLIFIED), CHINESE (HONG KONG), CHINESE (TRADITIONAL) | |
| 205 | +| `Amharic` | AMHARIC | |
| 206 | +| `Armenian` | ARMENIAN | |
| 207 | +| `Hangul` | KOREAN | |
| 208 | +| `Hebrew` | HEBREW | |
| 209 | +| `Japanese` | JAPANESE | |
| 210 | +| `Georgian` | GEORGIAN | |
| 211 | +| `Greek` | GREEK | |
| 212 | +| `Mongolian` | MONGOLIAN (also `Cyrillic`) | |
| 213 | + |
| 214 | +NOTE: Right-to-left scripts like Arabic and Hebrew require a special [RTL text MapLibre plugin](https://maplibre.org/maplibre-gl-js/docs/examples/mapbox-gl-rtl-text/). |
| 215 | + |
| 216 | +#### MapLibre partial support |
| 217 | + |
| 218 | +Requires a positioned glyph font [font stack](https://maplibre.org/maplibre-style-spec/glyphs/) paired with `pmap:pgf:name:*` values. The PGF font stacks used by the Protomaps basemaps are available at https://github.com/protomaps/basemaps-assets/tree/main/fonts. |
| 219 | + |
| 220 | +| Script | Languages | |
| 221 | +| ------- | ---------| |
| 222 | +| `Devanagari` | GUJARATI, HINDI, MARATHI, NEPALI | |
| 223 | + |
| 224 | +These are primarily found in India. |
| 225 | + |
| 226 | +#### MapLibre no support |
| 227 | + |
| 228 | +| Script | Languages | |
| 229 | +| ------- | ---------| |
| 230 | +| `Kannada` | KANNADA | |
| 231 | +| `Bengali` | BENGALI | |
| 232 | +| `Burmese` | BURMESE | |
| 233 | +| `Khmer` | KHMER | |
| 234 | +| `Lao` | LAO | |
| 235 | +| `Malayalam` | MALAYALAM | |
| 236 | +| `Punjabi` | PUNJABI | |
| 237 | +| `Sinhalese` | SINHALESE | |
| 238 | +| `Tamil` | TAMIL | |
| 239 | +| `Telugu` | TELUGU | |
| 240 | +| `Thai` | THAI | |
| 241 | + |
| 242 | +_NOTE: This is a partial listing of scripts and languages._ |
| 243 | + |
| 244 | +These scripts, which MapLibre does not support, are primarily found in India and countries in Southeast Asia. |
| 245 | + |
| 246 | +### OpenLayers |
| 247 | + |
| 248 | +Tk tk tk |