This repository provides parallelized interwiki link data from Wikidata, which can be used, among other things, to create large offline English dictionaries.
The data directory contains subdirectories arranged in order of ISO language code. The target language in each case is `en` (English).
The basic filename pattern is `[ISO]-en_wiki.txt`, with `[ISO]` being the source language ISO code. A list of all language pairs is below.
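
The filename pattern can be illustrated with a minimal Python sketch. The helper name `wiki_file_path`, the `data` directory default, and the one-subdirectory-per-ISO-code layout are assumptions made for illustration; only the `[ISO]-en_wiki.txt` pattern comes from the description above.

```python
from pathlib import Path

TARGET_LANG = "en"  # every language pair in this repository targets English


def wiki_file_path(source_iso: str, data_dir: str = "data") -> Path:
    """Build the expected path of the file for one source language.

    Assumed layout (hypothetical): <data_dir>/<ISO>/<ISO>-en_wiki.txt
    """
    filename = f"{source_iso}-{TARGET_LANG}_wiki.txt"
    return Path(data_dir) / source_iso / filename


# Example: the French => English file under the assumed layout.
print(wiki_file_path("fr").as_posix())
```

If the subdirectories are laid out differently, only the `Path(...)` line needs adjusting; the filename itself follows the documented pattern.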
Language codes | Language names |
---|---|
af-en | Afrikaans => English |
am-en | Amharic => English |
ang-en | Anglo-Saxon => English |
ar-en | Arabic => English |
arc-en | Aramaic => English |
bg-en | Bulgarian => English |
bi-en | Bislama => English |
bn-en | Bengali => English |
bo-en | Tibetan => English |
br-en | Breton => English |
bs-en | Bosnian => English |
ca-en | Catalan => English |
cdo-en | Min Dong => English |
chr-en | Cherokee => English |
chy-en | Cheyenne => English |
cr-en | Cree => English |
cs-en | Czech => English |
cy-en | Welsh => English |
da-en | Danish => English |
de-en | German => English |
el-en | Greek => English |
eo-en | Esperanto => English |
es-en | Spanish => English |
et-en | Estonian => English |
eu-en | Basque => English |
fa-en | Persian => English |
ff-en | Fula => English |
fi-en | Finnish => English |
fr-en | French => English |
ga-en | Irish => English |
gan-en | Gan => English |
gd-en | Scottish Gaelic => English |
gu-en | Gujarati => English |
gv-en | Manx => English |
ha-en | Hausa => English |
hak-en | Hakka => English |
haw-en | Hawaiian => English |
he-en | Hebrew => English |
hi-en | Hindi => English |
hr-en | Croatian => English |
ht-en | Haitian => English |
hu-en | Hungarian => English |
hy-en | Armenian => English |
id-en | Indonesian => English |
ig-en | Igbo => English |
is-en | Icelandic => English |
it-en | Italian => English |
iu-en | Inuktitut => English |
ja-en | Japanese => English |
jbo-en | Lojban => English |
jv-en | Javanese => English |
ka-en | Georgian => English |
kg-en | Kongo => English |
ki-en | Kikuyu => English |
kl-en | Greenlandic => English |
km-en | Khmer => English |
ko-en | Korean => English |
la-en | Latin => English |
lg-en | Luganda => English |
lo-en | Lao => English |
lt-en | Lithuanian => English |
lv-en | Latvian => English |
mg-en | Malagasy => English |
mi-en | Maori => English |
mn-en | Mongolian => English |
ms-en | Malay => English |
mt-en | Maltese => English |
nah-en | Nahuatl => English |
ne-en | Nepali => English |
nl-en | Dutch => English |
nn-en | Norwegian (Nynorsk) => English |
no-en | Norwegian => English |
nv-en | Navajo => English |
ny-en | Chichewa => English |
oc-en | Occitan => English |
pa-en | Punjabi => English |
pi-en | Pali => English |
pl-en | Polish => English |
ps-en | Pashto => English |
pt-en | Portuguese => English |
qu-en | Quechua => English |
ro-en | Romanian => English |
ru-en | Russian => English |
sa-en | Sanskrit => English |
se-en | Northern Sami => English |
sh-en | Serbo-Croatian => English |
sk-en | Slovak => English |
sl-en | Slovenian => English |
sn-en | Shona => English |
so-en | Somali => English |
sq-en | Albanian => English |
sr-en | Serbian => English |
sv-en | Swedish => English |
sw-en | Kiswahili => English |
ta-en | Tamil => English |
te-en | Telugu => English |
th-en | Thai => English |
tl-en | Tagalog => English |
tpi-en | Tok Pisin => English |
tr-en | Turkish => English |
ug-en | Uyghur => English |
uk-en | Ukrainian => English |
ur-en | Urdu => English |
vi-en | Vietnamese => English |
wo-en | Wolof => English |
wuu-en | Wu => English |
xh-en | Xhosa => English |
yi-en | Yiddish => English |
yo-en | Yoruba => English |
za-en | Zhuang => English |
zh-en | Chinese (Mandarin) => English |
zh_classical-en | Classical Chinese => English |
zh_min_nan-en | Min Nan => English |
zh_yue-en | Cantonese => English |
zu-en | Zulu => English |

Language pair | # of entries |
---|---|
af-en | 29951 |
am-en | 6308 |
ang-en | 2615 |
ar-en | 214043 |
arc-en | 1378 |
bg-en | 140193 |
bi-en | 479 |
bn-en | 30566 |
bo-en | 2856 |
br-en | 44870 |
bs-en | 33163 |
ca-en | 309248 |
cdo-en | 2217 |
chr-en | 486 |
chy-en | 705 |
cr-en | 111 |
cs-en | 209111 |
cy-en | 46719 |
da-en | 134272 |
de-en | 907990 |
el-en | 75213 |
eo-en | 159061 |
es-en | 742633 |
et-en | 79349 |
eu-en | 152904 |
fa-en | 348197 |
ff-en | 208 |
fi-en | 258347 |
fr-en | 1010365 |
ga-en | 29978 |
gan-en | 5087 |
gd-en | 13833 |
gu-en | 5445 |
gv-en | 4592 |
ha-en | 511 |
hak-en | 3349 |
haw-en | 1931 |
he-en | 128257 |
hi-en | 40066 |
hr-en | 98143 |
ht-en | 30983 |
hu-en | 192031 |
hy-en | 67842 |
id-en | 161262 |
ig-en | 816 |
is-en | 26241 |
it-en | 795898 |
iu-en | 366 |
ja-en | 420717 |
jbo-en | 1170 |
jv-en | 20532 |
ka-en | 63784 |
kg-en | 840 |
ki-en | 309 |
kl-en | 1605 |
km-en | 2361 |
ko-en | 193308 |
la-en | 102756 |
lg-en | 178 |
lo-en | 1220 |
lt-en | 94850 |
lv-en | 45109 |
mg-en | 68386 |
mi-en | 2551 |
mn-en | 12167 |
ms-en | 187732 |
mt-en | 2803 |
nah-en | 7809 |
ne-en | 11448 |
nl-en | 715263 |
nn-en | 94129 |
no-en | 275488 |
nv-en | 2156 |
ny-en | 167 |
oc-en | 80831 |
pa-en | 11694 |
pi-en | 2643 |
pl-en | 691110 |
ps-en | 3741 |
pt-en | 588641 |
qu-en | 15580 |
ro-en | 204776 |
ru-en | 628146 |
sa-en | 5939 |
se-en | 6059 |
sh-en | 189316 |
sk-en | 143133 |
sl-en | 90227 |
sn-en | 1644 |
so-en | 2698 |
sq-en | 31252 |
sr-en | 210431 |
sv-en | 539706 |
sw-en | 23612 |
ta-en | 45860 |
te-en | 14193 |
th-en | 66679 |
tl-en | 48164 |
tpi-en | 1331 |
tr-en | 156837 |
ug-en | 2320 |
uk-en | 314535 |
ur-en | 60176 |
vi-en | 397221 |
wo-en | 956 |
wuu-en | 2850 |
xh-en | 305 |
yi-en | 8508 |
yo-en | 28888 |
za-en | 666 |
zh-en | 435714 |
zh_classical-en | 7165 |
zh_min_nan-en | 11617 |
zh_yue-en | 24066 |
zu-en | 666 |

The ten largest language pairs by number of entries:

Language pair | # of entries |
---|---|
fr-en | 1010365 |
de-en | 907990 |
it-en | 795898 |
es-en | 742633 |
nl-en | 715263 |
pl-en | 691110 |
ru-en | 628146 |
pt-en | 588641 |
sv-en | 539706 |
zh-en | 435714 |
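
A ranking like the one above can be reproduced from the per-pair entry counts. The sketch below is only an illustration: the `entry_counts` dictionary holds a handful of values copied from the table, and the helper name `largest_pairs` is hypothetical.

```python
# A small subset of the per-pair entry counts listed in the table above.
entry_counts = {
    "cr-en": 111,
    "zh-en": 435714,
    "ja-en": 420717,
    "fr-en": 1010365,
    "de-en": 907990,
}


def largest_pairs(counts: dict[str, int], n: int) -> list[tuple[str, int]]:
    """Return the n language pairs with the most entries, largest first."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)[:n]


# Print the top three pairs in the same row layout as the tables.
for pair, count in largest_pairs(entry_counts, 3):
    print(f"{pair} | {count} |")
```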

According to the Wikidata website:

> All structured data from the main and property namespace is available under the Creative Commons CC0 License

The data in this repository is therefore made available under the same Creative Commons CC0 License as that used by the Wikidata project. All of the data has been derived from the Wikidata JSON format database dumps.