This package implements the locally linear meta-embedding learning method for words proposed in the following paper. Please cite the paper if you use this code in your work.
@inproceedings{Bollegala:IJCAI:2018,
author = {Danushka Bollegala and Kohei Hayashi and Ken-ichi Kawarabayashi},
title = {Think Globally, Embed Locally --- Locally Linear Meta-embedding of Words},
booktitle = {Proc. of IJCAI-ECAI},
pages = {3970--3976},
year = {2018}
}
-
Save your pre-trained source word embeddings to the "sources" directory. You can download several pre-trained word embeddings from the following links.
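Pre-trained embeddings are commonly distributed as plain-text files with one word followed by its space-separated vector components per line (some releases prepend a header line with the vocabulary size and dimensionality). The script below is a minimal sketch, not part of this package, for checking the dimensionality of a downloaded file under that assumed format; the file path is passed as a command-line argument.

```python
# Sanity-check a downloaded source embedding file.
# Assumption (not guaranteed by this package): plain-text format with one
# word followed by its space-separated vector components per line.
import sys

def check_embedding(path, max_lines=1000):
    dims = set()
    with open(path, encoding="utf-8", errors="ignore") as f:
        for i, line in enumerate(f):
            if i >= max_lines:
                break
            parts = line.rstrip().split()
            if len(parts) < 2:
                continue  # skip header lines such as "vocab_size dim"
            dims.add(len(parts) - 1)
    print(f"{path}: observed dimensionalities {sorted(dims)} in first {max_lines} lines")

if __name__ == "__main__":
    check_embedding(sys.argv[1])
```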
-
Edit ./src/sources.py and specify the paths to those source word embeddings and their dimensionalities. The vocabulary of words for which meta-embeddings will be created is in ./work/selected-words. You can add your own vocabulary by editing this file, but make sure that the word embeddings for those words are available in your pre-trained source embeddings; otherwise, that word is treated as missing from the source word embedding.
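The exact variable names expected in ./src/sources.py are defined by this package; the snippet below is only a hypothetical illustration of the information it must carry (a path and a dimensionality per source), with made-up names and paths.

```python
# Hypothetical illustration only: the real ./src/sources.py defines its own
# names. The point is that each source embedding needs a file path and a
# dimensionality.
SOURCES = [
    {"name": "glove",    "path": "../sources/glove.840B.300d.txt",   "dim": 300},
    {"name": "word2vec", "path": "../sources/GoogleNews-vectors.txt", "dim": 300},
    {"name": "hlbl",     "path": "../sources/hlbl.200d.txt",          "dim": 200},
]
```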
-
Run
python meta-embed.py --nns [neighbourhood size] --comps [dimensionality of the meta-embeddings]
The output meta-embeddings will be written to ./work/meta-embeds.
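For example, with a neighbourhood size of 200 and 300-dimensional meta-embeddings (illustrative values only, not recommendations from the paper):
python meta-embed.py --nns 200 --comps 300
If the output file follows the common plain-text format (one word and its space-separated vector per line), it can be loaded with a sketch like the one below; this assumes that format rather than anything guaranteed by the package.

```python
# Sketch for loading ./work/meta-embeds, assuming one word followed by its
# space-separated vector components per line. Adjust if the actual output
# format differs.
import numpy as np

def load_meta_embeddings(path="./work/meta-embeds"):
    embeddings = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split()
            if len(parts) < 2:
                continue
            embeddings[parts[0]] = np.array(parts[1:], dtype=float)
    return embeddings

if __name__ == "__main__":
    E = load_meta_embeddings()
    print(f"Loaded {len(E)} meta-embeddings")
```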