word2vec-cbow-keras-cntk

An implementation of word2vec CBOW in Keras using the CNTK backend.

Overview

This repository contains an implementation of word2vec CBOW (continuous bag-of-words) and a data converter that produces its training data.

The Wikipedia data converter

Converts a Japanese Wikipedia text file into the training data that is fed to the CBOW model. The Japanese Wikipedia dumps are available at https://dumps.wikimedia.org/jawiki/latest/
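As a rough illustration of what CBOW training data looks like, the sketch below turns one tokenized sentence into (context, target) pairs. The function name `cbow_pairs`, the window size of 2, and the pair layout are assumptions made for this example; the converter's actual output format is defined by its own scripts and settings. The tokens are hand-written so the sketch runs without MeCab installed.

```python
def cbow_pairs(tokens, window=2):
    """Yield (context_words, target_word) pairs for one tokenized sentence."""
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]      # up to `window` words before
        right = tokens[i + 1:i + 1 + window]     # up to `window` words after
        context = left + right
        if context:                              # skip one-word sentences
            yield context, target


# Tokens as a morphological analyzer such as MeCab might segment a sentence
# (hypothetical segmentation, written by hand for illustration).
tokens = ["私", "は", "猫", "が", "好き", "です"]
pairs = list(cbow_pairs(tokens, window=2))
for context, target in pairs:
    print(context, "->", target)
```

Each pair asks the model to predict the target word from its surrounding words, which is the CBOW training objective.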

The word2vec CBOW model

Trains the model with Keras on the CNTK backend and saves it in ONNX format.
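For readers unfamiliar with CBOW, the following plain-Python sketch shows the computation such a model performs for one training example: average the context word embeddings, score every vocabulary word, and apply a softmax. It is not the repository's Keras code; the function name `cbow_forward`, the toy sizes, and the random weights are assumptions chosen only for illustration.

```python
import math
import random


def cbow_forward(context_ids, emb_in, emb_out):
    """Return softmax probabilities over the vocabulary for one context."""
    dim = len(emb_in[0])
    # Hidden layer: average of the context words' input embeddings.
    hidden = [sum(emb_in[i][d] for i in context_ids) / len(context_ids)
              for d in range(dim)]
    # Output layer: one score per vocabulary word, then a numerically stable softmax.
    scores = [sum(h * w for h, w in zip(hidden, row)) for row in emb_out]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


random.seed(0)
vocab_size, dim = 5, 3  # toy sizes, assumed for this sketch
emb_in = [[random.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(vocab_size)]
emb_out = [[random.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(vocab_size)]

probs = cbow_forward([0, 1, 3, 4], emb_in, emb_out)
loss = -math.log(probs[2])  # cross-entropy when word 2 is the true center word
print(probs, loss)
```

Training adjusts both embedding tables to reduce this loss; the rows of the input table are what is normally kept as the word vectors.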

Test

Evaluates the trained model with the following operations:
- cosine similarity
- most similar
- analogy
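The three operations listed above can be sketched in plain Python as follows. The function names, signatures, and the hand-made 2-D vectors are assumptions for this example and are not taken from "cbow_eval.py"; a real evaluation would use the embeddings learned by the model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def most_similar(query, vectors, exclude=(), topn=3):
    """Return the topn (word, similarity) pairs closest to the query vector."""
    scored = [(word, cosine_similarity(query, vec))
              for word, vec in vectors.items() if word not in exclude]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:topn]


def analogy(a, b, c, vectors, topn=1):
    """Answer 'a is to b as c is to ?' using the vector b - a + c."""
    query = [vb - va + vc
             for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    return most_similar(query, vectors, exclude=(a, b, c), topn=topn)


# Toy vectors: axis 0 roughly "royalty", axis 1 roughly "gender" (made-up values).
vectors = {
    "king": [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man": [0.0, 1.0],
    "woman": [0.0, -1.0],
    "apple": [-1.0, 0.0],
}

print(cosine_similarity(vectors["king"], vectors["man"]))
print(most_similar(vectors["king"], vectors, exclude=("king",), topn=2))
print(analogy("man", "king", "woman", vectors))
```

Excluding the query words from the candidates matters for analogies, because otherwise one of the input words is often the nearest neighbor of the combined vector.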

Requirements

This project was developed in the following environment.

Software	Version
Ubuntu	16.04.6 LTS
Python	3.6.8
Keras	2.2.4
MeCab	0.996
mecab-python3	0.996.2
CNTK	2.7
CUDA	10.1

How to use

Data Converter

Download the full Japanese Wikipedia dump and convert it from XML to plain text with a tool such as wp2txt.

Edit the settings file "train_data_gen_settings.py".

Execute "gen_prepared_data_multi.py":

$ cd wikipedia_data_converter
$ python gen_prepared_data_multi.py

Training and Saving

Edit the settings file "training_settings.py".

Execute "cbow_train_onnx.py":

$ cd word2vec_keras_cntk
$ python cbow_train_onnx.py

Test

Edit the settings file "training_settings.py".

Execute "cbow_eval.py":

$ cd word2vec_keras_cntk
$ python cbow_eval.py

Thanks

I referred to the following project. Thank you.
