Turkish WebVectors: Tool to explore semantic relations between words in distributional models.

Turkish-Word-Embeddings/Turkish-WebVectors

Turkish-WebVectors

Turkish WebVectors: Tool to explore semantic relations between words in distributional models. Based on WebVectors prepared by Kutuzov et al. [1].

Installation

  • First, clone the repository from GitHub to your local machine:
git clone https://github.com/akutuzov/webvectors.git .
  • Then, create a virtual environment:
python -m venv env
  • Activate the environment (for example, with source env/bin/activate on Linux or macOS), then install the Python requirements:
pip install -r requirements.txt
  • Gunicorn is a pure-Python WSGI server with simple configuration and several worker implementations for performance tuning. It is used for production deployment. Install it as follows:
pip install gunicorn
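
After the steps above, a quick check can confirm that the virtual environment is active and that the key packages are importable. The following is a minimal sketch and not part of the repository: the helper names in_virtualenv and missing_packages are hypothetical, and the package list is an assumption, since the contents of requirements.txt are not shown here (gunicorn comes from the step above, gensim from the Models section).

```python
import importlib.util
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside an environment created by `python -m venv`, sys.prefix points at
    # the environment, while sys.base_prefix points at the base installation.
    return sys.prefix != sys.base_prefix


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed package list; adjust it to match requirements.txt.
    expected = ["gunicorn", "gensim"]
    print("virtual environment active:", in_virtualenv())
    print("missing packages:", missing_packages(expected))
```

If the first line prints False, the environment was probably not activated before running pip install.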

How to run using Docker?

  • First, clone the repository from GitHub to your local machine:
git clone https://github.com/Turkish-Word-Embeddings/Turkish-WebVectors.git .
  • Open a terminal in the root directory of the repository and build the Docker image. The -t option sets the tag (name) of the image, and you can choose any name you like. The trailing . is the build context, here the current directory, which contains the Dockerfile:
docker build -t turkish-webvectors .
  • The Dockerfile downloads the two required models from our release page. You can also download them manually and place them in the models directory.
  • Run the Docker image using the following command:
docker run -p 8000:8000 turkish-webvectors
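
Once the container is running, it can be useful to confirm that something is answering on the published port. The following is a minimal sketch using only the Python standard library. The function name server_is_up is hypothetical, and the root path / is an assumption about the application's routes; only port 8000 comes from the docker run command above.

```python
import urllib.error
import urllib.request


def server_is_up(url="http://localhost:8000/", timeout=5.0):
    """Return True if an HTTP server answers at `url`, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, but with an error status (for example 404
        # because the assumed path does not exist). It is still running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout: nothing is listening.
        return False


if __name__ == "__main__":
    print("server reachable:", server_is_up())
```

The check treats any HTTP response as success because it only tests reachability, not whether a particular page exists.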

Models

Both of the following models were trained with gensim using 300-dimensional vectors, 10 epochs, 5 negative samples, a minimum word count of 10, and a window size of 5.

  • Word2Vec skip-gram Negative Sampling
  • FastText skip-gram Negative Sampling
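
The settings above can be written as gensim keyword arguments. The following is a sketch under stated assumptions, not the authors' training script: it assumes gensim 4.x parameter names (vector_size and epochs; gensim 3.x used size and iter), and the names COMMON_PARAMS and train_models are hypothetical. The training corpus is not described in this README, so it is left as an argument.

```python
# Hyperparameters listed in the Models section, as gensim 4.x keyword arguments.
COMMON_PARAMS = {
    "vector_size": 300,  # 300 dimensions
    "epochs": 10,        # 10 training epochs
    "negative": 5,       # 5 negative samples
    "min_count": 10,     # ignore words seen fewer than 10 times
    "window": 5,         # context window size
    "sg": 1,             # 1 selects skip-gram (0 would select CBOW)
    "hs": 0,             # no hierarchical softmax, negative sampling only
}


def train_models(sentences):
    """Train a Word2Vec and a FastText model with the shared settings.

    `sentences` must be a list (or another restartable iterable) of
    tokenized sentences, each a list of strings.
    """
    # Imported here so the parameter table is usable without gensim installed.
    from gensim.models import FastText, Word2Vec

    word2vec = Word2Vec(sentences=sentences, **COMMON_PARAMS)
    fasttext = FastText(sentences=sentences, **COMMON_PARAMS)
    return word2vec, fasttext
```

Because min_count is 10, the corpus must be large enough for words to occur at least ten times; on a toy corpus the vocabulary would be empty and training would fail.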

  1. Kutuzov, Andrei & Kuzmenko, Elizaveta. (2017). WebVectors: A Toolkit for Building Web Interfaces for Vector Semantic Models. Communications in Computer and Information Science, 661, 155-161. doi:10.1007/978-3-319-52920-2_15
