smorzhov/hour_of_code_2019

CommentsClassifier

Description

Prerequisites

You will need the following things properly installed on your computer.

Installation

  • git clone https://github.com/smorzhov/hour_of_code_2019.git

Running

  1. Download the pretrained glove.840B.300d model (2.03 GB) and unzip it into the src/data directory.

  2. If you plan to use nvidia-docker, build the nvidia-docker image first. Otherwise, you can skip this step.

    nvidia-docker build -t sm_keras_tf_py3:gpu .

    Run the container (run this command from the directory that contains the Dockerfile):

    nvidia-docker run --user $(id -u):$(id -g) -dt --name sm_hoc -m 50GB -v $(pwd)/src:/$(basename $(pwd)) -w /$(basename $(pwd)) sm_keras_tf_py3:gpu /bin/bash
  3. Cleaning the dataset

    nvidia-docker exec --env CUDA_VISIBLE_DEVICES='0' sm_hoc python3 -u nlp.py prepare-data
  4. Training

    By default, only GPU 0 is visible to the Docker container. You can change this by passing the --env option to exec. For example:

    nvidia-docker exec --env CUDA_VISIBLE_DEVICES='0' sm_hoc python3 -u nlp.py train --data-path ./processed_data 
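The value passed through --env in the commands above is a comma-separated list of GPU indices, so '0,1' would expose the first two GPUs. As a minimal sketch of how a process inside the container could inspect that setting (the helper name visible_gpu_ids is hypothetical and is not taken from nlp.py):

```python
import os


def visible_gpu_ids(env=None):
    """Return the GPU ids listed in CUDA_VISIBLE_DEVICES.

    Hypothetical helper for illustration; not part of nlp.py.
    Returns None when the variable is not set at all, and an
    empty list when it is set to an empty string.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    # The variable holds a comma-separated list such as "0" or "0,1".
    return [part.strip() for part in raw.split(",") if part.strip()]


if __name__ == "__main__":
    # Shows what the current process was given, e.g. via `exec --env`.
    print(visible_gpu_ids())
```

Running this inside the container with a different --env value is a quick way to confirm that the setting reached the process before starting a long training run.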

Advice

You can add custom stop words by placing them in the src/data/stopwords.txt file (one word per line).
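To illustrate the one-word-per-line format, here is a minimal sketch of how such a file could be read and applied to a token list. The function names are hypothetical, and the lowercasing and blank-line handling are assumptions; the project's own loader in nlp.py may behave differently.

```python
from pathlib import Path


def load_stopwords(path):
    """Read one stop word per line, skipping blank lines.

    Assumption: words are compared case-insensitively, so they are
    lowercased here. A missing file yields an empty set.
    """
    file_path = Path(path)
    if not file_path.exists():
        return set()
    with file_path.open(encoding="utf-8") as handle:
        return {line.strip().lower() for line in handle if line.strip()}


def remove_stopwords(tokens, stopwords):
    """Return the tokens that are not in the stop word set."""
    return [token for token in tokens if token.lower() not in stopwords]
```

For example, with a file containing the lines "foo" and "bar", remove_stopwords(["Foo", "baz"], load_stopwords(path)) would keep only "baz".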