- Install the following additional Lua libs:
    luarocks install nn
    luarocks install rnn
    luarocks install penlight
To train with CUDA, install the latest CUDA drivers and toolkit, then run:
    luarocks install cutorch
    luarocks install cunn
To train with OpenCL, install the latest OpenCL Torch libs:
    luarocks install cltorch
    luarocks install clnn
- Download the dgk_lost_conv Chinese dialogue corpus and unzip results/xiaohuangji50w_fenciA.conv.zip into data/.
th train.lua [-h / options]
Use the --dataset NUMBER option to control the size of the dataset. Training on the full dataset takes about 5h for a single epoch.
The model will be saved to data/model.t7 after each epoch if it has improved (error decreased).
To load the model and have a conversation:
th eval.lua
##How
Use https://github.com/dgkae/dgk_lost_conv as the training corpus. Chinese sentences must be segmented into words, with '/' as the separator between words. We modified cornell_movie_dialog.lua to support this format. Lua stores all strings, including Chinese text, as multibyte byte sequences, so the standard pl.lexer does not work on Chinese. Instead, we use an external word-segmentation tool and mark word boundaries with '/'.
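As a rough illustration of why '/' works as a word boundary, here is a minimal sketch in plain Lua. The helper name `splitWords` and the sample sentence are hypothetical and are not taken from cornell_movie_dialog.lua; the sketch also assumes the corpus and the source file are UTF-8 encoded. Because '/' is a single ASCII byte that never appears inside a UTF-8 multibyte sequence, byte-level pattern matching can split a segmented sentence safely.

```lua
-- Hypothetical helper for illustration only; not the repository's loader code.
-- Splits a pre-segmented sentence on '/' and returns the words as a table.
local function splitWords(line)
  local words = {}
  -- "[^/]+" matches each maximal run of bytes that are not '/'.
  -- Lua patterns work on bytes, but '/' never occurs inside a UTF-8
  -- multibyte character, so Chinese words stay intact.
  for word in line:gmatch("[^/]+") do
    words[#words + 1] = word
  end
  return words
end

local words = splitWords("今天/天气/很/好")
print(#words)                      -- 4
print(table.concat(words, " | "))  -- 今天 | 天气 | 很 | 好
```

Each returned entry is a whole word rather than a single byte or character, so it can be used directly as a vocabulary token.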
##Result
MIT