A chat system that generates its responses from the input text
In this project we applied and experimented with state-of-the-art NLP techniques to build such a chat system. We used Recurrent Neural Networks (RNNs), specifically the sequence-to-sequence (seq2seq) deep learning approach, in which an encoder RNN reads the input text and a decoder RNN generates the response.
Chat corpus repo (https://github.com/Marsan-Ma/chat_corpus)
- Preprocessing the data
- Building the models
- Testing the models
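
The preprocessing step could look roughly like the sketch below. It is not the project's actual code: it assumes the corpus is a flat list of utterances in which each line is answered by the next one, and the `clean` and `make_pairs` helpers as well as the `<start>`/`<end>` markers are hypothetical names chosen for illustration.

```python
import re

# Hypothetical marker tokens that tell the decoder where a response begins and ends.
START, END = "<start>", "<end>"

def clean(text):
    """Lowercase the text, drop URLs and unusual symbols, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"https?://\S+", "", text)
    text = re.sub(r"[^a-z0-9'?!., ]+", " ", text)
    return re.sub(r"\s+", " ", text).strip()

def make_pairs(lines):
    """Pair each utterance with the following one as (input, response)."""
    cleaned = [clean(line) for line in lines]
    pairs = []
    for prompt, reply in zip(cleaned, cleaned[1:]):
        if prompt and reply:  # skip pairs where cleaning left nothing
            pairs.append((prompt, f"{START} {reply} {END}"))
    return pairs

lines = ["Hi there!", "Hello, how are you?  http://t.co/xyz", "Fine, thanks."]
pairs = make_pairs(lines)
for prompt, reply in pairs:
    print(repr(prompt), "->", repr(reply))
```

Wrapping the response in start and end markers is a common seq2seq convention: the decoder is fed the start marker first and stops generating once it emits the end marker.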
- Word-based: word_seq2seq.py
- Character-based: Character_seq2seq.py
- Word-based with embedding: word_embedding_seq2seq.py
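
To illustrate how the word-based and character-based variants differ in their input representation, here is a minimal sketch. It makes no claim about how the listed scripts are written; `tokenize`, `build_vocab`, `encode`, and the reserved ids for padding and unknown tokens are all assumptions made for this example.

```python
PAD, UNK = 0, 1  # assumed reserved ids: padding and out-of-vocabulary tokens

def tokenize(text, level):
    """Split text into words or into single characters."""
    if level == "word":
        return text.split()
    if level == "char":
        return list(text)
    raise ValueError(f"unknown level: {level!r}")

def build_vocab(sentences, level):
    """Map every distinct token to an integer id, starting after the reserved ids."""
    tokens = sorted({tok for s in sentences for tok in tokenize(s, level)})
    return {tok: i + 2 for i, tok in enumerate(tokens)}

def encode(text, vocab, level, max_len):
    """Turn text into a fixed-length list of ids, truncating or padding as needed."""
    ids = [vocab.get(tok, UNK) for tok in tokenize(text, level)][:max_len]
    return ids + [PAD] * (max_len - len(ids))

sentences = ["how are you", "i am fine"]
word_vocab = build_vocab(sentences, "word")
char_vocab = build_vocab(sentences, "char")

word_ids = encode("how are you", word_vocab, "word", max_len=5)
char_ids = encode("how are you", char_vocab, "char", max_len=12)
print(len(word_vocab), word_ids)  # 6 distinct words, short sequence
print(len(char_vocab), char_ids)  # 13 distinct characters, longer sequence
```

On a real corpus the trade-off runs the other way for vocabulary size: a character vocabulary stays small while a word vocabulary grows with the data, and character sequences are much longer. Integer ids like these are what an embedding layer typically consumes, whereas a model without an embedding layer would usually expand each id into a one-hot vector.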
[1] Li, et al. (2016). A persona-based neural conversation model. Association for Computational Linguistics.
[2] Zhou, et al. (2015). Answer sequence learning with neural networks for answer selection in community question answering. Association for Computational Linguistics.
[3] Pilato, et al. (2011). A modular architecture for adaptive chatbots. Fifth IEEE International Conference on Semantic Computing (ICSC). IEEE.
[4] Deep Learning for Chatbots: encoder-decoder image. Retrieved from: http://www.wildml.com/2016/04/deep-learning-for-chatbots-part-1-introduction/
[5] Cho, et al. (2014). Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. Association for Computational Linguistics.
[6] Marsan Ma. Twitter_scraper. Retrieved from: https://github.com/Marsan-Ma/chat_corpus
[7] Sequence-to-sequence learning in Keras. Retrieved from: https://blog.keras.io/a-ten-minute-introduction-to-sequence-to-sequence-learning-in-keras.html
[8] Bahdanau, Cho, and Bengio (2014). Neural machine translation by jointly learning to align and translate. ICLR 2015.