LiveQA submission for TREC-2016

Introduction

This project is a submission to the TREC 2016 LiveQA track. At its core, it uses Latent Dirichlet Allocation (LDA) to infer semantic topics, then uses the resulting model to construct a topic probability distribution for each candidate document retrieved from the knowledge base. Finally, the Jensen-Shannon distance (JSD) is computed as a similarity measure, and the most similar candidate is returned as the answer. The knowledge base currently in use is the Yahoo Answers database.
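The selection step can be illustrated with a minimal sketch. It assumes the question and each candidate answer have already been mapped to topic distributions by an LDA model, and that the distance is taken between the question's distribution and each candidate's. The function names and the example distributions are hypothetical and are not taken from this repository's code.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits; terms where p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_distance(p, q):
    """Jensen-Shannon distance: square root of the JS divergence.

    With base-2 logarithms the result lies in [0, 1].
    """
    # Mixture distribution; m_i > 0 whenever p_i > 0 or q_i > 0.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    divergence = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(divergence, 0.0))


def most_similar(question_topics, candidates):
    """Return the (answer, distance) pair closest to the question."""
    scored = [
        (answer, js_distance(question_topics, topics))
        for answer, topics in candidates
    ]
    return min(scored, key=lambda pair: pair[1])


# Hypothetical distributions over three LDA topics.
question = [0.7, 0.2, 0.1]
candidates = [
    ("answer about cooking", [0.1, 0.1, 0.8]),
    ("answer about laptops", [0.6, 0.3, 0.1]),
]

best_answer, best_distance = most_similar(question, candidates)
print(best_answer, round(best_distance, 3))
```

A smaller distance means the two topic distributions are more alike, so the candidate with the minimum distance is the one selected.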

Leverages on:

Future Work

Add knowledge sources beyond Yahoo Answers.
Improve query construction when searching for candidate question/answer tuples.
Add more similarity metrics (aggregation, semantic).
Improve NLP processing.
Add multi-document summarization when possible.

xirdneh/liveqa-trec-2016
