Submission for HackJaipur Hackathon 2020.
CoronaXiv is an ElasticSearch-powered AI search engine that indexes the thousands of research papers that have piled up in response to the coronavirus pandemic and presents various visualizations over them.
Many researchers are working remotely across the globe, under lockdown restrictions that vary from place to place: some are working in labs, while others are working from home. To assist them in their effort to defeat the coronavirus pandemic, it would be really handy for any researcher to have a dedicated search engine for COVID-19 papers, along with AI-recommended links to similar papers, saving them time. Every second is precious in this battle against the global pandemic, and hence we have built CoronaXiv, an ElasticSearch-powered AI search engine for research papers related to the coronavirus. In the current scenario, one would typically perform a Google search to look for research papers. However, more often than not, certain keywords yield results unrelated to the pandemic, and the experience is poor because the user has to go back to the results page every time they switch from one paper to another. With CoronaXiv, one can access the papers directly, with different visualizations to help the user understand the relations between papers, identify papers based on keywords, and browse papers clustered by similar domains.
- AI-powered ElasticSearch based Search Engine for Covid-19 papers.
- Filters on the basis of peer review, covid/non-covid status, date, etc. are provided.
- Graph plot of related papers shown in clusters, to help identify papers related to each other.
- Additional metadata information provided to help researchers understand the importance of the suggested papers.
- Works as a PWA.
- Mobile-view support.
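The filters above map naturally onto an ElasticSearch bool query. Below is a minimal sketch of how such a query body could be assembled; the field names (`peer_reviewed`, `is_covid`, `publish_date`, `abstract`) are illustrative assumptions, not the project's actual index schema.

```python
def build_filter_query(keywords, peer_reviewed=None, covid_only=None,
                       date_from=None, date_to=None):
    """Build an ElasticSearch bool-query body: a full-text match on the
    abstract combined with the optional filters described above."""
    filters = []
    if peer_reviewed is not None:
        filters.append({"term": {"peer_reviewed": peer_reviewed}})
    if covid_only is not None:
        filters.append({"term": {"is_covid": covid_only}})
    if date_from or date_to:
        date_range = {}
        if date_from:
            date_range["gte"] = date_from
        if date_to:
            date_range["lte"] = date_to
        filters.append({"range": {"publish_date": date_range}})
    return {
        "query": {
            "bool": {
                "must": [{"match": {"abstract": keywords}}],
                "filter": filters,
            }
        }
    }

# Example: peer-reviewed papers about "cytokine storm" from 2020 onwards
query = build_filter_query("cytokine storm", peer_reviewed=True,
                           date_from="2020-01-01")
```

The resulting dictionary is what would be passed as the request body to ElasticSearch's search API.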
---
- Provide more comprehensive metadata for each paper.
- Improve search time of results fetched.
- Provide more insights through different graph visualizations.
- Host on AWS/Azure for faster model deployment.
- Suggest more research papers based on the one you are currently reading using LDA.
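The last item, suggesting related papers, rests on a document-similarity measure. As a simplified stand-in for LDA topic similarity (a sketch, not the project's planned implementation), here is a cosine-similarity baseline over raw term counts; the paper texts below are made-up examples.

```python
import math
from collections import Counter

def term_vector(text):
    """Bag-of-words term counts for a document."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    common = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in common)
    norm = (math.sqrt(sum(v * v for v in a.values())) *
            math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def recommend(current, corpus, top_k=2):
    """Return the top_k corpus documents most similar to `current`."""
    cur = term_vector(current)
    scored = [(cosine(cur, term_vector(doc)), doc) for doc in corpus]
    return [doc for score, doc in sorted(scored, reverse=True)[:top_k]]
```

An LDA-based version would replace the term vectors with per-document topic distributions but keep the same "rank by similarity, return top k" shape.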
- Frontend: Vue.js (the frontend code can be found here)
- Backend: Python3
- Framework: Flask, PyTorch, ElasticSearch, Kibana
- Machine Learning Model: K-means Clustering, Covid-BERT
- Libraries: Available in requirements.txt.
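The stack lists K-means clustering, which drives the clustered graph plot of related papers (presumably over paper embeddings such as Covid-BERT vectors, though that pairing is an assumption here). A minimal pure-Python sketch of the algorithm on 2-D points:

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: (p[0] - centroids[i][0]) ** 2 +
                                        (p[1] - centroids[i][1]) ** 2)
            clusters[nearest].append(p)
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters
```

In practice the points would be high-dimensional embedding vectors and the resulting cluster assignments would color the nodes of the graph plot.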
- Fork this Repository.
- Make sure you have Java installed on your computer.
- Download ElasticSearch from here
- Download Kibana from here
- Download Logstash from here
- Unzip all three archives downloaded above and add the path of each bin folder to your environment variables.
- Download the dataset from here and make sure it is in the root directory.
- Run `python bulk_insert.py` in the terminal and wait till it finishes.
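A hedged sketch of what a script like `bulk_insert.py` plausibly does (the actual script is not shown here): turn each dataset row into the action/document line pairs expected by ElasticSearch's `_bulk` API. Only the stdlib payload-building part is sketched; the index name `coronaxiv` and the field names are assumptions.

```python
import json

def to_bulk_ndjson(papers, index="coronaxiv"):
    """Serialize papers into the newline-delimited JSON payload
    expected by ElasticSearch's _bulk endpoint: an action line
    followed by a document line for each paper."""
    lines = []
    for paper in papers:
        lines.append(json.dumps({"index": {"_index": index,
                                           "_id": paper["id"]}}))
        lines.append(json.dumps({"title": paper["title"],
                                 "abstract": paper.get("abstract", "")}))
    # _bulk payloads must end with a trailing newline
    return "\n".join(lines) + "\n"
```

In the real script this payload would be POSTed to `localhost:9200/_bulk` (or sent through the `elasticsearch` client's bulk helper).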
- Navigate into the frontend directory.
- Run `npm install` to install dependencies.
- To compile with hot-reload for development, run `npm run serve`.
- To compile and minify for production, run `npm run build`.
- To lint and fix files, run `npm run lint`.
- Copy .env-example as .env
- Update .env Environment Variables
- Create a virtual environment and activate the environment.
- Change into the directory in the terminal.
- Run `pip install -r requirements.txt`.
- Run `flask run`.
- Make sure ElasticSearch is running while the Flask server is up.
- Open your web browser and go to `localhost:5000`.
- Fork this Repository.
- Clone your Fork on a different branch:
git clone -b <name-of-branch> https://github.com/arghyadeep99/CoronaXiv.git
- After adding any feature:
- Go to your fork and create a pull request.
- We will test your modifications and merge changes.
This project was built remotely as part of HackJaipur Hackathon 2020, with no prior preparation, in less than 32 hours, under lockdown.