BayMax is a medical care assistant powered by an LLM. It answers questions about medicines (in this demo, though it can be run over any documents you provide) by fetching facts from the data in real time, which makes it a fast and efficient chatbot.
- The model runs successfully on a laptop with 16 GB of RAM and 6 GB of VRAM
- Runs on bare metal (both laptops and mobile phones)
- Works fully offline; no internet connection is needed
- Grounds every answer in the facts provided in the data, so it does not hallucinate from unrelated knowledge
- Data agnostic: can be used over any documents you want to run it over, since it uses the RAG technique and needs no fine-tuning
- We first create the vector database using `MiniLM` embeddings and `Faiss` indexing
- We then apply `RAG`: the user's query is converted into an embedding, and `Faiss` fetches the top 5 most similar chunks from the database
- These retrieved passages are sent to the model (here, a quantized Mistral 7B) along with the original query to generate the final answer
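The retrieval step above can be sketched as follows. This is a minimal stand-in that uses cosine similarity over NumPy arrays in place of a real `Faiss` index, and random vectors in place of MiniLM embeddings; the function and variable names are illustrative, not taken from the repository:

```python
import numpy as np

def top_k_similar(doc_embeddings: np.ndarray, query_embedding: np.ndarray, k: int = 5):
    """Return indices of the k document embeddings most similar to the query.

    A real deployment would build a faiss.IndexFlatL2 (or IndexFlatIP) over
    MiniLM sentence embeddings; plain NumPy keeps this sketch self-contained.
    """
    # Normalize so that a dot product equals cosine similarity.
    docs = doc_embeddings / np.linalg.norm(doc_embeddings, axis=1, keepdims=True)
    query = query_embedding / np.linalg.norm(query_embedding)
    scores = docs @ query
    # Indices of the k highest-scoring documents, best first.
    return np.argsort(scores)[::-1][:k]

# Toy corpus: 8 "documents" in a 4-dimensional embedding space.
rng = np.random.default_rng(0)
corpus = rng.normal(size=(8, 4))
query = corpus[3] + 0.01 * rng.normal(size=4)  # near-duplicate of document 3

hits = top_k_similar(corpus, query, k=5)
```

In the real pipeline the returned indices are used to look up the original text chunks, which are then passed to the generator.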
- In the web app, if the user chooses the option to use the context from the provided data, we also display the context the model used to generate the answer, so the user can cross-verify its legitimacy
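Assembling the prompt from the retrieved passages, while keeping those passages available for display, can be sketched as below. The prompt template and function name are illustrative assumptions, not taken from the repository:

```python
def build_prompt(query: str, contexts: list[str]) -> str:
    """Assemble the prompt sent to the generator (a quantized Mistral 7B here).

    The same `contexts` list is what the web app shows alongside the answer,
    so the user can cross-verify the passages the answer was grounded in.
    """
    # Hypothetical template: number the retrieved passages, then ask the
    # model to answer strictly from them.
    numbered = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(contexts))
    return (
        "Answer the question using only the facts below.\n"
        f"Facts:\n{numbered}\n"
        f"Question: {query}\nAnswer:"
    )

prompt = build_prompt(
    "What is paracetamol used for?",
    ["Paracetamol relieves mild pain.", "It also reduces fever."],
)
```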
- Adding more models
- Using the links in the MashQA dataset and providing them in the context as well, giving users one more way to cross-verify the legitimacy of the model's answer
- Adding more data to the dataset to make the model more robust
- Install the dependencies using `pip install -r requirements.txt`
- Unzip the dataset from mashqa.zip and place it in the `data` folder.
- Run the data_preprocess.ipynb notebook to clean the data and store it in the `cleaned_data` folder.
- Run the RAG.ipynb notebook to create the vector database and store it in the `vector_db` folder.
- The quantized `.ptl` file for the `PyTorch Mobile` based Android application can be downloaded from here. To create your own `.ptl` files, use Create_PTL.ipynb.
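Creating a `.ptl` file generally follows the standard PyTorch Mobile export recipe sketched below. The model, input shape, and file name are placeholders for illustration; the project's actual export steps live in Create_PTL.ipynb:

```python
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

def export_to_ptl(model: torch.nn.Module, example_input: torch.Tensor, out_path: str) -> str:
    """Trace a model and save it for the PyTorch Mobile lite interpreter."""
    model.eval()
    traced = torch.jit.trace(model, example_input)    # convert to TorchScript by tracing
    optimized = optimize_for_mobile(traced)           # apply mobile-specific graph passes
    optimized._save_for_lite_interpreter(out_path)    # write the .ptl file
    return out_path

# Tiny placeholder model; the real app ships a quantized Mistral 7B.
export_to_ptl(torch.nn.Linear(4, 2), torch.randn(1, 4), "model.ptl")
```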
Built as part of Megathon '23.