Disease-inference-from-QA-pairs

This project predicts the disease a user might have from the questions they ask on health care forums. Only a limited number of doctors are available on these forums while the number of users keeps growing, so the process needs to be automated. The accuracy of such a system remains an open question, however, because users often fail to describe their condition in the right terms. To address this, we designed a subgraph-based approach that detects the medical terms used in a question and replaces them with more accurate, normalized medical terms, both while training the model and whenever a user submits a new question.

To predict answers to users' questions, we developed a sparse neural network that learns from existing question-answer pairs on authentic health care forums such as WebMD and HealthTap. We scraped the data from these websites using automation and the BeautifulSoup web-scraping library.

We extracted noun phrases from the question-answer pairs and checked whether each one is a medical term or an ordinary phrase, using PubMed and LDC as medical corpora. Detected medical terms were then normalized using SNOMED CT, which requires the PyMedTermino library. After the terms were looked up in the SNOMED browser and their corresponding normalized terms extracted, they were used to train a simple neural network to obtain preliminary ("pseudo") results. The actual system requires the subgraphs built from these terms, which reduce the vocabulary gap between users and medical professionals. These subgraphs, called signatures, are generated by finding cliques of medical terms among the n-grams of the medical question-answer pairs.
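
The signature step described above can be illustrated with a minimal sketch. It is not the project's actual code: the `NORMALIZED` lookup table is a hypothetical stand-in for the SNOMED CT normalization done through PyMedTermino, the function names are invented for illustration, and the choice to link terms that co-occur in the same question-answer pair is an assumption about how the graph is built. The sketch extracts n-grams from each pair, keeps those that match a known medical term, and enumerates the maximal cliques of the resulting co-occurrence graph with the Bron-Kerbosch algorithm.

```python
from itertools import combinations

# Hypothetical lay-term -> normalized-term table (assumption for illustration).
# A real pipeline would look these up in SNOMED CT instead.
NORMALIZED = {
    "heart attack": "myocardial infarction",
    "chest pain": "chest pain",
    "high blood pressure": "hypertension",
    "shortness of breath": "dyspnea",
    "headache": "headache",
}


def ngrams(text, max_n=3):
    """Yield every word n-gram of length 1..max_n from the text."""
    for mark in "?.,!":
        text = text.replace(mark, " ")
    tokens = text.lower().split()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def medical_terms(text):
    """Return the normalized medical terms whose lay form appears in the text."""
    return {NORMALIZED[g] for g in ngrams(text) if g in NORMALIZED}


def cooccurrence_graph(qa_pairs):
    """Link two terms when they appear in the same question-answer pair."""
    graph = {}
    for question, answer in qa_pairs:
        terms = medical_terms(question + " " + answer)
        for term in terms:
            graph.setdefault(term, set())
        for a, b in combinations(terms, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def maximal_cliques(graph):
    """Enumerate maximal cliques with the basic Bron-Kerbosch algorithm."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(frozenset(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & graph[v], excluded & graph[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(graph), set())
    return cliques
```

Calling `maximal_cliques(cooccurrence_graph(qa_pairs))` on a list of `(question, answer)` string tuples returns one `frozenset` of normalized terms per candidate signature. The basic Bron-Kerbosch variant is used here only for brevity; on a large forum corpus a pivoting variant or a graph library would likely be a better fit.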