Multi-hop reasoning using KG-RAG #8
Replies: 8 comments 11 replies
-
This sounds cool. Do you envision the stitching of the hops to be done by the LLM or by the RAG?
-
If you do 2-hop, then each hop is directly relevant to the query, but for 4-hop, the intermediary hops are not tied to either end. How will you know what to include?
-
Interesting! In a real query, how would you know to generate B2 -> C1 and C1 -> D1 among the myriad things you could put in the context? It sounds like you'd need to run the 4-hop query on the backend side already? In which case, why not make it simpler for the LLM by pre-processing it?
-
Hi Karthik! For your multi-hop example, are you restricting the predicates and categories in the query? Or are you finding any 5-hop connection between the disease and the compound? Do you have a Cypher query that expresses this 5-hop path? Thanks!
-
So the Cypher is easy: MATCH (c:Compound {name: "phenylalanine"})-[*]-(d:Disease {name: "phenylketonuria"}). The real problem is the computational cost, particularly with something the size of SPOKE. We could limit the depth of the query (i.e. [*..5]) or use a shortest-path algorithm, as was already mentioned. I think there are also some other challenges. In your example above, I think what people are more likely to put in is: "Why does phenylalanine cause phenylketonuria?" or perhaps "How does phenylalanine cause phenylketonuria?". So the named-entity recognition is going to be somewhat more challenging, but certainly tractable. There may need to be a dialog loop to resolve ambiguity: "Are you asking about the compound phenylalanine?", etc.
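To make the depth-limit idea concrete, here is a minimal Python sketch of a breadth-first shortest-path search that gives up beyond a hop limit. It is only an illustration of the idea, not how SPOKE or KG-RAG does it: the function name `shortest_path`, the `max_hops` parameter, and the toy graph with its placeholder node names are all assumptions made up for this example.

```python
from collections import deque


def shortest_path(graph, start, goal, max_hops):
    """Breadth-first search that stops expanding paths at max_hops edges.

    graph is an adjacency dict {node: [neighbors]} (hypothetical toy format).
    Returns the first (shortest) path found as a list of nodes, or None.
    """
    if start == goal:
        return [start]
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        # A path with k nodes uses k - 1 edges; do not grow past the limit.
        if len(path) - 1 >= max_hops:
            continue
        for neighbor in graph.get(path[-1], ()):
            if neighbor in visited:
                continue
            if neighbor == goal:
                return path + [neighbor]
            visited.add(neighbor)
            queue.append(path + [neighbor])
    return None


# Made-up toy graph with placeholder names, not real SPOKE content.
TOY_GRAPH = {
    "CompoundX": ["GeneA", "GeneB"],
    "GeneA": ["CompoundX", "ProteinP"],
    "GeneB": ["CompoundX"],
    "ProteinP": ["GeneA", "DiseaseY"],
    "DiseaseY": ["ProteinP"],
}

if __name__ == "__main__":
    print(shortest_path(TOY_GRAPH, "CompoundX", "DiseaseY", max_hops=5))
    print(shortest_path(TOY_GRAPH, "CompoundX", "DiseaseY", max_hops=2))
```

The hop limit bounds how far the frontier can grow, which is the same trade-off as bounding the variable-length pattern in the Cypher query: a connection that needs more hops than the limit is simply not found.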
-
@webyrd Yes, for the example I showed, the path is constrained to specific predicates. As @scootermorris showed, the Cypher is straightforward here, and I want to echo his point about the computational complexity we may encounter as 'n' increases in n-hop reasoning. So, I think the options can be:
I may have missed other good options here, so please feel free to chime in and add them as well :) @scootermorris I like the idea of a 'dialogue loop'! I think that would help us resolve the ambiguity of duplicate names across different node types.
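One possible shape for a single turn of the 'dialogue loop' discussed above, sketched in Python. This is an assumption-laden illustration rather than anything KG-RAG implements: `resolve_entity`, the `name_index` structure, and the entry "examplin" are hypothetical and exist only to show a name that maps to more than one node type.

```python
def resolve_entity(mention, name_index):
    """Resolve a mention to a node type, or produce a clarifying question.

    name_index maps a lowercased name to the set of node types that use it
    (hypothetical structure). Returns (node_type, None) when the mention is
    unambiguous, otherwise (None, question_for_the_user).
    """
    types = sorted(name_index.get(mention.lower(), ()))
    if not types:
        return None, f"I could not find '{mention}' in the graph. Could you rephrase?"
    if len(types) == 1:
        return types[0], None
    options = " or the ".join(t.lower() for t in types)
    return None, f"Are you asking about the {options} '{mention}'?"


# Illustrative index; "examplin" is an invented name shared by two node types.
TOY_INDEX = {
    "phenylalanine": {"Compound"},
    "examplin": {"Compound", "Gene"},
}

if __name__ == "__main__":
    print(resolve_entity("Phenylalanine", TOY_INDEX))
    print(resolve_entity("examplin", TOY_INDEX))
```

In a real loop the question would go back to the user, and the answer would pin down the node type before any path query is run.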
-
Maybe of interest: https://arxiv.org/abs/2308.14321
-
… On Thu, Dec 7, 2023 at 6:12 PM karthik-soman ***@***.***> wrote:
Thanks @MadhumitaSushil <https://github.com/MadhumitaSushil>, will check
this out.
-
Currently, we can do 2-hop reasoning using KG-RAG. For example, we can handle a question such as 'What are the common genes associated with DiseaseX and DiseaseY?'. This is an A->B<-C scenario.
I was wondering whether we could take this to the next step, i.e. address questions such as 'Give me the paths between DiseaseX and DiseaseY that are within 4 hops'. This is an A->B1->B2->C scenario.
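As a rough illustration of the difference between the two scenarios, here is a small Python sketch on a made-up toy graph. `common_neighbors` covers the A->B<-C case and `paths_within` enumerates simple paths up to a hop limit. Both function names and the toy node names are assumptions for this example only and do not reflect how KG-RAG retrieves context.

```python
def common_neighbors(graph, a, c):
    """A->B<-C case: nodes directly connected to both a and c."""
    return set(graph.get(a, ())) & set(graph.get(c, ()))


def paths_within(graph, start, goal, max_hops):
    """Enumerate simple paths from start to goal using at most max_hops edges."""
    results = []

    def extend(path):
        node = path[-1]
        if node == goal:
            results.append(list(path))
            return
        if len(path) - 1 == max_hops:
            return  # hop budget used up without reaching the goal
        for nxt in graph.get(node, ()):
            if nxt not in path:  # keep paths simple (no repeated nodes)
                path.append(nxt)
                extend(path)
                path.pop()

    extend([start])
    return results


# Made-up toy graph; the node names are placeholders, not real associations.
TOY = {
    "DiseaseX": ["Gene1", "Gene2"],
    "DiseaseY": ["Gene2", "Gene3"],
    "Gene1": ["DiseaseX", "Gene3"],
    "Gene2": ["DiseaseX", "DiseaseY"],
    "Gene3": ["Gene1", "DiseaseY"],
}

if __name__ == "__main__":
    print(common_neighbors(TOY, "DiseaseX", "DiseaseY"))
    for path in paths_within(TOY, "DiseaseX", "DiseaseY", max_hops=4):
        print(" -> ".join(path))
```

The 2-hop case is a single set intersection, while the bounded-path case has to explore every branch up to the limit, which is where the cost grows with the number of hops.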
I think such multi-hop graph traversal using natural language would be awesome!
Any ideas?