Add G-retriever (GNN+LLM) example #9167

Merged
merged 806 commits into from
Sep 13, 2024

@puririshi98 puririshi98 commented Apr 8, 2024

  1. Integrate LLM class #9462
  2. Add nn.models.GRetriever #9480
  3. Add WebQSPDataset #9481
  4. -> Add G-retriever (GNN+LLM) example #9167

repro:
Latest NVIDIA PyG container
+
git config --global credential.helper store; huggingface-cli login; cd /opt/pyg; pip uninstall -y torch-geometric; rm -rf pytorch_geometric; git clone -b gnn-llm-model-integration https://github.com/pyg-team/pytorch_geometric.git; cd /opt/pyg/pytorch_geometric; pip install .; pip install peft datasets transformers pcst_fast sentencepiece; python3 examples/llm_plus_gnn/g_retriever.py
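Before launching the example, it can help to confirm that the packages installed by the repro command are importable. The following is a minimal sketch, not part of the PR: the helper name `missing_packages` is hypothetical, and it assumes each pip package name above is also its import name.

```python
from importlib.util import find_spec

# Assumption: import names match the pip package names used in the repro command.
REQUIRED = [
    "torch_geometric",
    "peft",
    "datasets",
    "transformers",
    "pcst_fast",
    "sentencepiece",
]


def missing_packages(names):
    """Return the top-level module names from `names` that cannot be located."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```

Running this inside the container before `python3 examples/llm_plus_gnn/g_retriever.py` surfaces a missing dependency early, rather than partway through the run.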

old PR: #9154

note: on a single Grace Hopper, pure CPU is 220x slower than pure GPU (for llama-7b)

info:
Tried gemma; it performs worse on all train/val/test metrics. It most likely needs some tuning, which I will leave as future work for the community sprint to try many LLM and GNN combos and tune them. Therefore keeping llama2 as the default.

The new gemma-v2 is also much worse than llama2.

@puririshi98 puririshi98 changed the title G-retriever (GNN+LLM) example w/ demo, w/ GNN+LLM integration G-retriever (GNN+LLM) example w/ demo & GNN+LLM integration Apr 8, 2024
@puririshi98 puririshi98 self-assigned this Apr 8, 2024
@puririshi98

@Kh4L reviews welcome

codecov bot commented Apr 8, 2024

Codecov Report

Attention: Patch coverage is 16.10738%, with 250 lines in your changes missing coverage. Please review.

Project coverage is 87.21%. Comparing base (61c47ee) to head (48be260).
Report is 10 commits behind head on master.

Current head 48be260 differs from pull request most recent head 6e0151f

Please upload reports for the commit 6e0151f to get more accurate results.

| Files | Patch % | Lines |
|---|---|---|
| torch_geometric/nn/text/llm.py | 16.41% | 112 Missing ⚠️ |
| torch_geometric/nn/models/g_retriever.py | 11.66% | 106 Missing ⚠️ |
| ...eometric/nn/text/sentence_transformer_embedding.py | 20.00% | 32 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #9167      +/-   ##
==========================================
- Coverage   87.33%   87.21%   -0.13%     
==========================================
  Files         460      477      +17     
  Lines       30385    31051     +666     
==========================================
+ Hits        26536    27080     +544     
- Misses       3849     3971     +122     


@puririshi98 puririshi98 changed the title G-retriever (GNN+LLM) example w/ demo & GNN+LLM integration Draft: G-retriever (GNN+LLM) example w/ demo & GNN+LLM integration Apr 10, 2024
@puririshi98 puririshi98 changed the title Draft: G-retriever (GNN+LLM) example w/ demo & GNN+LLM integration G-retriever (GNN+LLM) example w/ demo & GNN+LLM integration Apr 10, 2024
@puririshi98 puririshi98 changed the title G-retriever (GNN+LLM) example w/ demo & GNN+LLM integration Draft: G-retriever (GNN+LLM) example w/ demo & GNN+LLM integration Apr 10, 2024
@puririshi98 puririshi98 changed the title Draft: G-retriever (GNN+LLM) example w/ demo & GNN+LLM integration G-retriever (GNN+LLM) example w/ demo & GNN+LLM integration Apr 10, 2024
@puririshi98 puririshi98 changed the title G-retriever (GNN+LLM) example w/ demo & GNN+LLM integration Draft: G-retriever (GNN+LLM) example w/ demo & GNN+LLM integration Apr 10, 2024
@puririshi98 puririshi98 changed the title Draft: G-retriever (GNN+LLM) example w/ demo & GNN+LLM integration G-retriever (GNN+LLM) example w/ demo & GNN+LLM integration Apr 16, 2024
@puririshi98 puririshi98 mentioned this pull request Apr 25, 2024
puririshi98 added a commit that referenced this pull request Apr 27, 2024
cleaning up to address review for #9167

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@puririshi98

@akihironitta @rusty1s addressed reviews, plz lmk if anything else needed to merge.


@akihironitta akihironitta left a comment

Sorry for the extreme delay :( Will have a look during this weekend (long weekend in UK) :)

@puririshi98

@akihironitta anything else needed to merge?

rusty1s added a commit that referenced this pull request May 22, 2024
Splits #9167 into multiple PRs.

---------

Co-authored-by: puririshi98 <puririshi98@berkeley.edu>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Rishi Puri <deepstyle42@gmail.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
codecov-commenter commented May 24, 2024

Codecov Report

Attention: Patch coverage is 0% with 3 lines in your changes missing coverage. Please review.

Please upload report for BASE (web-qsp-integ@87bd353). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| torch_geometric/nn/nlp/llm.py | 0.00% | 3 Missing ⚠️ |

Additional details and impacted files
@@               Coverage Diff                @@
##             web-qsp-integ    #9167   +/-   ##
================================================
  Coverage                 ?   87.31%           
================================================
  Files                    ?      481           
  Lines                    ?    31345           
  Branches                 ?        0           
================================================
  Hits                     ?    27369           
  Misses                   ?     3976           
  Partials                 ?        0           


rusty1s added a commit that referenced this pull request Sep 13, 2024
1. #9462
2. #9480
3. **->** #9481
4. #9167

---

Breaking down PR #9167 further

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: rusty1s <matthias.fey@tu-dortmund.de>
Base automatically changed from web-qsp-integ to master September 13, 2024 03:06
@github-actions github-actions bot removed the dataset label Sep 13, 2024
@rusty1s rusty1s merged commit 12421c2 into master Sep 13, 2024
16 checks passed
@rusty1s rusty1s deleted the gnn-llm-model-integration branch September 13, 2024 04:12