This repository contains the code for the paper "Dependency Matters: Enhancing LLM Reasoning with Explicit Knowledge Grounding" (NeurIPS 2025).
This project implements and evaluates reasoning strategies for large language models (LLMs) on multiple question-answering benchmarks. The codebase supports training and evaluation with the following reasoning methods:
- ReAct: Reasoning and Acting
- CoT: Chain-of-Thought
- Direct: Direct answer generation
- Think: Thinking-based reasoning
- Think-Reason: Combined thinking and reasoning
- Disentangle: Disentangled reasoning approach
- Reasoner: Dedicated reasoner models
Models are evaluated on the following QA benchmarks:
- TruthfulQA: Truthful question answering
- StrategyQA: Strategic reasoning questions
- CommonsenseQA: Commonsense question answering
- GPQA: Graduate-level question answering
Install required dependencies:
```bash
pip install transformers datasets unsloth peft accelerate deepspeed wandb
```

Train a model using one of the provided scripts. For example, to train a ReAct model on TruthfulQA:

```bash
bash train_react_TruthfulQA_template.sh
```

Other training scripts follow the pattern `train_{method}_{dataset}_template.sh`.
Evaluate a trained model using the evaluation scripts:
```bash
bash eval_react_TruthfulQA_template.sh
```

Evaluation scripts follow the pattern `eval_{method}_{dataset}_template.sh`.
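For orientation, the snippet below is a minimal sketch of loading a fine-tuned LoRA checkpoint and querying it with `transformers` and `peft`; the base model name, checkpoint path, and prompt format are placeholders, not the actual `eval_*.py` pipeline.

```python
# Illustrative inference sketch (placeholders only; see eval_*.py for the real pipeline).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "meta-llama/Llama-3.1-8B-Instruct"   # hypothetical base model
adapter_path = "ckpts/react_TruthfulQA"           # hypothetical checkpoint directory

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_path)  # attach the LoRA adapter
model.eval()

prompt = "Question: What happens if you crack your knuckles a lot?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
# Strip the prompt tokens and print only the generated answer
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```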
Project structure:

```
GRiD/
├── dataset/              # Dataset files
├── ckpts/                # Model checkpoints
├── eval_results/         # Evaluation results
├── deepspeed_configs/    # DeepSpeed configurations
├── train_*.py            # Training scripts (Python)
├── train_*.sh            # Training scripts (Shell)
├── eval_*.py             # Evaluation scripts (Python)
├── eval_*.sh             # Evaluation scripts (Shell)
├── utils.py              # Utility functions
└── README.md             # This file
```
Key features:
- Multiple Reasoning Methods: Compare different reasoning strategies
- Multiple Datasets: Evaluate on diverse QA benchmarks
- Efficient Training: Supports LoRA, quantization, and DeepSpeed (see the sketch below the list)
- Flexible Configuration: Easy to modify training and evaluation parameters
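As an illustration of the efficient-training stack, the sketch below wires up 4-bit quantization and LoRA adapters with `transformers` and `peft`; the model name, LoRA rank, and target modules are assumptions for the example, not the exact settings used in the `train_*.py` scripts.

```python
# Illustrative LoRA + 4-bit quantization setup (a sketch, not the repo's actual
# training configuration; model name and hyperparameters are assumptions).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Llama-3.1-8B-Instruct"   # hypothetical base model

# Load the base model in 4-bit NF4 to reduce memory
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach low-rank adapters; only these parameters are updated during training
lora_config = LoraConfig(
    r=16,                          # LoRA rank (assumed value)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # prints the small fraction of trainable weights
```

In the actual training scripts, a DeepSpeed JSON from `deepspeed_configs/` would typically be layered on top of a setup like this through the trainer's `deepspeed` argument or a `deepspeed` launcher, which is presumably what the shell templates configure.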
If you use this code in your research, please cite:
```bibtex
@inproceedings{grid2025,
  title={Dependency Matters: Enhancing LLM Reasoning with Explicit Knowledge Grounding},
  author={Wen, Xiangyu and Li, Min and Huang, Junhua and Xu, Zhijian and Li, Zeju and Huang, Yongxiang and Yuan, Mingxuan and Xu, Qiang},
  booktitle={Advances in Neural Information Processing Systems},
  year={2025}
}
```