A Travel Route Planning Agent Based on Qwen2.5 + LoRA Fine-Tuning + RLHF + RAG

Environment Setup

GPU: RTX 3090 x 2
Platform: AutoDL

NAME="Ubuntu"
VERSION="20.04.4 LTS (Focal Fossa)"
CUDA=12.4
PyTorch=2.5.0

pip install -r requirements.txt

How to Run

python main.py

python rag_naive.py

Experiment Setup

Model

We use Qwen2.5 as the LLM.
- So far, only the 1.5B-parameter version of Qwen2.5 has been tested.
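
For orientation, here is a minimal sketch of how a 1.5B Qwen2.5 checkpoint might be loaded and wrapped with LoRA adapters using Hugging Face `transformers` and `peft`. The checkpoint name `Qwen/Qwen2.5-1.5B-Instruct`, the LoRA hyperparameters, and the helper name `load_lora_model` are all assumptions made for illustration; the project's real settings live in its own config and training code.

```python
# Assumed checkpoint name; this README only states "Qwen2.5, 1.5B".
MODEL_ID = "Qwen/Qwen2.5-1.5B-Instruct"

# Illustrative LoRA hyperparameters, not the project's actual values.
LORA_KWARGS = {
    "r": 8,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "task_type": "CAUSAL_LM",
}


def load_lora_model(model_id=MODEL_ID, lora_kwargs=LORA_KWARGS):
    """Load the tokenizer and base model, then attach LoRA adapters (hypothetical helper)."""
    # Imported lazily so the heavy dependencies are only needed when the model is loaded.
    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
    model = get_peft_model(model, LoraConfig(**lora_kwargs))
    return tokenizer, model
```

Calling `load_lora_model()` downloads the checkpoint on first use; `model.print_trainable_parameters()` then reports how few weights the adapters actually train.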

Project Structure

The core code lives in the src/ directory.
The layout of src/ is:

src:
    data:
     - processed_data
     - data_augmentation.py
     - data_preprocessor.py
     - __init__.py
    training:
     - dpo_trainer.py
     - sft_trainer.py
     - multi_task_trainer.py
     - __init__.py
    models:
     - model.py
     - __init__.py
    ui:
     - app.py
     - mindmap.py
     - __init__.py

data:
     - various datasets
utils.py
configs:
     - config.py
     - __init__.py

Dataset

We use a travel dialogue dataset: CrossWOZ.
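
As a rough sketch of how such dialogues could be turned into training pairs, the snippet below walks one dialogue and pairs each user turn with the system reply that follows it. It assumes the released CrossWOZ JSON layout: a dict mapping dialogue ids to dialogues, each with a `messages` list whose items carry `role` (`usr` or `sys`) and `content`. The helper names `dialogue_to_pairs` and `load_pairs` are hypothetical and are not taken from `data_preprocessor.py`.

```python
import json


def dialogue_to_pairs(dialogue):
    """Turn one dialogue into (user utterance, system reply) pairs."""
    pairs, last_user = [], None
    for message in dialogue.get("messages", []):
        role = message.get("role")
        if role == "usr":
            # Remember the latest user turn until a system reply arrives.
            last_user = message.get("content", "")
        elif role == "sys" and last_user is not None:
            pairs.append((last_user, message.get("content", "")))
            last_user = None
    return pairs


def load_pairs(path):
    """Read a CrossWOZ-style JSON file and collect pairs from every dialogue."""
    with open(path, encoding="utf-8") as f:
        dialogues = json.load(f)  # assumed: dict of dialogue id -> dialogue
    return [pair for d in dialogues.values() for pair in dialogue_to_pairs(d)]
```

Pairs in this shape can be fed to a supervised fine-tuning step as prompt/response examples; a system turn with no preceding user turn is simply skipped.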

Dataset Citation:

@article{zhu2020crosswoz,
    title={CrossWOZ: A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset},
    author={Zhu, Qi and Huang, Kaili and Zhang, Zheng and Zhu, Xiaoyan and Huang, Minlie},
    journal={Transactions of the Association for Computational Linguistics},
    year={2020},
    url={https://arxiv.org/abs/2002.11893}
}

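Travel Agent Results

RAG Results

Interpreting the Results

The query we pass to the RAG pipeline consists of question + context, where the context is built from the 5 samples in the dataset that are closest to the question.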
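
The question + context construction can be sketched as follows. This is a minimal illustration, not the code in `rag_naive.py`: it assumes "closest" means cosine similarity over simple bag-of-words counts (a real pipeline would more likely use embedding vectors), and the function names and the prompt template are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercased alphanumeric words for Latin text, single characters for CJK text.
    return re.findall(r"[a-z0-9]+|[\u4e00-\u9fff]", text.lower())


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, samples, k=5):
    # Return the k samples most similar to the question, best match first.
    q = Counter(tokenize(question))
    ranked = sorted(samples, key=lambda s: cosine(q, Counter(tokenize(s))), reverse=True)
    return ranked[:k]


def build_prompt(question, samples, k=5):
    # Assemble the retrieved samples and the question into one prompt string.
    context = "\n".join(f"- {s}" for s in retrieve(question, samples, k))
    return f"Context:\n{context}\n\nQuestion: {question}"
```

Calling `build_prompt(question, samples)` returns a single string to send to the LLM; swapping `cosine` for an embedding-based score changes the ranking but not the overall flow.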

Citation

We referred to several other projects when building this one:
knowledge-graph-from-GPT
ai-travel-agent
GPT2
RLHF_instructGPT
