Shenzhi Wang (Shenzhi-Wang)

PhD Candidate @ Tsinghua University

86 followers · 23 following

Pinned

Llama3-Chinese-Chat (Public)

The first chat model fine-tuned specifically for Chinese through ORPO, based on the Meta-Llama-3-8B-Instruct model.

321 stars · 18 forks

LeapLabTHU/FamO2O (Public)

Repository of "Train Once, Get a Family: State-Adaptive Balances for Offline-to-Online Reinforcement Learning" (NeurIPS 2023 Spotlight)

Python · 37 stars · 2 forks

recon (Public)

The official source code for "Boosting LLM Agents with Recursive Contemplation for Effective Deception Handling" (ACL 2024, Findings)

Python · 10 stars

hiyouga/EasyR1 (Public)

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python · 1.3k stars · 70 forks