lbertge/longrope-dllm

LongRoPE2 for dLLMs

This repository contains the evolutionary search needed to apply LongRoPE2 to diffusion LLMs (dLLMs) such as LLaDA and DiffuCoder. It is a fork of the LongRoPE2 codebase and follows the LongRoPE2 paper.

To skip the evolutionary search, you can install our longdllm package directly and follow its instructions to get started.

Quick Start

Build Environment

conda create -n longrope python==3.10
conda activate longrope
pip install -r requirements.txt

FlashAttention is required to extend the context up to 128k tokens. We used FlashAttention version 2.5.8:

pip install flash_attn==2.5.8 --no-build-isolation
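Since the version is pinned, it can help to confirm what actually got installed before starting a long run. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of this repository.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    expected = "2.5.8"  # the version this README reports using
    found = installed_version("flash_attn")
    if found is None:
        print("flash_attn is not installed")
    elif found != expected:
        print(f"flash_attn {found} found; this README used {expected}")
    else:
        print(f"flash_attn {found} matches")
```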

Tokenize Data

Tokenize PG19 as the validation dataset for the evolution search and Proof-Pile as the evaluation dataset.

bash ./scripts/llada_float16/tokenzie-data.sh

This step does not work yet because our version of the PG19 dataset uses a special format.

To generate the needle-driven data, run:

python pg19_needle_llada.py

Evolution Search

Run the evolution search on the LLaDA model for a sequence length of 128k:

bash ./scripts/llada_float16/search-llada-long-factor-cd-33-around-init.sh

The default evolution search hyperparameters are located in evolution/default_hyper_params/*.json. You can customize the number of iterations, the population size, the number of parents, and the number of mutation and crossover operations in each iteration. These settings affect both the convergence time and the robustness of the search results.
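To show how these hyperparameters interact, here is a generic evolutionary search loop over a list of rescale factors. It is a sketch under assumptions, not this repository's implementation: the function and argument names (`evolution_search`, `population_size`, `num_parents`, `num_mutations`, `num_crossovers`) are hypothetical, the mutation rule is arbitrary, and a toy squared-error loss stands in for the perplexity measured on the validation set.

```python
import random


def evolution_search(evaluate, init, iterations, population_size,
                     num_parents, num_mutations, num_crossovers, seed=0):
    """Generic evolutionary search that minimizes `evaluate` over lists of floats."""
    if num_parents < 2:
        raise ValueError("crossover needs at least two parents")
    rng = random.Random(seed)

    def mutate(ind):
        # Perturb each factor by up to +/-10% with probability 0.3.
        return [x * rng.uniform(0.9, 1.1) if rng.random() < 0.3 else x for x in ind]

    def crossover(a, b):
        # Take each factor from one of the two parents at random.
        return [rng.choice(pair) for pair in zip(a, b)]

    def by_score(item):
        return item[0]

    # Seed the population with the initial candidate plus mutated copies.
    population = [list(init)] + [mutate(init) for _ in range(population_size - 1)]
    scored = sorted(((evaluate(ind), ind) for ind in population), key=by_score)

    for _ in range(iterations):
        parents = [ind for _, ind in scored[:num_parents]]
        children = [mutate(rng.choice(parents)) for _ in range(num_mutations)]
        children += [crossover(*rng.sample(parents, 2)) for _ in range(num_crossovers)]
        scored += [(evaluate(child), child) for child in children]
        # Keep only the best `population_size` candidates for the next round.
        scored = sorted(scored, key=by_score)[:population_size]

    return scored[0]  # (best score, best candidate)


# Toy objective standing in for validation perplexity (an assumption).
TARGET = [1.0, 2.0, 4.0, 8.0]


def toy_loss(factors):
    return sum((f - t) ** 2 for f, t in zip(factors, TARGET))


best_loss, best_factors = evolution_search(
    toy_loss, init=[1.0] * 4, iterations=50, population_size=16,
    num_parents=4, num_mutations=8, num_crossovers=4,
)
```

In this sketch, each iteration costs `num_mutations + num_crossovers` evaluations, so more iterations or larger counts lengthen the run. A larger population and more parents keep more diverse candidates alive, which tends to make the result less sensitive to the random seed.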

Evaluation

Evaluate long-context perplexity and passkey accuracy:

bash ./scripts/llada_float16/evaluate.sh
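For reference, the two metrics can be sketched as follows. This is a minimal illustration, not the logic in `evaluate.sh`: the function names are hypothetical, and it assumes per-token negative log-likelihoods (in nats) and decoded model outputs are already available. How those likelihoods are obtained from a diffusion model is not covered here.

```python
import math


def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token), NLL in nats."""
    if not token_nlls:
        raise ValueError("need at least one token")
    return math.exp(sum(token_nlls) / len(token_nlls))


def passkey_accuracy(predictions, passkeys):
    """Fraction of trials whose model output contains the expected passkey."""
    if not passkeys or len(predictions) != len(passkeys):
        raise ValueError("need one prediction per passkey, and at least one trial")
    hits = sum(key in pred for pred, key in zip(predictions, passkeys))
    return hits / len(passkeys)
```

Lower perplexity on long documents and higher passkey accuracy at long context lengths both indicate that the extended context window is being used effectively.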

Citation

If you find LongdLLM helpful, please consider citing it:

@misc{ge2025longcontext,
  title = {Towards 131k-Context dLLMs},
  url = {https://albertge.notion.site/longdllm},
  author = {Ge, Albert and Singh, Chandan and Zhang, Dinghuai and Peng, Letian and Zhuang, Yufan and Shang, Ning and Zhang, Li Lyna and Liu, Liyuan and Gao, Jianfeng},
  journal = {Albert Ge's Notion},
  year = {2025},
  month = sep,
}
