Built upon Megatron-DeepSpeed and the HuggingFace Trainer, EasyLLM reorganizes the code with a focus on usability while preserving training efficiency.
- Install the Python requirements: `pip install -r requirements.txt`
- Other dependencies:
  - flash-attn (for `dropout_layer_norm`); you may need to compile it yourself (see the build sketch after this list)
- Pull DeepSpeed and add it to your `PYTHONPATH` (see the example after this list): `export PYTHONPATH=/path/to/DeepSpeed:$PYTHONPATH`
- Install the package in development mode: `pip install -e . -v`
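For the flash-attn dependency, the following is a minimal build sketch, assuming the upstream Dao-AILab/flash-attention repository, where the `dropout_layer_norm` extension lives under `csrc/layer_norm`; the exact layout may differ for the revision EasyLLM expects:

```bash
# Install the core flash-attn package (prebuilt wheels may be available).
pip install flash-attn --no-build-isolation

# Compile the dropout_layer_norm extension from source.
git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention/csrc/layer_norm
pip install .
```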
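For the DeepSpeed step, a sketch assuming the upstream microsoft/DeepSpeed repository; substitute the fork or pinned revision EasyLLM expects, if any:

```bash
# Clone DeepSpeed and expose it on PYTHONPATH instead of pip-installing it,
# so local patches to the source take effect immediately.
git clone https://github.com/microsoft/DeepSpeed.git /path/to/DeepSpeed
export PYTHONPATH=/path/to/DeepSpeed:$PYTHONPATH
```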
Supported models:
- Qwen (14B)
- InternLM (7B/20B)
- Baichuan 1/2 (7B/13B)
- LLaMA 1/2 (7B/13B/70B)
To optimize training time and memory usage, EasyLLM supports Dynamic Checkpoint: based on the input token size, it enables activation checkpointing for only a subset of layers. The feature is controlled through the training configuration file.
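A minimal sketch of such a configuration, assuming a YAML layout with a hypothetical `dynamic_checkpoint` block whose `size_map` maps an input token-size threshold to the number of layers to checkpoint (all field names are illustrative, not the repository's actual schema):

```yaml
# Hypothetical sketch: field names are illustrative, not EasyLLM's actual schema.
dynamic_checkpoint:
  enabled: true
  # Map an input token-size threshold to the number of transformer layers
  # to run with activation checkpointing once the input reaches that size.
  size_map:
    1024: 0     # short inputs fit in memory; no checkpointing
    2048: 10
    4096: 25
    8192: 40    # long inputs: checkpoint most layers to save memory
```

The intended trade-off: larger inputs activate checkpointing on more layers, saving memory at the cost of recomputation, while short inputs skip it entirely to keep throughput high.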
This repository is released under the Apache-2.0 license.
We learned a lot from the following projects when developing EasyLLM.