This repository contains the evaluation code and test sets for Patronus' GLIDER model.
The training is done in two phases: SFT and alignment using RLAIF data. To train your model, first install the conda environment in the train_environment.yaml
file:
conda env create -f train_environment.yaml
If you wish to install flash attention (which is enabled by default), you must additionally activate enviornment and run the following
pip install flash-attn
Once the environment is set up properly, you need to setup HF accelerate configs according to your GPU configuration. To do this, you can run:
accelerate config
The GLIDER model was trained with FSDP which can be enabled using the config setup above.
Finally, to run the training:
cd train
python train_sft.py --model_output_path="[your_output_dir_here]" --dataset_path="[your dataset path here]"
python train_dpo.py --model_dir="[your_sft_saved_model_path]" --model_output_path="[your_output_dir_here]" --dataset_path="[your preference dataset path here]"
Once the training is done, you can evaluate the model using the script below. Ensure that you have vllm
and scikit-learn
installed:
cd eval
python vllm_test_local.py --model="[your_model_path]" --output_dir="[dir_to_save_your_outputs]"
python extract_accuracies.py --eval_dir="[your_outputs_path_above]" --output_file="[file_to_write_results_to.jsonl]"
Note that vLLM has an issue with the longrope implementation of phi models which is being actively fixed at the time of release of this repo. It is recommended to use this version of vLLM till the fix PR is merged into vLLM's main branch.
This code is licensed under CC-BY-NC-4.0. More information is available in the LICENSE
file.