sajabdoli/fine_tune_LLMs
fine tune LLMs

This repo contains a notebook for fine-tuning an LLM with DPO (Direct Preference Optimization). To reduce the number of trainable parameters, we also apply LoRA (Low-Rank Adaptation of Large Language Models). The model is trained on the Orca-Direct-Preference-Optimization dataset, which consists of user preferences between pairs of answers produced by an LLM for a given prompt.
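To make the training objective concrete, here is a minimal sketch of the per-pair DPO loss (not the repo's code). It assumes you already have sequence log-probabilities of the chosen and rejected answers under both the trainable policy and a frozen reference model; `beta` is the usual DPO temperature hyperparameter.

```python
import math

def dpo_loss(policy_chosen, policy_rejected,
             ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for a single preference pair.

    Arguments are sequence log-probabilities: log p(answer | prompt)
    under the trainable policy and under the frozen reference model.
    The loss is -log sigmoid(beta * (chosen margin - rejected margin)),
    which pushes the policy to prefer the chosen answer more strongly
    than the reference model does.
    """
    logits = beta * ((policy_chosen - ref_chosen)
                     - (policy_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-logits)))  # -log sigmoid(logits)

# When the policy matches the reference exactly, both margins are zero,
# so the loss is -log(0.5) = log(2) regardless of beta.
print(dpo_loss(-10.0, -12.0, -10.0, -12.0))
```

In practice this is handled by a library trainer (e.g. TRL's `DPOTrainer`) operating on batches, but the scalar form above is the quantity being minimized.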

For a microsoft/phi-2 base model, the LoRA setup yields: trainable params: 4,792,320 || all params: 2,784,476,160 || trainable%: 0.1721
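The trainable percentage above is just the ratio of LoRA adapter parameters to the full parameter count, as reported by PEFT's `print_trainable_parameters()`; a quick check using the numbers from this repo:

```python
# Parameter counts reported for microsoft/phi-2 with LoRA applied.
trainable_params = 4_792_320
all_params = 2_784_476_160

# trainable% = 100 * trainable / total — under 0.2% of the model is updated.
trainable_pct = 100 * trainable_params / all_params
print(f"trainable%: {trainable_pct:.4f}")  # → trainable%: 0.1721
```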
