Some question on code of LISA #897

Open
milong26 opened this issue Sep 19, 2024 · 0 comments
Comments

@milong26

I understand that LISA's core code is in src\lmflow\pipeline\finetuner.py, mainly in the class DynamicLayerActivationCallback. I read it side by side with Algorithm 1 (Layerwise Importance Sampling AdamW, LISA) in the paper.

So where is step 2, "Freeze all layers except the embedding and language modeling head layer"? I can only find def freeze_all_layers(self) in class DynamicLayerActivationCallback, and it does not exclude the embedding and head layers. A sketch of what I expected step 2 to look like is below.
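
For reference, here is a minimal sketch of what I expected step 2 to do. This is my own illustration, not code from LMFlow; it assumes a standard HuggingFace `transformers` model with the usual `get_input_embeddings()` / `get_output_embeddings()` accessors:

```python
def freeze_all_but_embedding_and_head(model):
    """Freeze every parameter except the input embeddings and the LM head.

    Illustrative sketch of step 2 in Algorithm 1, assuming a HuggingFace
    PreTrainedModel. Not the actual LMFlow implementation.
    """
    # Step 2a: freeze everything first.
    for param in model.parameters():
        param.requires_grad = False

    # Step 2b: unfreeze the input embedding layer.
    for param in model.get_input_embeddings().parameters():
        param.requires_grad = True

    # Step 2c: unfreeze the language modeling head, if the model has one
    # (get_output_embeddings() can return None for some architectures).
    head = model.get_output_embeddings()
    if head is not None:
        for param in head.parameters():
            param.requires_grad = True
```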

I am also curious about the notation $k$ in Algorithm 1 of the paper. Step 4 says: run AdamW for $K$ iterations with $\{\eta_t\}_{t=ik}^{ik+k-1}$. Is this $k$ the same as $K$?
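
If they are the same, I would read the learning-rate schedule for outer step $i$ (substituting $k = K$) as

$$\{\eta_t\}_{t=iK}^{iK+K-1} = \{\eta_{iK},\, \eta_{iK+1},\, \ldots,\, \eta_{iK+K-1}\},$$

i.e. exactly $K$ learning rates, one per inner AdamW iteration. Please correct me if this reading is wrong.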

My English is not great, so please tell me if anything above is unclear. Thanks for answering!
