clean up and add ckpt tests #179

samsja · 2024-12-19T05:40:54Z

What this PR does:

clean up model/opt hash logging
add ckpt test using the hash logging
remove useless memory arguments
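To illustrate the idea of a checkpoint test built on hash logging, here is a minimal sketch. It is not the code from this PR: the helper names (`state_hash`, `save_ckpt`, `load_ckpt`, `roundtrip_hashes`) are hypothetical, and plain JSON-serializable dicts stand in for the real model and optimizer state. The point is only that hashing the state before saving and after reloading gives a cheap equality check.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def state_hash(state: dict) -> str:
    """Return a deterministic SHA-256 hex digest of a JSON-serializable state dict."""
    # sort_keys makes the digest independent of dict insertion order.
    payload = json.dumps(state, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def save_ckpt(path: Path, model_state: dict, opt_state: dict) -> None:
    """Write model and optimizer state to a single JSON file (stand-in checkpoint format)."""
    path.write_text(json.dumps({"model": model_state, "opt": opt_state}))


def load_ckpt(path: Path) -> tuple[dict, dict]:
    """Read back the model and optimizer state written by save_ckpt."""
    ckpt = json.loads(path.read_text())
    return ckpt["model"], ckpt["opt"]


def roundtrip_hashes(model_state: dict, opt_state: dict):
    """Save, reload, and return (hashes before saving, hashes after loading)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "ckpt.json"
        save_ckpt(path, model_state, opt_state)
        loaded_model, loaded_opt = load_ckpt(path)
    before = (state_hash(model_state), state_hash(opt_state))
    after = (state_hash(loaded_model), state_hash(loaded_opt))
    return before, after
```

A test would then assert that the two hash pairs match, and that a deliberately perturbed state produces a different hash.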

Jackmin801

Nice! Why are we killing GPU memory monitor though?

Jackmin801 · 2024-12-19T19:02:28Z

src/zeroband/train.py

    memory_profiler: MemoryProfilerConfig | None = None

    sequence_packing: bool = True
    attn_fn: Literal["flash", "sdpa"] | None = None

+    math_attn: bool = False  # slow


What do you think about putting this as an option in attn_fn instead?

    attn_fn: Literal["flash", "sdpa"] | None = None

    @model_validator(mode="after")
    def validate_attn_fn(self):
        if self.attn_fn is not None:
            warnings.warn("attn_fn argument is deprecated")
        return self

hmm attn_fn is not used anymore. I just kept it to avoid conflict with old code.

PR to remove attn_fn: #180

Jackmin801 · 2024-12-19T20:06:12Z

src/zeroband/train.py

@@ -164,6 +200,7 @@ def train(config: Config):
        config.type_model,
        vocab_size=len(tokenizer) if config.name_model != "debugmodel" or not config.data.fake else TEST_VOCAB_SIZE,
        seq_length=config.data.seq_length,
+        math_attn=config.train.math_attn,


What do you think about passing attn_fn instead? Would also allow sdpa to be specified
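The suggestion of a single `attn_fn` option could look roughly like the sketch below. This is an illustration of the reviewer's idea, not code from the PR: the `"math"` value, the `"flash"` default, the `TrainConfig` dataclass (used here instead of the project's actual config class), and the `model_kwargs` helper are all assumptions.

```python
from dataclasses import dataclass
from typing import Literal, get_args

# Hypothetical: "math" is added as a third choice next to the existing "flash" and "sdpa".
AttnFn = Literal["flash", "sdpa", "math"]


@dataclass
class TrainConfig:
    # Assumed default; the real default is not stated in this thread.
    attn_fn: AttnFn = "flash"

    def __post_init__(self):
        # Dataclasses do not enforce Literal types, so validate explicitly.
        if self.attn_fn not in get_args(AttnFn):
            raise ValueError(f"attn_fn must be one of {get_args(AttnFn)}, got {self.attn_fn!r}")


def model_kwargs(config: TrainConfig) -> dict:
    """Build the keyword arguments that would be forwarded to the model constructor."""
    return {"attn_fn": config.attn_fn}
```

With this shape, one string selects the attention backend, so no separate boolean such as `math_attn` has to be threaded through to the model.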

samsja · 2024-12-20T07:40:19Z

> Nice! Why are we killing GPU memory monitor though?

the profiler is enough I think

I would kill it unless you have a use case where you need it. I personally never used it even though I added it haha

Jackmin801 · 2024-12-21T00:49:37Z

yea never used it either haha

samsja force-pushed the refactor-test-and-hash branch 3 times, most recently from 0fa14d0 to ca3b1c8 Compare December 19, 2024 06:20

use shared function for log modelhash

1a3b439

samsja force-pushed the refactor-test-and-hash branch 5 times, most recently from 65e7b74 to 6755478 Compare December 19, 2024 09:42

samsja added 2 commits December 19, 2024 09:51

add ckpt tests

5a39b9a

remove useless memory stuff

dbddd9f

samsja force-pushed the refactor-test-and-hash branch from 6755478 to dbddd9f Compare December 19, 2024 09:51

samsja changed the title ~~use shared function for log model hash~~ clean up and add ckpt tets Dec 19, 2024

samsja changed the title ~~clean up and add ckpt tets~~ clean up and add ckpt tests Dec 19, 2024

samsja requested review from Jackmin801 and JohannesHa December 19, 2024 09:57

Jackmin801 requested changes Dec 19, 2024


samsja merged commit 4715633 into main Dec 23, 2024
2 checks passed

samsja deleted the refactor-test-and-hash branch December 23, 2024 04:54
