A minimalist, hackable PyTorch implementation of DiLoCo: Distributed Low-Communication Training of Language Models.
modal run -m scripts.train_modal::small_single_node