ππ§ (x) | x β {ππ§ ,π€}
(tricking rocks into thinking)
Pinned
- attention-saver (Public)
  Attention Saver lets you extract entire attention matrices or row-wise statistics (e.g. entropy) from any HuggingFace causal LLM layer for ultra-long context when using flash-attention without runn…
  Python
- BatchEntropyRegularization (Public)
  A TensorFlow implementation of Batch Entropy Regularization (Peer et al., 2022).
  Python