[NeurIPS 2025] Official code for "Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms"
Updated Oct 23, 2025 - Python
This work provides extensive empirical results on training LMs to count. We find that while traditional RNNs achieve inductive counting trivially, Transformers must rely on positional embeddings to count out-of-domain. Modern RNNs (e.g., RWKV, Mamba) also largely underperform traditional RNNs at generalizing counting inductively.
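The counting probe this description refers to is easy to reproduce in miniature. Below is a minimal PyTorch sketch, not the repository's code: the model, data generator, and length choices (`CounterRNN`, `make_batch`, `TRAIN_LEN`, `TEST_LEN`) are illustrative assumptions. It trains a small LSTM to count a target token on short sequences, then measures exact-count accuracy on longer, out-of-domain sequences.

```python
# Minimal sketch of a length-generalization counting probe (illustrative
# assumptions, not the repo's code): train in-domain on short sequences,
# then test exact-count accuracy at a longer, unseen length.
import torch
import torch.nn as nn

VOCAB, TARGET = 10, 3          # token ids 0..9; count occurrences of token 3
TRAIN_LEN, TEST_LEN = 16, 64   # in-domain vs. out-of-domain sequence lengths

def make_batch(batch_size, length):
    """Random token sequences and the count of TARGET in each."""
    x = torch.randint(0, VOCAB, (batch_size, length))
    y = (x == TARGET).sum(dim=1).float()
    return x, y

class CounterRNN(nn.Module):
    """Tiny LSTM regressor: reads the sequence, predicts the count."""
    def __init__(self, hidden=32):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, hidden)
        self.rnn = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        h, _ = self.rnn(self.emb(x))
        return self.head(h[:, -1]).squeeze(-1)  # read out the last state

model = CounterRNN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(2000):                       # train on in-domain lengths only
    x, y = make_batch(64, TRAIN_LEN)
    loss = nn.functional.mse_loss(model(x), y)
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():                          # out-of-domain evaluation
    x, y = make_batch(512, TEST_LEN)
    acc = (model(x).round() == y).float().mean().item()
    print(f"OOD exact-count accuracy at length {TEST_LEN}: {acc:.2%}")
```

Swapping `CounterRNN` for a positional-embedding Transformer in this harness is the kind of comparison the finding above describes: the recurrent model's state accumulates the count length-independently, while fixed positional embeddings are tied to the training lengths.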
Mamba Modulation: On the Length Generalization of Mamba Models