Neural Networks and the Chomsky Hierarchy

This repository provides an implementation of our ICLR 2023 paper Neural Networks and the Chomsky Hierarchy.

Reliable generalization lies at the heart of safe ML and AI. However, understanding when and how neural networks generalize remains one of the most important unsolved problems in the field. In this work, we conduct an extensive empirical study (2200 models, 16 tasks) to investigate whether insights from the theory of computation can predict the limits of neural network generalization in practice. We demonstrate that grouping tasks according to the Chomsky hierarchy allows us to forecast whether certain architectures will be able to generalize to out-of-distribution inputs. This includes negative results where even extensive amounts of data and training time never led to any non-trivial generalization, despite models having sufficient capacity to perfectly fit the training data. Our results show that, for our subset of tasks, RNNs and Transformers fail to generalize on non-regular tasks, LSTMs can solve regular and counter-language tasks, and only networks augmented with structured memory (such as a stack or memory tape) can successfully generalize on context-free and context-sensitive tasks.

It is based on PyTorch and NumPy and contains all code, datasets, and models necessary to reproduce the paper's results.

This repository is a PyTorch port maintained by Miroslav Lzicar. If you build on this port, please add the citation from the section below alongside the original paper.

Quick start

Install in editable mode using pip install . -e
Run example like python -m nnch.experiments.example

More examples:

Run Tape RNN on Modular Arithmetics with custom params: python -m nnch.experiments.example --task=modular_arithmetic --architecture=tape_rnn --batch_size=64 --sequence_length=20

Content

.
├── models
|   ├── ndstack_rnn.py        - Nondeterministic Stack-RNN (DuSell & Chiang, 2021)
|   ├── rnn.py                - RNN (Elman, 1990)
|   ├── stack_rnn.py          - Stack-RNN (Joulin & Mikolov, 2015)
|   ├── tape_rnn.py           - Tape-RNN, loosely based on Baby-NTM (Suzgun et al., 2019) 
|   └── transformer.py        - Transformer (Vaswani et al., 2017)
├── tasks
|   ├── cs                    - Context-sensitive tasks
|   ├── dcf                   - Determinisitc context-free tasks
|   ├── regular               - Regular tasks
|   └── task.py               - Abstract GeneralizationTask
├── experiments
|   ├── constants.py          - Training/Evaluation constants
|   ├── curriculum.py         - Training curricula (over sequence lengths)
|   ├── example.py            - Example training script (RNN on the Even Pairs task)
|   ├── range_evaluation.py   - Evaluation loop (over unseen sequence lengths)
|   ├── training.py           - Training loop
|   └── utils.py              - Utility functions
├── README.md
└── requirements.txt          - Dependencies

tasks contains all tasks, organized in their Chomsky hierarchy levels (regular, dcf, cs). They all inherit the abstract class GeneralizationTask, defined in tasks/task.py.

models contains all the models we use, written in PyTorch, a popular open source deep learning framework.

training contains the code for training models and evaluating them on a wide range of lengths. We also included an example to train and evaluate an RNN on the Even Pairs task. We use PyTorch's built-in optimizers for training.

Citing This Work

If you build on this PyTorch port, please cite both the port and the original paper.

Torch Port

@misc{Lzicar2025nnch,
  author = {Miroslav Lzicar},
  title = {Neural Networks and the Chomsky Hierarchy Torch Port},
  month = {October},
  day = {25},
  year = {2025}
}

Original Paper

@inproceedings{deletang2023neural,
  author       = {Gr{\'{e}}goire Del{\'{e}}tang and
                  Anian Ruoss and
                  Jordi Grau{-}Moya and
                  Tim Genewein and
                  Li Kevin Wenliang and
                  Elliot Catt and
                  Chris Cundy and
                  Marcus Hutter and
                  Shane Legg and
                  Joel Veness and
                  Pedro A. Ortega},
  title        = {Neural Networks and the Chomsky Hierarchy},
  booktitle    = {11th International Conference on Learning Representations},
  year         = {2023},
}

License and Disclaimer

All software is licensed under the Apache License, Version 2.0 (Apache 2.0); you may not use this file except in compliance with the Apache 2.0 license. You may obtain a copy of the Apache 2.0 license at: https://www.apache.org/licenses/LICENSE-2.0

All other materials are licensed under the Creative Commons Attribution 4.0 International License (CC-BY). You may obtain a copy of the CC-BY license at: https://creativecommons.org/licenses/by/4.0/legalcode

Unless required by applicable law or agreed to in writing, all software and materials distributed here under the Apache 2.0 or CC-BY licenses are distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the licenses for the specific language governing permissions and limitations under those licenses.

This is not an official Google product.

Name		Name	Last commit message	Last commit date
Latest commit History 43 Commits
nnch		nnch
paper		paper
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
chomsky.svg		chomsky.svg
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
setup.cfg		setup.cfg

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Repository files navigation

Neural Networks and the Chomsky Hierarchy

Quick start

Content

Citing This Work

Torch Port

Original Paper

License and Disclaimer

About

Uh oh!

Releases

Sponsor this project

Uh oh!

Packages

Contributors 5

Uh oh!

Languages

Uh oh!

License

mireklzicar/neural_networks_chomsky_hierarchy_torch

Folders and files

Latest commit

History

Repository files navigation

Neural Networks and the Chomsky Hierarchy

Quick start

Content

Citing This Work

Torch Port

Original Paper

License and Disclaimer

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Sponsor this project

Uh oh!

Packages 0

Contributors 5

Uh oh!

Languages

Packages