Probing LLMs for Joint Encoding of Linguistic Categories

Paper

Official repository for the paper: "Probing LLMs for Joint Encoding of Linguistic Categories." Findings of EMNLP 2023.

https://arxiv.org/abs/2310.18696

Requirements and Setup

The exact Python and package versions are recorded in the generated pyproject.toml and poetry.lock files.
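If you want to check the declared Python constraint programmatically, the following is a minimal sketch. It assumes pyproject.toml follows the usual Poetry layout with a `[tool.poetry.dependencies]` table; the embedded sample text, its project name, and its version values are placeholders, not the contents of this repository's file.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport for older interpreters

# Hypothetical excerpt: names and versions below are placeholders only.
SAMPLE_PYPROJECT = """
[tool.poetry]
name = "example-project"

[tool.poetry.dependencies]
python = "^3.9"
example-package = "^1.0"
"""


def poetry_dependencies(pyproject_text: str) -> dict:
    """Return the [tool.poetry.dependencies] table, or {} if it is absent."""
    data = tomllib.loads(pyproject_text)
    return data.get("tool", {}).get("poetry", {}).get("dependencies", {})


deps = poetry_dependencies(SAMPLE_PYPROJECT)
print("Python constraint:", deps.get("python"))
```

To inspect the real file, pass the text of pyproject.toml (for example via `pathlib.Path("pyproject.toml").read_text()`) instead of the sample string.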

We recommend using an environment manager such as conda. After setting up your environment with the correct Python version, install the required packages from the provided requirements.txt file:

pip install -r requirements.txt
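After installing, you may want to confirm that the environment matches the pinned versions. The sketch below is one way to do that with the standard library; it assumes requirements.txt contains simple `name==version` pins (optionally followed by environment markers or hash continuations) and does not handle extras, URLs, or version ranges. The helper names `parse_pins` and `mismatches` and the sample text are illustrative, not part of this repository.

```python
from importlib import metadata

# Hypothetical requirements text: package names and versions are placeholders.
SAMPLE_REQUIREMENTS = """
# generated file
example-package==1.2.3 ; python_version >= "3.8"
Other_Package==0.4.0 \\
    --hash=sha256:0000
"""


def parse_pins(requirements_text: str) -> dict:
    """Map lower-cased package name -> pinned version for 'name==version' lines."""
    pins = {}
    for raw in requirements_text.splitlines():
        # Drop comments and environment markers, then any line continuation.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        line = line.rstrip("\\").strip()
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def mismatches(pins: dict) -> dict:
    """Return {name: (pinned, installed or None)} for packages that differ."""
    differing = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            differing[name] = (pinned, installed)
    return differing


print(parse_pins(SAMPLE_REQUIREMENTS))
```

Calling `mismatches(parse_pins(open("requirements.txt").read()))` in the activated environment should return an empty dict when every pinned package is installed at the pinned version.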

The requirements.txt file is generated by running:

sh gen_pip_reqs.sh

Repository contents

.
├── data/                            # Where data is kept
├── experiments/                     # arrays of images
├── images/                          # more individual images
├── lisa/                            # SLURM jobs and configs
├── infoshare/
│   ├── datamodules/                 # handle data loading, processing
│   ├── models/                      # Model implementations
│   ├── run
│   │   ├── test.py                  # run testing
│   │   ├── test_xlingual.py         # run testing across languages
│   │   └── train.py                 # run training
│   ├── __init__.py
│   └── utils.py                     # general utils
├── notebooks/                       # see notebooks/README.md
├── reports/                         # LaTeX and more
├── README.md                        # you are here
├── lswsd_lemmas.txt                 # lemmas used for LSWSD
├── poetry.lock                      # dependencies metadata
├── pyproject.toml                   # project metadata
├── gen_pip_reqs.sh                  # script for generating requirements.txt
└── requirements.txt                 # required packages for pip

The above was generated with

tree . -L 3 --dirsfirst -I "*.eps|*.png|*.pdf|lightning_logs|*pycache*|backup"

followed by some manual edits.

Citation

If you use this code or find our work otherwise useful, please consider citing our paper:

@inproceedings{starace2023probing,
  title={Probing LLMs for Joint Encoding of Linguistic Categories},
  author={Starace, Giulio and Papakostas, Konstantinos and Choenni, Rochelle and Panagiotopoulos, Apostolos and Rosati, Matteo and Leidinger, Alina and Shutova, Ekaterina},
  booktitle={Findings of the Association for Computational Linguistics: EMNLP 2023},
  pages={7158--7179},
  year={2023}
}