Persistent Topological Features in Large Language Models

arXiv paper: https://arxiv.org/abs/2410.11042

With this repo, it is possible to reproduce the data of the paper "Persistent Topological Features in Large Language Models".

representation: folder for extracting the hidden representations.
zigzag: folder for running the ZigZag algorithm on the representations.
benchmark: folder for running the benchmarks with the different pruning methods.
- Benchmarks are run with lm-evaluation-harness.
- The blocks to cut are extracted with the Angular Distance and Bi-Score methods, using a modified version of short-transformers (so that it also runs with Pythia 6.9B).
plots: folder for reproducing the plots of the paper.
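
To give an idea of what the Angular Distance criterion mentioned above measures, here is a minimal sketch. It is not the code used in this repo, which relies on the modified short-transformers. It assumes the common definition arccos(cosine similarity) / pi and, as a simplification, one representation vector per layer. The names angular_distance and best_block_to_cut are hypothetical.

```python
import math


def angular_distance(x, y):
    """Angular distance in [0, 1], assumed to be arccos(cosine similarity) / pi.

    Zero vectors are not handled in this sketch.
    """
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    cos_sim = dot / (norm_x * norm_y)
    # Clamp to guard against floating-point values slightly outside [-1, 1].
    cos_sim = max(-1.0, min(1.0, cos_sim))
    return math.acos(cos_sim) / math.pi


def best_block_to_cut(layer_reps, n):
    """Return (start, distance) for the block of n layers whose input and
    output representations are closest in angular distance."""
    distances = [
        angular_distance(layer_reps[start], layer_reps[start + n])
        for start in range(len(layer_reps) - n)
    ]
    start = min(range(len(distances)), key=distances.__getitem__)
    return start, distances[start]


# Toy data (made up for illustration): one 2-D vector per layer.
toy_reps = [[1.0, 0.0], [0.0, 1.0], [0.1, 1.0], [-1.0, 0.0]]
start, dist = best_block_to_cut(toy_reps, n=1)
print(start, dist)
```

In practice the distance would be computed on hidden representations extracted from the model and averaged over tokens and samples. The toy vectors here only illustrate the selection step.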

To install the FastZigZag library, refer to its paper and GitHub repository.

To install the rest of the environment, run: conda env create -f environment.yml
