This is the reproduction package for the paper "Automated Attention Pattern Discovery at Scale in Large Language Models".
The repository is structured into three main directories:
- Clustering
- Dataset
- Model
The Clustering directory comprises all the code for clustering attention heads and visualizing these clusters. This code corresponds to Section 5 of the paper.
The Dataset directory comprises all the code for creating the dataset. This includes scraping GitHub repositories, extracting code files, removing autogenerated files, and removing exact- and near-duplicates between our custom dataset and Java-Stack v2. This code corresponds to Section 3 of the paper.
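The deduplication step described above can be illustrated with a minimal sketch. This is not the code in the Dataset directory: the function names, the token-shingle Jaccard similarity, the shingle size `k`, and the `threshold` value are all assumptions chosen for illustration, and the pairwise comparison here would be too slow for a real corpus, where a MinHash/LSH index is the usual choice.

```python
import hashlib
import re


def exact_key(source: str) -> str:
    """Content hash used to detect exact duplicates."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def shingles(source: str, k: int = 5) -> set:
    """Set of k-token shingles (words and single punctuation marks)."""
    tokens = re.findall(r"\w+|[^\w\s]", source)
    if len(tokens) < k:
        return {tuple(tokens)}
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two shingle sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def remove_overlap(candidates, reference, threshold=0.7, k=5):
    """Keep candidate files that are neither exact nor near duplicates
    of any file in the reference corpus (hypothetical helper)."""
    ref_hashes = {exact_key(r) for r in reference}
    ref_shingles = [shingles(r, k) for r in reference]
    kept = []
    for src in candidates:
        # Exact duplicate: identical content hash.
        if exact_key(src) in ref_hashes:
            continue
        # Near duplicate: shingle overlap at or above the threshold.
        cand = shingles(src, k)
        if any(jaccard(cand, ref) >= threshold for ref in ref_shingles):
            continue
        kept.append(src)
    return kept
```

In this sketch, `reference` would play the role of the Java-Stack v2 files and `candidates` the newly scraped files; lowering `threshold` removes more aggressively.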
The Model directory comprises all the code for building the AP-MAE model. This includes the model architecture and the training setup. This code corresponds to Section 4 of the paper.
For further instructions on running the code, please refer to the README files in each directory.
We also provide additional visualizations, similar to Figure 10 of the paper, in Visualization. For each SC2 size, we show one plot per task, split into correct and incorrect.
We release the StackLessV2 Java dataset here.
We release the AP-MAE model collection here.