77 commits
0c7bbc4
finished reworking initial components
bryce13950 Nov 15, 2023
444fb32
reworked components a bit
bryce13950 Nov 17, 2023
d16f3bf
Merge branch 'main' into refactor-components
bryce13950 Nov 21, 2023
68c0c15
fixed cached data setting
bryce13950 Nov 21, 2023
11105de
ran format
bryce13950 Nov 21, 2023
7ba755f
reformatted components section
bryce13950 Nov 21, 2023
91a5712
reverted attention changes
bryce13950 Nov 22, 2023
cecf93e
reverted layer norm changes
bryce13950 Nov 22, 2023
9d8e91a
reverted mlp changes
bryce13950 Nov 22, 2023
b9a9ba5
Revert "reverted attention changes"
bryce13950 Nov 22, 2023
da0acf3
Revert "reverted layer norm changes"
bryce13950 Nov 22, 2023
10d609e
Revert "reverted mlp changes"
bryce13950 Nov 22, 2023
ef4518b
removed some model tests
bryce13950 Nov 22, 2023
6cfb08f
Revert "removed some model tests"
bryce13950 Nov 22, 2023
8832b7d
removed model tests
bryce13950 Nov 22, 2023
502fe65
added model back
bryce13950 Nov 22, 2023
beb014e
lowered accuracy
bryce13950 Nov 22, 2023
a7ca7ea
Revert "lowered accuracy"
bryce13950 Nov 22, 2023
fbf03a6
added model back to test loop
bryce13950 Nov 22, 2023
1c7a8bd
reverted accuracy change
bryce13950 Nov 23, 2023
42c195a
added proper headers
bryce13950 Nov 23, 2023
a6d3603
Merge branch 'main' into refactor-components
bryce13950 Dec 8, 2023
ce82675
Clean up project config (#463)
alan-cooney Dec 10, 2023
af99428
added import again
bryce13950 Dec 11, 2023
11f2088
updated attention component
bryce13950 Dec 11, 2023
98486bd
added new line
bryce13950 Dec 11, 2023
7241042
Closes #478: Adding the Qwen family of models (#477)
Aaquib111 Jan 16, 2024
5754a0b
Add a function to convert nanogpt weights (#475)
adamkarvonen Jan 16, 2024
535fadf
Add support for CodeLlama-7b (#469)
YuhengHuang42 Jan 17, 2024
33222e5
Make LLaMA 2 loadable directly from HF (#458)
andyrdt Jan 17, 2024
6867800
Fixe #371: Resolve issues where LLama will not load on CUDA (#461)
artkpv Jan 17, 2024
a5147ba
Add support for larger Bloom models (up to 7b) (#447)
SeuperHakkerJa Jan 17, 2024
11edb28
Add mistral 7b support (#443)
Felhof Jan 22, 2024
19b3bc8
Implement RMS Layer Norm folding (#489)
collingray Jan 23, 2024
ba3fb3b
Cap Mistral's context length at 2k (#495)
collingray Jan 28, 2024
8a17a76
Add Microsoft Phi models support (#484)
cmathw Jan 28, 2024
829084a
Fix a redundant MLP bias assignment (#485)
adamkarvonen Jan 28, 2024
109fd99
add qwen1.5 models (#507)
andyrdt Mar 7, 2024
6673d88
Support Gemma Models (#511)
cmathw Mar 14, 2024
93c2246
make tests pass mps (#528)
jbloomAus Mar 26, 2024
f6892d4
Add support for Llama-2-70b-chat-hf (#525)
sheikheddy Mar 28, 2024
edf40df
Update loading_from_pretrained.py (#529)
jbloomAus Mar 28, 2024
14b8e2e
Bugfix: pytest import (#532)
tkukurin Apr 1, 2024
5bf9acb
Remove non-existing parameter from decompose_resid documentation (#504)
VasilGeorgiev39 Apr 2, 2024
f773b29
Add `@overload` to `FactoredMatrix.__{,r}matmul__` (#512)
JasonGross Apr 2, 2024
42c1602
Explain abstract attribute in more detail (#508)
Felhof Apr 2, 2024
3f5db9f
Add pos_slice to run_with_cache (#465)
VasilGeorgiev39 Apr 2, 2024
72d5ae3
Add Support for Yi-6B and Yi-34B (#494)
collingray Apr 3, 2024
de9a70b
updated docs to account for additional test suites (#533)
bryce13950 Apr 3, 2024
bae7977
bugfix subscripted generics (#534)
tkukurin Apr 3, 2024
f052f39
Fix platform markers (#510)
pavanyellow Apr 5, 2024
4ca06e7
Add Xavier and Kaiming Initializations (#537)
Chanlaw Apr 8, 2024
afe7900
chore: fixing type errors and enabling mypy (#516)
chanind Apr 8, 2024
760135a
Add Mixtral (#521)
collingray Apr 8, 2024
e198006
Standardize black line length to 100, in line with other project sett…
Chanlaw Apr 11, 2024
b156ce1
Refactor hook_points (#505)
VasilGeorgiev39 Apr 13, 2024
1553e81
Fix split_qkv_input for grouped query attention (#520)
wesg52 Apr 13, 2024
3aaac2e
locked attribution patching to 1.1.1 (#541)
bryce13950 Apr 15, 2024
d2415b4
Demo no position fix (#544)
bryce13950 Apr 16, 2024
f22a406
Othello colab fix (#545)
bryce13950 Apr 16, 2024
65df48b
fixed demo for current colab (#546)
bryce13950 Apr 16, 2024
0e86253
Hf token auth (#550)
bryce13950 Apr 24, 2024
d8270c8
Fixed device being set to cpu:0 instead of cpu (#551)
Butanium Apr 24, 2024
2092dc9
Add support for Llama 3 (and Llama-2-70b-hf) (#549)
joelburget Apr 24, 2024
fe89b04
Loading of huggingface 4-bit quantized Llama (#486)
coolvision Apr 24, 2024
6cd64d5
removed deuplicate rearrange block (#555)
bryce13950 Apr 25, 2024
1139caf
Bert demo ci (#556)
bryce13950 Apr 26, 2024
c0f6729
Merge branch 'main' into refactor-components
bryce13950 Apr 27, 2024
07e8f38
removed import
bryce13950 Apr 27, 2024
517fab5
cleaned imports
bryce13950 Apr 27, 2024
4053079
ran black
bryce13950 Apr 27, 2024
2d1ee77
updated doc string
bryce13950 Apr 27, 2024
38b4dda
finished fixing docstring
bryce13950 Apr 27, 2024
e2e0578
fixed mypi error
bryce13950 Apr 27, 2024
ca6b8db
HookedSAETransformer (#536)
ckkissane Apr 30, 2024
6293e86
reworked CI to publish code coverage report (#559)
bryce13950 Apr 30, 2024
3d0a87a
Merge branch 'dev' into refactor-components
bryce13950 May 2, 2024
130 changes: 100 additions & 30 deletions .github/workflows/checks.yml
@@ -5,24 +5,24 @@ on:
branches:
- main
paths:
- '**' # Include all files by default
- '!.devcontainer/**'
- '!.vscode/**'
- '!.git*'
- '!*.md'
- '!.github/**'
- '.github/workflows/checks.yml' # Still include current workflow
- "**" # Include all files by default
- "!.devcontainer/**"
- "!.vscode/**"
- "!.git*"
- "!*.md"
- "!.github/**"
- ".github/workflows/checks.yml" # Still include current workflow
pull_request:
branches:
- main
paths:
- '**'
- '!.devcontainer/**'
- '!.vscode/**'
- '!.git*'
- '!*.md'
- '!.github/**'
- '.github/workflows/checks.yml'
- "**"
- "!.devcontainer/**"
- "!.vscode/**"
- "!.git*"
- "!*.md"
- "!.github/**"
- ".github/workflows/checks.yml"
# Allow this workflow to be called from other workflows
workflow_call:
inputs:
@@ -36,16 +36,15 @@ permissions:
contents: write

jobs:
checks:
name: Checks
compatibility-checks:
name: Compatibility Checks
runs-on: ubuntu-latest
strategy:
matrix:
python-version:
- "3.8"
- "3.9"
- "3.10"
- "3.11"
steps:
- uses: actions/checkout@v3
- name: Install Poetry
@@ -67,20 +66,20 @@ jobs:
run: |
poetry lock --check
poetry install --with dev
- name: Check format
run: make check-format
- name: Unit test
- name: Unit Test
run: make unit-test
- name: Docstring test
run: make docstring-test
# - name: Type check
# run: poetry run mypy transformer_lens
- name: Acceptance Test
run: make acceptance-test
- name: Build check
run: poetry build

# Acceptance tests are run in parallel with unit checks.
acceptance-tests:
name: Acceptance Tests
- name: Upload Coverage Report Artifact
uses: actions/upload-artifact@v3
with:
name: documentation
path: htmlcov

code-checks:
name: Code Checks
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
@@ -103,8 +102,21 @@ jobs:
run: |
poetry lock --check
poetry install --with dev
- name: Acceptance test
run: make acceptance-test
- name: Check format
run: make check-format
- name: Docstring test
run: make docstring-test
- name: Type check
run: poetry run mypy .
- name: Test Suite with Coverage Report
run: make coverage-report-test
- name: Build check
run: poetry build
- name: Upload Coverage Report Artifact
uses: actions/upload-artifact@v3
with:
name: test-coverage
path: htmlcov

notebook-checks:
name: Notebook Checks
@@ -135,3 +147,61 @@ jobs:
- name: Check Notebook Output Consistency
# Note: currently only checks notebooks we have specifically setup for this
run: make notebook-test


build-docs:
# When running on a PR, this just checks we can build the docs without errors
# When running on merge to main, it builds the docs and then another job deploys them
name: ${{ github.event_name == 'pull_request' && 'Check Build Docs' || 'Build Docs' }}
runs-on: ubuntu-latest
needs: code-checks
steps:
- uses: actions/checkout@v4
- name: Install Poetry
uses: snok/install-poetry@v1
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.11"
cache: "poetry"
- name: Install pandoc
uses: awalsh128/cache-apt-pkgs-action@latest
with:
packages: pandoc
version: 1.0
- name: Install dependencies
run: poetry install --with docs
- name: Download Test Coverage Artifact
uses: actions/download-artifact@v3
with:
name: test-coverage
path: docs/source/coverage
- name: Build Docs
run: HF_TOKEN="$HF_TOKEN" poetry run build-docs
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }} # assumes a repository secret named HF_TOKEN; never commit a token literal
- name: Upload Docs Artifact
uses: actions/upload-artifact@v3
with:
name: documentation
path: docs/build

deploy-docs:
name: Deploy Docs
runs-on: ubuntu-latest
# Only run if merging a PR into main
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
needs: build-docs
steps:
- uses: actions/checkout@v4
- name: Download Docs Artifact
uses: actions/download-artifact@v3
with:
name: documentation
path: docs/build
- name: Upload to GitHub Pages
uses: JamesIves/github-pages-deploy-action@v4
with:
folder: docs/build
clean-exclude: |
*.*.*/
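
Note on the `paths` filters in this workflow: the order of the patterns matters. A later matching pattern overrides an earlier one, which is how `.github/workflows/checks.yml` is re-included after `!.github/**` excludes it. The Python sketch below approximates that evaluation under simplifying assumptions: `**` matches across `/`, `*` does not, and no other glob syntax is handled. The helper names (`glob_to_regex`, `path_is_included`, `workflow_triggers`) are hypothetical and are not part of GitHub Actions or of this repository.

```python
import re

# Patterns copied from the `paths` filter in checks.yml.
PATTERNS = [
    "**",
    "!.devcontainer/**",
    "!.vscode/**",
    "!.git*",
    "!*.md",
    "!.github/**",
    ".github/workflows/checks.yml",
]


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a simplified glob into a regex (assumption: only * and ** are special)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")      # ** may cross directory separators
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")   # * stays within one path segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def path_is_included(path: str, patterns: list[str]) -> bool:
    """Apply patterns in order; the last pattern that matches decides the outcome."""
    included = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        body = pattern[1:] if negated else pattern
        if glob_to_regex(body).fullmatch(path):
            included = not negated
    return included


def workflow_triggers(changed_paths: list[str], patterns: list[str]) -> bool:
    """Assume the workflow runs when at least one changed path is included."""
    return any(path_is_included(p, patterns) for p in changed_paths)


if __name__ == "__main__":
    for candidate in [
        "transformer_lens/components.py",
        "README.md",
        ".github/workflows/checks.yml",
        ".github/workflows/release.yml",
    ]:
        print(candidate, path_is_included(candidate, PATTERNS))
```

Under these assumptions, a change touching only `README.md` and `.gitignore` would not start the workflow, while a change to the workflow file itself would.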
61 changes: 0 additions & 61 deletions .github/workflows/gh-pages.yml

This file was deleted.

1 change: 1 addition & 0 deletions .gitignore
@@ -18,3 +18,4 @@ docs/build
.Ds_Store
.pylintrc
docs/source/generated
**.orig
40 changes: 38 additions & 2 deletions .vscode/cspell.json
@@ -1,41 +1,53 @@
{
"language": "en,en-GB",
"words": [
"accum",
"adrià",
"aengus",
"allclose",
"alonso",
"arange",
"argmax",
"argmaxy",
"autodiff",
"autoregressive",
"barez",
"Beartype",
"beartype",
"belrose",
"bertsimas",
"biderman",
"bilal",
"bincount",
"caxis",
"checkpointed",
"chughtai",
"circuitsvis",
"Codespaces",
"Codeparrot",
"codespaces",
"colab",
"collectstart",
"colour",
"conmy",
"cooney",
"crfm",
"cumsum",
"datapoint",
"dictmodel",
"dimitris",
"disconfirm",
"dmitrii",
"docstrings",
"doctest",
"doctree",
"dtype",
"dtypes",
"einops",
"elhage",
"endoftext",
"eqnarray",
"esben",
"evals",
"explictly",
"fazl",
"firstpage",
"fspath",
@@ -49,53 +61,77 @@
"howpublished",
"huggingface",
"icml",
"idxs",
"imshow",
"interp",
"interpretability",
"ioannis",
"ipynb",
"isin",
"isort",
"janiak",
"Janky",
"jaxtyping",
"jett",
"kaiming",
"keepdim",
"kissane",
"konstas",
"kran",
"lastpage",
"layernorm",
"ldim",
"lieberum",
"logits",
"logsumexp",
"mavor",
"maxdepth",
"mingpt",
"nanda",
"ndarray",
"ndim",
"neel",
"neox",
"nitpicky",
"occurences",
"olah",
"openwebtext",
"overcomplete",
"Overriden",
"pagename",
"pauly",
"pretrained",
"probs",
"producting",
"pycln",
"pypi",
"pytest",
"randn",
"rdim",
"relu",
"resid",
"rprint",
"rtml",
"rtol",
"shortformer",
"softmax",
"softmaxing",
"solu",
"stas",
"templatedir",
"templatename",
"toctree",
"topk",
"tqdm",
"transformerlens",
"tril",
"triu",
"troitskii",
"unembed",
"unembedded",
"unembedding",
"unigram",
"unsqueeze",
"virtualenvs",
"visualisation",
"xaxis",
5 changes: 3 additions & 2 deletions .vscode/extensions.json
@@ -17,8 +17,9 @@
"ms-toolsai.jupyter",
"richie5um2.vscode-sort-json",
"stkb.rewrap",
"streetsidesoftware.code-spell-checker-british-english",
"streetsidesoftware.code-spell-checker",
"yzhang.markdown-all-in-one",
"streetsidesoftware.code-spell-checker-british-english"
"tamasfe.even-better-toml",
"yzhang.markdown-all-in-one"
]
}