
chenxingqiang/alphafold-notebooks


An "AlphaFold2 Codec" reference covering everything in AlphaFold2.



Learning Resources

Papers

PPT

  • My public talk on AlphaFold2 paper reading by Xingqiang Chen (.key/.pptx in the AF2-PPT folder).
  • Sergey Ovchinnikov's talk on AF2 (slides, .pptx in the AF2-PPT folder).

Learning by Code

📓 AlphaFold2 Algorithm Notebooks (32 Complete!)

We provide 32 Jupyter Notebooks covering every algorithm from the AlphaFold2 supplementary materials. Each notebook includes:

  • Algorithm pseudocode/image reference
  • Source code location mapping
  • NumPy implementation
  • Executable test cases with verification

👉 Full Algorithm Index
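For a taste of the notebook style, here is a minimal NumPy sketch of Algorithm 4 (relpos): relative residue offsets are clipped to ±32 and one-hot encoded. The final linear projection to the pair representation, which the notebook does include, is omitted here for brevity.

```python
import numpy as np

def relpos(residue_index, v_bins=32):
    """Minimal sketch of AF2 Algorithm 4 (relpos): clipped relative
    positions, one-hot encoded into 2*v_bins + 1 classes. The linear
    projection to the pair channel dimension is omitted."""
    d = residue_index[:, None] - residue_index[None, :]  # (N, N) pairwise offsets
    d = np.clip(d + v_bins, 0, 2 * v_bins)               # shift and clip to [0, 2*v_bins]
    return np.eye(2 * v_bins + 1)[d]                     # (N, N, 2*v_bins + 1) one-hot

p = relpos(np.arange(5))  # five consecutive residues
print(p.shape)            # (5, 5, 65)
```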

Quick Links by Category

| Category | Algorithms | Notebooks |
|---|---|---|
| Data Preprocessing | MSA Block Deletion | Algorithm 1 |
| Embedding | Input Embedder, relpos, one_hot | Alg 3, Alg 4, Alg 5 |
| Evoformer | Stack, MSA Attention, Triangle Ops | Alg 6-15 |
| Templates | Pair Stack, Pointwise Attention | Alg 16, Alg 17 |
| Extra MSA | Stack, Global Attention | Alg 18, Alg 19 |
| Structure Module | IPA, Backbone, Atom Coords | Alg 20-25 |
| Losses | FAPE, Torsion, pLDDT | Alg 26-29 |
| Recycling | Inference, Training, Embedder | Alg 30, Alg 31, Alg 32 |
| Main Pipeline | Full Inference | Algorithm 2 |
📋 Complete Algorithm List
| # | Algorithm | Notebook |
|---|---|---|
| 1 | MSA Block Deletion | algorithm-1-MSABlockDeletion.ipynb |
| 2 | Inference | algorithm-2-Inference.ipynb |
| 3 | Input Embedder | algorithm-3-InputEmbedder.ipynb |
| 4 | relpos | algorithm-4-relpos.ipynb |
| 5 | one_hot | algorithm-5-one_hot.ipynb |
| 6 | Evoformer Stack | algorithm-6-EvoformerStack.ipynb |
| 7 | MSA Row Attention with Pair Bias | algorithm-7-MSARowAttentionWithPairBias.ipynb |
| 8 | MSA Column Attention | algorithm-8-MSAColumnAttention.ipynb |
| 9 | MSA Transition | algorithm-9-MSATransition.ipynb |
| 10 | Outer Product Mean | algorithm-10-OuterProductMean.ipynb |
| 11 | Triangle Multiplication (Outgoing) | algorithm-11-TriangleMultiplicationOutgoing.ipynb |
| 12 | Triangle Multiplication (Incoming) | algorithm-12-TriangleMultiplicationIncoming.ipynb |
| 13 | Triangle Attention (Starting Node) | algorithm-13-TriangleAttentionStartingNode.ipynb |
| 14 | Triangle Attention (Ending Node) | algorithm-14-TriangleAttentionEndingNode.ipynb |
| 15 | Pair Transition | algorithm-15-PairTransition.ipynb |
| 16 | Template Pair Stack | algorithm-16-TemplatePairStack.ipynb |
| 17 | Template Pointwise Attention | algorithm-17-TemplatePointwiseAttention.ipynb |
| 18 | Extra MSA Stack | algorithm-18-ExtraMsaStack.ipynb |
| 19 | MSA Column Global Attention | algorithm-19-MSAColumnGlobalAttention.ipynb |
| 20 | Structure Module | algorithm-20-StructureModule.ipynb |
| 21 | Rigid from 3 Points | algorithm-21-rigidFrom3Points.ipynb |
| 22 | Invariant Point Attention | algorithm-22-InvariantPointAttention.ipynb |
| 23 | Backbone Update | algorithm-23-BackboneUpdate.ipynb |
| 24 | Compute All Atom Coordinates | algorithm-24-computeAllAtomCoordinates.ipynb |
| 25 | makeRotX | algorithm-25-makeRotX.ipynb |
| 26 | Rename Symmetric Ground Truth Atoms | algorithm-26-renameSymmetricGroundTruthAtoms.ipynb |
| 27 | Torsion Angle Loss | algorithm-27-torsionAngleLoss.ipynb |
| 28 | Compute FAPE | algorithm-28-computeFAPE.ipynb |
| 29 | Predict Per-Residue LDDT | algorithm-29-predictPerResidueLDDT.ipynb |
| 30 | Recycling (Inference) | algorithm-30-RecyclingInference.ipynb |
| 31 | Recycling (Training) | algorithm-31-RecyclingTraining.ipynb |
| 32 | Recycling Embedder | algorithm-32-RecyclingEmbedder.ipynb |
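As a deeper example of what the notebooks implement, below is a minimal NumPy sketch of Algorithm 28 (computeFAPE). For brevity it takes the inverse rigid transforms as precomputed (F, 3, 4) arrays, whereas the notebook builds them from the backbone frames.

```python
import numpy as np

def fape(T_inv_pred, T_inv_true, x_pred, x_true, d_clamp=10.0, z=10.0, eps=1e-4):
    """Minimal sketch of Frame Aligned Point Error (AF2 Algorithm 28).
    T_inv_* are inverse rigid transforms [R | t] of shape (F, 3, 4),
    applied as x -> R @ x + t; x_* are atom positions of shape (A, 3)."""
    def apply(T, x):  # (F, 3, 4), (A, 3) -> (F, A, 3): atoms in every local frame
        R, t = T[:, :, :3], T[:, :, 3]
        return np.einsum('fij,aj->fai', R, x) + t[:, None, :]
    # distance between aligned predicted and true positions, per frame/atom pair
    d = np.sqrt(((apply(T_inv_pred, x_pred)
                  - apply(T_inv_true, x_true)) ** 2).sum(-1) + eps)
    return np.minimum(d, d_clamp).mean() / z  # clamp at 10 A, scale by Z = 10 A
```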

📓 AlphaFold3 Algorithm Notebooks (NEW!)

We now include AlphaFold3 algorithm notebooks! AF3 introduces significant architectural changes including diffusion-based structure prediction.

👉 AlphaFold3 Algorithm Index

Key AF3 Components

| Category | Key Algorithms | Notebooks |
|---|---|---|
| Input | MSA Features, Templates, Atom Features | Alg 1-4 |
| MSA Module | Outer Product, MSA Attention | Alg 5-7 |
| Pairformer | Triangle Ops, Single Attention | Alg 8-14 |
| Diffusion | Diffusion Module, AdaLN, Transformer | Alg 15, Alg 16 |
| Confidence | Distogram, Confidence, LDDT | Alg 20-23 |
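One of the smaller new pieces in the diffusion row above is Adaptive LayerNorm (AdaLN, AF3 supplement Algorithm 26), which conditions the diffusion transformer's activations on the single representation. A minimal NumPy sketch follows; the weight names (W_gate, b_gate, W_bias) are illustrative, not the repo's API.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Plain LayerNorm over the last axis, no learnable scale/offset."""
    return (x - x.mean(-1, keepdims=True)) / np.sqrt(x.var(-1, keepdims=True) + eps)

def ada_ln(a, s, W_gate, b_gate, W_bias):
    """Sketch of AF3-style AdaLN (Algorithm 26): the normalized conditioning
    s produces a sigmoid gate and a bias that modulate the normalized
    activations a. Weight names here are illustrative placeholders."""
    a, s = layer_norm(a), layer_norm(s)
    gate = 1.0 / (1.0 + np.exp(-(s @ W_gate + b_gate)))  # sigmoid(Linear(s))
    return gate * a + s @ W_bias                         # gate * a + Linear_nobias(s)

c = 64
a = np.random.randn(10, c)   # token activations
s = np.random.randn(10, c)   # conditioning signal
out = ada_ln(a, s, np.random.randn(c, c), np.zeros(c), np.random.randn(c, c))
print(out.shape)  # (10, 64)
```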

AF3 Source Code Submodules

# Official AlphaFold3
AF3-Ref-src/alphafold3-official/

# PyTorch Implementation (lucidrains)
AF3-Ref-src/alphafold3-pytorch/

# Architecture Walkthrough
AF3-Ref-src/alphafold3-walkthrough/

📓 Boltz Algorithm Notebooks (NEW!)

We now include Boltz algorithm notebooks! Boltz is a family of models for biomolecular interaction prediction:

  • Boltz-1: First fully open source model to approach AlphaFold3 accuracy
  • Boltz-2: Adds binding affinity prediction, approaching FEP accuracy 1000x faster

👉 Boltz Algorithm Index

Key Boltz Components

| Category | Key Algorithms | Notebooks |
|---|---|---|
| Input Processing | Input Embedder, Atom Encoder, RelPos | Alg 1-3 |
| MSA Module | MSA Module, Outer Product, Pair Averaging | Alg 4-6 |
| Pairformer | Pairformer, Triangle Ops, Attention | Alg 7-11 |
| Diffusion | Diffusion Module, Transformer, Fourier | Alg 12-15 |
| Confidence & Affinity | Confidence, Distogram, Affinity (Boltz-2) | Alg 16-18 |
| Loss Functions | Diffusion Loss, Confidence Loss | Alg 19-20 |
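As one concrete example from the diffusion row above, the Fourier embedding of the diffusion noise level (AF3 supplement Algorithm 22, reused by Boltz) is tiny: random weights and biases are drawn once at initialization and then frozen. A sketch:

```python
import numpy as np

def make_fourier_embedding(dim, seed=0):
    """Sketch of the AF3/Boltz-style Fourier embedding of the diffusion
    noise level: w, b ~ N(0, 1) are sampled once and frozen, and the
    embedding of a scalar t is cos(2*pi*(t*w + b))."""
    rng = np.random.default_rng(seed)
    w, b = rng.standard_normal(dim), rng.standard_normal(dim)
    return lambda t: np.cos(2 * np.pi * (t * w + b))

embed = make_fourier_embedding(256)
print(embed(0.5).shape)  # (256,)
```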

Boltz Source Code Submodule

# Official Boltz Repository
Boltz-Ref-src/boltz-official/

Papers:

📓 Boltz-2 Specific Notebooks (NEW!)

Boltz-2 introduces binding affinity prediction: it is the first deep-learning model to approach FEP accuracy while running roughly 1000x faster.

👉 Boltz-2 Algorithm Index

Boltz-2 New Features

| Category | Key Algorithms | Notebooks |
|---|---|---|
| Affinity Prediction | Affinity Module, Gaussian Smearing | Alg 1-2 |
| Contact Guidance | Contact Conditioning | Alg 3 |
| Enhanced v2 Modules | Input v2, Template v2, Diffusion v2 | Alg 5-7 |
| Improved Confidence | Confidence v2, B-Factor | Alg 8, 10 |
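Gaussian smearing (first row above) featurizes scalar inter-atomic distances as radial basis functions, SchNet-style, before they enter the affinity module. A minimal sketch follows; the grid bounds and bin count are illustrative defaults, not Boltz-2's exact hyperparameters.

```python
import numpy as np

def gaussian_smearing(d, d_min=0.0, d_max=20.0, n_bins=64):
    """Sketch of Gaussian smearing: expand each distance into Gaussian
    radial basis functions centered on an even grid. Bounds and bin
    count here are illustrative, not Boltz-2's exact values."""
    centers = np.linspace(d_min, d_max, n_bins)       # (n_bins,) RBF centers
    width = centers[1] - centers[0]                   # one bin spacing as sigma
    return np.exp(-((d[..., None] - centers) ** 2) / (2 * width ** 2))

feat = gaussian_smearing(np.array([1.2, 3.7, 8.9]))   # three pairwise distances
print(feat.shape)  # (3, 64)
```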

Boltz-2 Submodules

# Official Repository (contains both Boltz-1 and Boltz-2)
Boltz-Ref-src/boltz-official/

# Boltzina - Virtual Screening with Boltz-2
Boltz-Ref-src/boltzina/

Practice: Modeling Tests with AF2

MD + AlphaFold2


🔧 Fine-tuning Framework (NEW!)

We provide a comprehensive fine-tuning framework for adapting protein structure prediction models to downstream tasks.

👉 Full Fine-tuning Guide

Supported Models

| Model | Framework | Fine-tuning Support |
|---|---|---|
| AlphaFold2 | JAX/Haiku | ✅ Full, Head-only, LoRA |
| AlphaFold3 | JAX/Haiku | ✅ Full, Head-only, LoRA |
| Boltz-1 | PyTorch | ✅ Full, LoRA, Adapter |
| Boltz-2 | PyTorch | ✅ Full, LoRA, Adapter |

Fine-tuning Strategies

| Strategy | Trainable Params | Use Case |
|---|---|---|
| LoRA | ~0.1% | Small datasets, efficient fine-tuning |
| Adapter | ~1% | Modular, multiple tasks |
| Head-only | ~5% | New prediction tasks |
| Full | 100% | Large datasets, maximum performance |
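The LoRA parameter count above follows directly from its construction: a frozen weight W is augmented with a trainable low-rank product B @ A, adding roughly r * (d_in + d_out) parameters per layer. A minimal NumPy sketch of the forward pass (the repo's lora.py provides the actual PyTorch/JAX modules):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16.0):
    """Sketch of a LoRA linear layer: y = x @ W.T + (alpha / r) * x @ A.T @ B.T.
    W (d_out, d_in) stays frozen; only A (r, d_in) and B (d_out, r) train."""
    r = A.shape[0]
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

d_in, d_out, r = 384, 384, 8
x = np.random.randn(2, d_in)
W = np.random.randn(d_out, d_in)          # frozen pretrained weight
A = np.random.randn(r, d_in) * 0.01       # trainable down-projection
B = np.zeros((d_out, r))                  # trainable up-projection, zero-init
assert np.allclose(lora_forward(x, W, A, B), x @ W.T)  # identity at init (B = 0)
```

Zero-initializing B means the adapted model starts exactly at the pretrained weights, which keeps early fine-tuning stable.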

Supported Tasks (50+ Task Types)

We support comprehensive task coverage inspired by production platforms like ProteinBase.com:

💊 Drug Discovery

| Task | Outputs | Applications |
|---|---|---|
| Binding Affinity | pKd, pIC50, ΔG, Ki | Lead optimization, SAR |
| Virtual Screening | Hit probability, ranking | HTS prioritization |
| ADMET | Absorption, metabolism, toxicity | Compound triage |

🔬 Protein Engineering

| Task | Outputs | Applications |
|---|---|---|
| Stability | ΔΔG, Tm shift | Thermostabilization |
| Solubility | Expression score | Biomanufacturing |
| Mutation Effects | Fitness, pathogenicity | Variant analysis |

🧫 Antibody Design

| Task | Outputs | Applications |
|---|---|---|
| Affinity Maturation | CDR binding, ΔΔG | Therapeutic optimization |
| Humanization | Humanness score | Drug development |
| Developability | Aggregation, viscosity | Manufacturing |

⚗️ Enzyme Engineering

| Task | Outputs | Applications |
|---|---|---|
| Activity | kcat, Km, kcat/Km | Catalyst design |
| Specificity | Substrate profiles | Industrial enzymes |
| Directed Evolution | Fitness landscapes | Protein engineering |

🔗 Protein-Protein Interactions

| Task | Outputs | Applications |
|---|---|---|
| PPI Binding | Kd, interface stability | Complex analysis |
| Interface Prediction | Contact residues | Structure analysis |
| Hot Spot Detection | ΔΔG per residue | PPI drug targets |

🧬 Function Prediction

| Task | Outputs | Applications |
|---|---|---|
| GO Terms | MF, BP, CC | Annotation |
| EC Numbers | Enzyme classification | Function discovery |
| Localization | Subcellular compartment | Systems biology |

🛡️ Immunology

| Task | Outputs | Applications |
|---|---|---|
| B-cell Epitopes | Epitope probability | Vaccine design |
| T-cell Epitopes | MHC binding | Immunotherapy |
| Immunogenicity | ADA risk | Drug safety |

📊 Structure Quality

| Task | Outputs | Applications |
|---|---|---|
| Confidence | pLDDT, pAE, pTM | Model validation |
| Disorder | IDR prediction | Structure analysis |
| Contacts | Distance maps | Validation |

Quick Start

from finetuning import TaskRegistry, create_finetuning_pipeline
from finetuning.modules import LoRAModule
from finetuning.heads import AffinityHead, AffinityHeadConfig

# Option 1: Use the Task Registry (recommended)
# List all available tasks
print(TaskRegistry.list_all_tasks())  # 50+ tasks

# Get task info and recommendations
info = TaskRegistry.get_task_info("binding_affinity")
print(f"Recommended LoRA rank: {info.recommended_rank}")

# Create a pipeline automatically (assumes a pretrained base model has
# already been loaded, as in step 1 of Option 2 below)
pipeline = create_finetuning_pipeline(
    task="binding_affinity",
    base_model=model,
    strategy="lora",
)

# Option 2: Manual setup
from finetuning import FineTuningConfig, Trainer

# 1. Load a pretrained model
model = load_pretrained_boltz2()

# 2. Apply LoRA (only ~0.1% of parameters trainable)
lora_model = LoRAModule(model, rank=8, alpha=16.0)

# 3. Add a task-specific head
affinity_head = AffinityHead(AffinityHeadConfig())

# 4. Train
config = FineTuningConfig(
    strategy="lora",
    task="binding_affinity",
    lora_rank=8,
)
trainer = Trainer(lora_model, config, train_loader, val_loader)
trainer.train()

# 5. Save the lightweight LoRA weights
lora_model.save_lora_weights("./lora_weights.pt")

Module Overview

finetuning/
├── configs/           # Configuration classes
│   ├── base_config.py      # FineTuningConfig, ModelConfig, TrainingConfig
│   ├── lora_config.py      # LoRA-specific configuration
│   └── task_config.py      # 25+ task configurations (ProteinBase-style)
├── modules/           # Fine-tuning modules
│   ├── lora.py             # LoRA implementation (PyTorch & JAX)
│   ├── adapter.py          # Adapter modules
│   └── prompt_tuning.py    # Prompt tuning
├── heads/             # Task-specific prediction heads (15+ specialized heads)
│   ├── affinity_head.py    # Binding affinity (Boltz-2 style)
│   ├── property_head.py    # Protein property prediction
│   ├── contact_head.py     # Contact prediction
│   ├── antibody_head.py    # Affinity maturation, humanization, developability
│   ├── ppi_head.py         # PPI binding, interface, hot spots
│   ├── enzyme_head.py      # Activity, specificity, evolution
│   ├── function_head.py    # GO terms, EC numbers, localization
│   └── epitope_head.py     # B-cell, T-cell epitopes, immunogenicity
├── trainers/          # Training utilities
│   ├── trainer.py          # Main trainer class
│   ├── distributed_trainer.py  # Multi-GPU training
│   └── callbacks.py        # Training callbacks (EarlyStopping, Wandb, etc.)
├── data/              # Data utilities
│   ├── datasets.py         # 10+ dataset classes for all task types
│   └── transforms.py       # Data augmentation (rotation, MSA dropout)
├── examples/          # Tutorial notebooks
│   └── finetuning_tutorial.ipynb  # Complete walkthrough
├── registry.py        # Task registry and factory pattern
└── utils/             # Utility functions
    ├── checkpoint.py       # Model checkpointing
    └── metrics.py          # Evaluation metrics (lDDT, TM-score, AUROC, etc.)
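For orientation, metrics.py computes structure-quality scores such as lDDT, whose core fits in a few lines. Below is a sketch on precomputed distance matrices; the function name and signature are illustrative, not the module's exact API.

```python
import numpy as np

def lddt(d_pred, d_true, cutoff=15.0, thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Sketch of the lDDT score from precomputed (N, N) distance matrices:
    over reference pairs closer than `cutoff`, the fraction of predicted
    distances preserved within each tolerance, averaged over tolerances."""
    n = len(d_true)
    mask = (d_true < cutoff) & ~np.eye(n, dtype=bool)  # exclude self-pairs
    err = np.abs(d_pred - d_true)[mask]
    return float(np.mean([(err < t).mean() for t in thresholds]))
```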

Blogs

References

Reference papers

📦 AlphaFold2 Reference Source Code (Submodules)

# Official AlphaFold (DeepMind)
AF2-Ref-src/alphafold-official/

# OpenFold (PyTorch implementation)
AF2-Ref-src/openfold/

# ColabFold (Colab-friendly version)
AF2-Ref-src/colabfold/

# MMseqs2 (Sequence search)
AF2-Ref-src/mmseqs2/

# HH-suite (Template search)
AF2-Ref-src/hh-suite/

# trRosetta2 (Predecessor model)
AF2-Ref-src/trRosetta2/

# ESM (Facebook protein language model)
AF2-Ref-src/esm/

# UniRep (Protein representations)
AF2-Ref-src/unirep/

# SeqVec (Sequence embeddings)
AF2-Ref-src/seqvec/

To initialize submodules after cloning:

git submodule update --init --recursive

Data availability

All input data are freely available from public sources.

Structures from the PDB were used for training and as templates (https://www.wwpdb.org/ftp/pdb-ftp-sites; for the associated sequence data and 40% sequence clustering see also https://ftp.wwpdb.org/pub/pdb/derived_data/ and https://cdn.rcsb.org/resources/sequence/clusters/bc-40.out).

Training used a version of the PDB downloaded 28/08/2019, while CASP14 template search used a version downloaded 14/05/2020. Template search also used the PDB70 database, downloaded 13/05/2020 (https://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/).

We show experimental structures from the PDB with accessions 6Y4F, 6YJ1, 6VR4, 6SK0, 6FES, 6W6W, 6T1Z and 7JTL.

For MSA lookup at both training and prediction time, we used UniRef90 v2020_01 (https://ftp.ebi.ac.uk/pub/databases/uniprot/previous_releases/release-2020_01/uniref/), BFD (https://bfd.mmseqs.com), Uniclust30 v2018_08 (https://wwwuser.gwdg.de/~compbiol/uniclust/2018_08/), and MGnify clusters v2018_12 (https://ftp.ebi.ac.uk/pub/databases/metagenomics/peptide_database/2018_12/). Uniclust30 v2018_08 was further used as input for constructing a distillation structure dataset.

Code and software availability

Source code

Source code for the AlphaFold model, trained weights and an inference script are available under an open-source license at https://github.com/deepmind/alphafold.

Neural networks

Neural networks were developed with

MSA search

For MSA search on UniRef90, MGnify clusters and reduced BFD we used jackhmmer, and for template search on the PDB SEQRES we used hmmsearch, both from HMMER v3.3 (http://eddylab.org/software/hmmer/).

For template search against PDB70, we used HHsearch from HH-suite v3.0-beta.3 14/07/2017 (https://github.com/soedinglab/hh-suite). For constrained relaxation of structures, we used OpenMM v7.3.1 (https://github.com/openmm/openmm) with the Amber99sb force field.

Docking analysis

Docking analysis on DGAT used

Data analysis

Data analysis used

Structure analysis

Structure analysis used PyMOL v2.3.0 (https://github.com/schrodinger/pymol-open-source).
