# AlphaFold3 Workflow

This workflow supports separate execution of the **CPU** and **GPU** steps. It also distributes inference runs across multiple GPU devices using **GNU parallel**.
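Conceptually, the distribution across devices can be pictured with the small sketch below. It is illustrative only, not the repository's actual `workflow/scripts/parallel.sh`: it round-robins a list of hypothetical job names over the visible GPUs by pinning `CUDA_VISIBLE_DEVICES` per job, which is the effect GNU parallel's job slots produce in the real workflow.

```shell
#!/usr/bin/env bash
# Illustrative sketch only -- NOT the repository's parallel.sh.
# Round-robin inference jobs over GPUs by pinning CUDA_VISIBLE_DEVICES,
# mimicking what GNU parallel's job slots do in the actual workflow.
NUM_GPUS=2                               # assumption: two visible GPU devices
i=0
for job in job_a job_b job_c job_d; do   # hypothetical job names
  gpu=$(( i % NUM_GPUS ))
  echo "CUDA_VISIBLE_DEVICES=$gpu would run inference for $job"
  i=$(( i + 1 ))
done
```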

The following steps assume that you are working from the project directory.

## Steps to setup & execute

### 1. Build the Singularity Container

Run the following command to build the Singularity container that supports parallel inference runs:

```bash
singularity build alphafold3_parallel.sif docker://ntnn19/alphafold3:latest_parallel_a100_40gb
```

**Notes**

- Set `<number_of_inference_job_lists>` to `1` for local runs.
- For SLURM runs, set `<number_of_inference_job_lists>` to `n`, where `n` is the number of nodes with GPUs.
- Make sure to download the required [AlphaFold3 databases](https://github.com/google-deepmind/alphafold3/blob/main/docs/installation.md#obtaining-genetic-databases) and [weights](https://forms.gle/svvpY4u2jsHEwWYS6) before proceeding.

### 2. Clone This Repository

Clone this repository into your project directory. After cloning, your project structure should look like this:

```
.                                <-- your current location
├── dataset_1
│   ├── af_input
│   └── data_pipeline
│       └── <your_input_json_file>
├── example
│   └── example.json
├── README.md
└── workflow
    ├── scripts
    │   ├── create_job_list.py
    │   ├── parallel.sh
    │   └── split_json_and_create_job_list.py
    └── Snakefile
```

An example JSON file is available at `example/example.json`.
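For orientation, an AlphaFold3 input JSON follows the format described in the AlphaFold3 input documentation. A minimal single-protein file of that shape is sketched below; the job name and sequence are placeholders, and `example/example.json` in this repository remains the authoritative example.

```json
{
  "name": "my_job",
  "modelSeeds": [1],
  "sequences": [
    {
      "protein": {
        "id": "A",
        "sequence": "MVLSPADKTNVKAAW"
      }
    }
  ],
  "dialect": "alphafold3",
  "version": 1
}
```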

### 3. Create and Activate the Snakemake Environment

Install mamba or micromamba if it is not already installed. Then create and activate the environment:

```bash
mamba env create -p $(pwd)/env -f environment.yml
mamba activate $(pwd)/env
```
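The repository's `environment.yml` is the source of truth; a minimal file of roughly this shape would provide Snakemake. Note that Snakemake 8+ requires the `snakemake-executor-plugin-slurm` package for `--executor slurm` to work.

```yaml
# Illustrative sketch -- use the environment.yml shipped with this repository.
channels:
  - conda-forge
  - bioconda
dependencies:
  - python=3.11
  - snakemake
  - snakemake-executor-plugin-slurm   # needed for --executor slurm (Snakemake >= 8)
```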

### 4. Run the Workflow

**Dry run (local)**

```bash
snakemake --use-singularity \
  --config af3_container=<path_to_your_alphafold3_container> \
  --singularity-args '--nv -B <alphafold3_weights_dir>:/root/models -B $(pwd)/<dataset_directory>/af_input:/root/af_input -B $(pwd)/<dataset_directory>/af_output:/root/af_output -B <path_to_alphafold3_db_directory>:/root/public_databases' \
  -c all \
  --set-scatter split=<number_of_inference_job_lists> \
  -n
```

**Dry run (SLURM)**

```bash
snakemake --use-singularity \
  --config af3_container=<path_to_your_alphafold3_container> \
  --singularity-args '--nv -B <alphafold3_weights_dir>:/root/models -B $(pwd)/<dataset_directory>/af_input:/root/af_input -B $(pwd)/<dataset_directory>/af_output:/root/af_output -B <path_to_alphafold3_db_directory>:/root/public_databases' \
  -j 99 \
  --executor slurm \
  --set-scatter split=<number_of_inference_job_lists> \
  -n
```

**Local run**

```bash
snakemake --use-singularity \
  --config af3_container=<path_to_your_alphafold3_container> \
  --singularity-args '--nv -B <alphafold3_weights_dir>:/root/models -B $(pwd)/<dataset_directory>/af_input:/root/af_input -B $(pwd)/<dataset_directory>/af_output:/root/af_output -B <path_to_alphafold3_db_directory>:/root/public_databases' \
  -c all \
  --set-scatter split=<number_of_inference_job_lists>
```

**SLURM run**

```bash
snakemake --use-singularity \
  --config af3_container=<path_to_your_alphafold3_container> \
  --singularity-args '--nv -B <alphafold3_weights_dir>:/root/models -B $(pwd)/<dataset_directory>/af_input:/root/af_input -B $(pwd)/<dataset_directory>/af_output:/root/af_output -B <path_to_alphafold3_db_directory>:/root/public_databases' \
  -j 99 \
  --executor slurm \
  --set-scatter split=<number_of_inference_job_lists>
```
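To sanity-check the placeholder substitution before launching anything, the sketch below assembles a local-run command from hypothetical values (every path is a placeholder, not a path from this repository) and echoes it rather than executing it, so the final command can be inspected first.

```shell
#!/usr/bin/env bash
# Hypothetical values -- every path below is a placeholder to adjust.
AF3_SIF="$PWD/alphafold3_parallel.sif"
WEIGHTS="/data/af3_weights"        # hypothetical weights directory
DB="/data/af3_databases"           # hypothetical databases directory
DATASET="dataset_1"

# Echo the assembled command so it can be reviewed before running.
echo "snakemake --use-singularity \
  --config af3_container=$AF3_SIF \
  --singularity-args '--nv -B $WEIGHTS:/root/models -B $PWD/$DATASET/af_input:/root/af_input -B $PWD/$DATASET/af_output:/root/af_output -B $DB:/root/public_databases' \
  -c all \
  --set-scatter split=1"
```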
