Commit 7f3bab0

Add files via upload
1 parent efab946 commit 7f3bab0

1 file changed

+47
-11
lines changed

notebooks/AWS-ParallelCluster.ipynb

Lines changed: 47 additions & 11 deletions
@@ -7,8 +7,6 @@
77
"source": [
88
"# Snakemake on AWS ParallelCluster \n",
99
"\n",
10-
"**Difficulty Level: Intermediate**\n",
11-
"\n",
1210
"## Introduction\n",
1311
"\n",
1412
"### AWS ParallelCluster \n",
@@ -290,6 +288,39 @@
290288
"```"
291289
]
292290
},
291+
{
292+
"cell_type": "markdown",
293+
"id": "5e062fca",
294+
"metadata": {
295+
"vscode": {
296+
"languageId": "plaintext"
297+
}
298+
},
299+
"source": [
300+
"### Snakemake workflow \n",
301+
"\n",
302+
"When running a Snakemake workflow, it is common to organize the workflow dependencies in a specific structure. \n",
303+
"```bash\n",
304+
"Project Folder\n",
305+
"\n",
306+
"├── Snakefile\n",
307+
"\n",
308+
"├── config.yml\n",
309+
"\n",
310+
"├── environment.yml\n",
311+
"\n",
312+
"├── data\n",
313+
"│ ├── file 1\n",
314+
"│ └── file 2...\n",
315+
"```\n",
316+
"**Snakefile:** A Snakefile is the main file used in Snakemake to define a workflow. The commands to be executed, the input and output files and the dependencies of each step are defined as rules in this file. This file must be present in the working directory; if named Snakefile, Snakemake will automatically recognize it as the workflow definition file. If named differently, you must use the -s flag to specify the file. \n",
317+
"\n",
318+
"**config.yaml:** The config.yaml file is used to store configuration parameters that can be easily accessed and utilized throughout the workflow. This file allows you to define various settings, paths, parameters, and other variables that your Snakemake rules might need.\n",
319+
"\n",
320+
"**environment.yml:** The environment.yml file defines the software environment required to run the Snakemake workflow include package names and versions. \n",
321+
"\n"
322+
]
323+
},
293324
{
294325
"cell_type": "markdown",
295326
"id": "9942bfc7",
@@ -299,9 +330,7 @@
299330
"\n",
300331
"When submitting a Snakemake workflow to the slurm scheduler integrated in AWS ParallelCluster, you can submit the workflow by using the `--executor` flag and specifying the `pcluster-slurm` plugin.\n",
301332
"\n",
302-
"1. Create a Snakefile within a project directory\n",
303-
"\n",
304-
"**Snakefile** A Snakefile is the main file used in Snakemake to define a workflow. The commands to be executed, the input and output files and the dependencies of each step are defined as rules in this file. "
333+
"1. Create a Snakefile within a project directory\n"
305334
]
306335
},
307336
{
@@ -415,10 +444,21 @@
415444
"## Submitting a bioinformatics Snakemake workflow to the Slurm cluster\n",
416445
"\n",
417446
"In this example, we will use Snakemake and the pcluster-slurm plugin to run a Bioinformatics pipeline. \n",
418-
"\n",
447+
"\n"
448+
]
449+
},
450+
{
451+
"cell_type": "markdown",
452+
"id": "0d0d944f",
453+
"metadata": {
454+
"vscode": {
455+
"languageId": "plaintext"
456+
}
457+
},
458+
"source": [
419459
"### Download the input data\n",
420460
"\n",
421-
"The input data consists of raw fastq files. Use the `curl` command to download the data from a public NIGMS google storage bucket. "
461+
"The input data consists of raw fastq files. Use the curl command to download the data from a public NIGMS google storage bucket."
422462
]
423463
},
424464
{
@@ -452,8 +492,6 @@
452492
"source": [
453493
"### Create an `environment.yml` file\n",
454494
"\n",
455-
"**environment.yml:** The environment.yml file defines the software environment required to run the Snakemake workflow include package names and versions. \n",
456-
"\n",
457495
"```bash\n",
458496
"vi environment.yml\n",
459497
"```"
@@ -487,8 +525,6 @@
487525
"source": [
488526
"### Create a `config.yaml` file\n",
489527
"\n",
490-
"**config.yaml** The config.yaml file is used to store configuration parameters that can be easily accessed and utilized throughout the workflow. This file allows you to define various settings, paths, parameters, and other variables that your Snakemake rules might need.\n",
491-
"\n",
492528
"```bash\n",
493529
"vi config.yaml\n",
494530
"```"
