From f8bbd279ee724189c768722b572db8209bcee122 Mon Sep 17 00:00:00 2001
From: Andrew Reid
Date: Tue, 13 Aug 2024 16:49:42 -0400
Subject: [PATCH 1/4] Draft LLNL blog post.

---
 .../08/2024-08-13-llnl-workshop-blog-post.md | 73 +++++++++++++++++++
 1 file changed, 73 insertions(+)
 create mode 100644 _posts/2024/08/2024-08-13-llnl-workshop-blog-post.md

diff --git a/_posts/2024/08/2024-08-13-llnl-workshop-blog-post.md b/_posts/2024/08/2024-08-13-llnl-workshop-blog-post.md
new file mode 100644
index 000000000..c9c165278
--- /dev/null
+++ b/_posts/2024/08/2024-08-13-llnl-workshop-blog-post.md
@@ -0,0 +1,73 @@
+---
+layout: page
+authors: ["Andrew Reid", Trevor Keller", "Jane Herriman"]
+teaser: "We ran the full user workshop at LLNL!"
+title: "HPC Carpentry at LLNL"
+date: 2024-08-13
+time: "09:00:00"
+tags: ["HPC Carpentry", "Lesson Program Implementation"]
+---
+
+## HPC Carpentry at LLNL
+
+In the first week of June, 2024, instructors from [HPC Carpentry][hpcc]
+taught our full workflow workshop for the first time, not once but twice,
+over a four-day stint at the Lawrence Livermore National Laboratory.
+
+It was immensely rewarding to see all this material come together in
+one place, and I think we served our learners pretty well, and learned
+a few lessons relevant to future workshops.
+
+### Workshop structure
+
+Each workshop ran over two days. On the first day, we did the [Unix Shell
+intro lesson][shell] from Software Carpentry in the morning, and our own
+[HPC Intro][hpcc] lesson in the afternoon. On the second day, we did a
+variant of the [workflow lesson][work], adapted for the Maestro workflow
+tool (rather than the Snakemake default), which is developed at LLNL.
+
+The instructor team consisted of Andrew Reid and Trevor Keller from
+the HPC Carpentry steering committee, and Jane Herriman from LLNL,
+along with helpers from the LLNL community.
+
+### Learners
+
+Learners had a range of backgrounds, but lessons generally went
+at a slightly faster pace than expected, without leaving anyone
+behind. This was in part because access to the LLNL system was by means
+of pre-authorized RSA tokens, removing a lot of the friction from the
+initial connection process that has been time-consuming in other versions
+of the workshop.
+
+### Lesson Feedback
+
+One major take-away is that the workflow lesson in particular is
+vulnerable to learners losing the thread if they miss a step. This lesson,
+in either its Maestro or Snakemake version, builds up an increasingly
+sophisticated workflow specification file, incrementally demonstrating
+workflow concepts in the context of the tool. Consequently, a learner
+who misses a step and falls behind can find themselves unable to recover,
+since the remainder of the lesson builds on precisely the content that was
+missed. The workflow lesson differs in this respect from the shell intro
+or HPC Intro lesson, where later steps can better stand on their own.
+
+The solution to this, which we already started to implement for the
+second workshop, was to have a shared on-line notepad with "checkpoint"
+versions of the file, to which learners can refer if they fall behind,
+with helpers bridging the content gap for them.
+
+The hands-on Carpentries approach proved itself once again, building
+muscle memory and vocabulary in learners, who could then move on to their
+LLNL summer research projects with greater confidence in their ability
+to productively use the shared high-performance computing resources.
+
+For the project, it was confirmation that the HPC User workshop can
+work, including the valuable feedback about checkpoint files and a
+shared notepad. We look forward to teaching this workshop more, and
+getting it out of beta status and into our main curriculum.
+
+
+[hpcc]: https://hpc-carpentry.org/
+[shell]: https://swcarpentry.github.io/shell-novice
+[intro]: https://carpentries-incubator.github.io/hpc-intro/
+[work]: https://carpentries-incubator.github.io/hpc-workflows/

From 0dc10d2eed358b214a3bbdc59e18f06856d3ae0f Mon Sep 17 00:00:00 2001
From: Trevor Keller
Date: Tue, 13 Aug 2024 17:36:08 -0400
Subject: [PATCH 2/4] Some details from T's terrible memory

---
 .../08/2024-08-13-llnl-workshop-blog-post.md | 38 ++++++++++++++-----
 1 file changed, 29 insertions(+), 9 deletions(-)

diff --git a/_posts/2024/08/2024-08-13-llnl-workshop-blog-post.md b/_posts/2024/08/2024-08-13-llnl-workshop-blog-post.md
index c9c165278..259e3d877 100644
--- a/_posts/2024/08/2024-08-13-llnl-workshop-blog-post.md
+++ b/_posts/2024/08/2024-08-13-llnl-workshop-blog-post.md
@@ -4,7 +4,7 @@ authors: ["Andrew Reid", Trevor Keller", "Jane Herriman"]
 teaser: "We ran the full user workshop at LLNL!"
 title: "HPC Carpentry at LLNL"
 date: 2024-08-13
-time: "09:00:00"
+time: "12:00:00"
 tags: ["HPC Carpentry", "Lesson Program Implementation"]
 ---
 
@@ -18,26 +18,43 @@ It was immensely rewarding to see all this material come together in
 one place, and I think we served our learners pretty well, and learned
 a few lessons relevant to future workshops.
 
-### Workshop structure
+### Workshop Structure
 
 Each workshop ran over two days. On the first day, we did the [Unix Shell
-intro lesson][shell] from Software Carpentry in the morning, and our own
+intro][shell] lesson from Software Carpentry in the morning, and our own
 [HPC Intro][hpcc] lesson in the afternoon. On the second day, we did a
 variant of the [workflow lesson][work], adapted for the Maestro workflow
-tool (rather than the Snakemake default), which is developed at LLNL.
+tool (rather than Snakemake), because it is developed and widely used at LLNL.
 
 The instructor team consisted of Andrew Reid and Trevor Keller from
 the HPC Carpentry steering committee, and Jane Herriman from LLNL,
 along with helpers from the LLNL community.
 
+#### Maestro
+
+Maestro is a capable workflow engine, and one we would not have explored had
+Jane not ported the Snakemake lesson so expertly. Maestro favors
+reproducibility, running every step of the task from scratch at every
+invocation. This is a significant difference from Snakemake which, like Make,
+does not re-execute completed "targets." A significant benefit of Maestro is
+that the tool does not persist while jobs execute: it generates and submits
+native Slurm jobs, with tooling in place to check the status of running
+workflows. This is much more HPC-compatible, for large-scale or time-consuming
+jobs.
+
 ### Learners
 
-Learners had a range of backgrounds, but lessons generally went
+Learners had a range of backgrounds, from undergraduate bio-informatics
+students to experienced Linux HPC users. The lessons generally went
 at a slightly faster pace than expected, without leaving anyone
 behind. This was in part because access to the LLNL system was by means
 of pre-authorized RSA tokens, removing a lot of the friction from the
 initial connection process that has been time-consuming in other versions
-of the workshop.
+of the workshop. The instructors live-coded plenty of mistakes, opening
+discussions on some interesting tangential topics. LLNL runs a pool of "login
+nodes," rather than a single machine, which made for interesting, early
+discussion of networked filesystems. The sheer number of machines also made the
+output of `sinfo` tricky to comprehend at-a-glance, which is awesome.
 
 ### Lesson Feedback
 
@@ -48,13 +65,15 @@ sophisticated workflow specification file, incrementally demonstrating
 workflow concepts in the context of the tool.
Consequently, a learner
 who misses a step and falls behind can find themselves unable to recover,
 since the remainder of the lesson builds on precisely the content that was
-missed. The workflow lesson differs in this respect from the shell intro
-or HPC Intro lesson, where later steps can better stand on their own.
+missed. The Workflow lesson differs in this respect from the Shell and
+HPC intro lessons, where later steps can better stand on their own.
 
 The solution to this, which we already started to implement for the
 second workshop, was to have a shared on-line notepad with "checkpoint"
 versions of the file, to which learners can refer if they fall behind,
-with helpers bridging the content gap for them.
+with helpers bridging the content gap for them. Also, LLNL supports and
+uses the [`give`][give] tool, allowing users to easily pass files around:
+it's nifty!
 
 The hands-on Carpentries approach proved itself once again, building
 muscle memory and vocabulary in learners, who could then move on to their
@@ -68,6 +87,7 @@ getting it out of beta status and into our main curriculum.
 
 
 [hpcc]: https://hpc-carpentry.org/
+[give]: https://github.com/hpc/give
 [shell]: https://swcarpentry.github.io/shell-novice
 [intro]: https://carpentries-incubator.github.io/hpc-intro/
 [work]: https://carpentries-incubator.github.io/hpc-workflows/

From 8b87c88a3d47b6c7e0467e4c8e15fa60e413742e Mon Sep 17 00:00:00 2001
From: Trevor Keller
Date: Tue, 13 Aug 2024 18:04:58 -0400
Subject: [PATCH 3/4] tmux!

---
 .../08/2024-08-13-llnl-workshop-blog-post.md | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/_posts/2024/08/2024-08-13-llnl-workshop-blog-post.md b/_posts/2024/08/2024-08-13-llnl-workshop-blog-post.md
index 259e3d877..32ecbd173 100644
--- a/_posts/2024/08/2024-08-13-llnl-workshop-blog-post.md
+++ b/_posts/2024/08/2024-08-13-llnl-workshop-blog-post.md
@@ -16,13 +16,14 @@ over a four-day stint at the Lawrence Livermore National Laboratory.
 
 It was immensely rewarding to see all this material come together in
 one place, and I think we served our learners pretty well, and learned
-a few lessons relevant to future workshops.
+a few lessons relevant to future workshops. Traveling to teach in person,
+while not without hiccups, was extremely worthwhile.
 
 ### Workshop Structure
 
 Each workshop ran over two days. On the first day, we did the [Unix Shell
 intro][shell] lesson from Software Carpentry in the morning, and our own
-[HPC Intro][hpcc] lesson in the afternoon. On the second day, we did a
+[HPC Intro][intro] lesson in the afternoon. On the second day, we did a
 variant of the [workflow lesson][work], adapted for the Maestro workflow
 tool (rather than Snakemake), because it is developed and widely used at LLNL.
 
@@ -31,6 +31,13 @@ The instructor team consisted of Andrew Reid and Trevor Keller from
 the HPC Carpentry steering committee, and Jane Herriman from LLNL,
 along with helpers from the LLNL community.
 
+While split-terminal tools exist, we used vanilla [tmux][tmux] with two
+terminals attached to the same session. This allowed the instructors to type on
+their own laptop, with the lesson webpage alongside, while learners followed
+along on the enhanced terminal displayed at the front of the room. Note:
+to "scroll up" in `tmux`, press Ctrl+b, [,
+then arrow-key around.
+
 #### Maestro
 
 Maestro is a capable workflow engine, and one we would not have explored had
@@ -86,8 +94,9 @@ shared notepad. We look forward to teaching this workshop more, and
 getting it out of beta status and into our main curriculum.
 
 
-[hpcc]: https://hpc-carpentry.org/
 [give]: https://github.com/hpc/give
+[hpcc]: https://hpc-carpentry.org/
+[intro]: https://hpc-workshops.github.io/llnl-hpc-intro/
 [shell]: https://swcarpentry.github.io/shell-novice
-[intro]: https://carpentries-incubator.github.io/hpc-intro/
-[work]: https://carpentries-incubator.github.io/hpc-workflows/
+[tmux]: https://github.com/tmux/tmux/wiki
+[work]: https://xorjane.github.io/maestro-workflow-lesson/

From ce137a7b3fbe6d3319cccd3f634a3d36dd31592d Mon Sep 17 00:00:00 2001
From: Jane Herriman
Date: Tue, 13 Aug 2024 16:49:42 -0700
Subject: [PATCH 4/4] Update 2024-08-13-llnl-workshop-blog-post.md

---
 .../08/2024-08-13-llnl-workshop-blog-post.md | 37 ++++++++++---------
 1 file changed, 19 insertions(+), 18 deletions(-)

diff --git a/_posts/2024/08/2024-08-13-llnl-workshop-blog-post.md b/_posts/2024/08/2024-08-13-llnl-workshop-blog-post.md
index 32ecbd173..0fa9f7e24 100644
--- a/_posts/2024/08/2024-08-13-llnl-workshop-blog-post.md
+++ b/_posts/2024/08/2024-08-13-llnl-workshop-blog-post.md
@@ -11,13 +11,14 @@ tags: ["HPC Carpentry", "Lesson Program Implementation"]
 ## HPC Carpentry at LLNL
 
 In the first week of June, 2024, instructors from [HPC Carpentry][hpcc]
-taught our full workflow workshop for the first time, not once but twice,
-over a four-day stint at the Lawrence Livermore National Laboratory.
+taught our full workflow workshop for the first time. Over a four-day
+stint at Lawrence Livermore National Laboratory, we delivered this
+content not once, but twice!
 
 It was immensely rewarding to see all this material come together in
-one place, and I think we served our learners pretty well, and learned
-a few lessons relevant to future workshops. Traveling to teach in person,
-while not without hiccups, was extremely worthwhile.
+one place. Traveling to teach in person, while not without hiccups, was
+extremely worthwhile. We believe we served our learners pretty well, and
+we learned a few lessons relevant to future workshops.
 
 ### Workshop Structure
 
@@ -25,7 +26,7 @@ Each workshop ran over two days. On the first day, we did the [Unix Shell
 intro][shell] lesson from Software Carpentry in the morning, and our own
 [HPC Intro][intro] lesson in the afternoon. On the second day, we did a
 variant of the [workflow lesson][work], adapted for the Maestro workflow
-tool (rather than Snakemake), because it is developed and widely used at LLNL.
+tool (rather than Snakemake), because it is developed and used at LLNL.
 
 The instructor team consisted of Andrew Reid and Trevor Keller from
 the HPC Carpentry steering committee, and Jane Herriman from LLNL,
@@ -33,10 +34,10 @@ along with helpers from the LLNL community.
 
 While split-terminal tools exist, we used vanilla [tmux][tmux] with two
 terminals attached to the same session. This allowed the instructors to type on
-their own laptop, with the lesson webpage alongside, while learners followed
-along on the enhanced terminal displayed at the front of the room. Note:
-to "scroll up" in `tmux`, press Ctrl+b, [,
-then arrow-key around.
+their own laptop while referencing the lesson webpage and selectively sharing
+the terminal. Learners followed along on the enhanced terminal displayed at the
+front of the room. Note: to "scroll up" in `tmux`, press
+Ctrl+b, [, then arrow-key around.
 
 #### Maestro
 
@@ -55,14 +56,14 @@ jobs.
 
 ### Learners
 
 Learners had a range of backgrounds, from undergraduate bio-informatics
 students to experienced Linux HPC users. The lessons generally went
 at a slightly faster pace than expected, without leaving anyone
-behind.
This was in part because access to the LLNL system was by means
-of pre-authorized RSA tokens, removing a lot of the friction from the
-initial connection process that has been time-consuming in other versions
-of the workshop. The instructors live-coded plenty of mistakes, opening
+behind. This was in part because access to LLNL's system `Ruby` was by means
+of pre-authorized RSA tokens, removing a lot of the friction
+from the initial connection process that has been time-consuming in other
+versions of the workshop. The instructors live-coded plenty of mistakes, opening
 discussions on some interesting tangential topics. LLNL runs a pool of "login
-nodes," rather than a single machine, which made for interesting, early
-discussion of networked filesystems. The sheer number of machines also made the
-output of `sinfo` tricky to comprehend at-a-glance, which is awesome.
+nodes" per HPC system, rather than a single machine, which made for interesting,
+early discussion of networked filesystems. The sheer number of nodes also made
+the output of `sinfo` tricky to comprehend at-a-glance, which is awesome.
 
 ### Lesson Feedback
 
@@ -77,7 +78,7 @@ missed. The Workflow lesson differs in this respect from the Shell and
 HPC intro lessons, where later steps can better stand on their own.
 
 The solution to this, which we already started to implement for the
-second workshop, was to have a shared on-line notepad with "checkpoint"
+second workshop, was to have a shared online notepad with "checkpoint"
 versions of the file, to which learners can refer if they fall behind,
 with helpers bridging the content gap for them. Also, LLNL supports and
 uses the [`give`][give] tool, allowing users to easily pass files around: