From 6e673bd90b9f59ddfe2f4332372f7926bc5a0a72 Mon Sep 17 00:00:00 2001 From: Stefan Janssen Date: Wed, 14 Feb 2024 15:40:28 +0100 Subject: [PATCH 1/6] extend install instructions to also install g++ (#3354) * Update CHANGELOG.md * Update INSTALL.md Explicitly add g++ as install dependency, as I recently ran into issues with missing limit.h header files. This issue was because g++ was not available, only gcc. --------- Co-authored-by: Antonio Gonzalez --- CHANGELOG.md | 2 +- INSTALL.md | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 1756c7238..24ad9a78a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -13,7 +13,7 @@ Deployed on January 8th, 2024 * Workflow definitions can now use sample or preparation information columns/values to differentiate between them. * Updated the Adapter and host filtering plugin (qp-fastp-minimap2) to v2023.12 addressing a bug in adapter filtering; [more information](https://qiita.ucsd.edu/static/doc/html/processingdata/qp-fastp-minimap2.html). * Other fixes: [3334](https://github.com/qiita-spots/qiita/pull/3334), [3338](https://github.com/qiita-spots/qiita/pull/3338). Thank you @sjanssen2. -* The internal Sequence Processing Pipeline is now using the human pan-genome reference, together with the GRCh38 genome + PhiX and CHM13 genome for human host filtering. +* The internal Sequence Processing Pipeline is now using the human pan-genome reference, together with the GRCh38 genome + PhiX and T2T-CHM13v2.0 genome for human host filtering. Version 2023.10 diff --git a/INSTALL.md b/INSTALL.md index f23e85f38..6d129c06e 100644 --- a/INSTALL.md +++ b/INSTALL.md @@ -162,9 +162,9 @@ Navigate to the cloned directory and ensure your conda environment is active: cd qiita source activate qiita ``` -If you are using Ubuntu or a Windows Subsystem for Linux (WSL), you will need to ensure that you have a C++ compiler and that development libraries and include files for PostgreSQL are available. Type `cc` into your system to ensure that it doesn't result in `program not found`. The following commands will install a C++ compiler and `libpq-dev`: +If you are using Ubuntu or a Windows Subsystem for Linux (WSL), you will need to ensure that you have a C++ compiler and that development libraries and include files for PostgreSQL are available. Type `cc` into your system to ensure that it doesn't result in `program not found`. If you use the the GNU Compiler Collection, make sure to have `gcc` and `g++` available. The following commands will install a C++ compiler and `libpq-dev`: ```bash -sudo apt install gcc # alternatively, you can install clang instead +sudo apt install gcc g++ # alternatively, you can install clang instead sudo apt-get install libpq-dev ``` Install Qiita (this occurs through setuptools' `setup.py` file in the qiita directory): From f12e9376e0a43a7d2a6a1d1b36eb6ef6045ae44e Mon Sep 17 00:00:00 2001 From: Stefan Janssen Date: Thu, 15 Feb 2024 15:38:57 +0100 Subject: [PATCH 2/6] Update INSTALL.md (#3357) travis is no longer used, thus better point to the github action workflow files --- INSTALL.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/INSTALL.md b/INSTALL.md index 6d129c06e..071de9705 100644 --- a/INSTALL.md +++ b/INSTALL.md @@ -178,7 +178,7 @@ At this point, Qiita will be installed and the system will start. However, you will need to install plugins in order to process any kind of data. 
For a list of available plugins, visit the [Qiita Spots](https://github.com/qiita-spots) github organization. Each of the plugins have their own installation instructions, we -suggest looking at each individual .travis.yml file to see detailed installation +suggest looking at each individual .github/workflows/qiita-plugin-ci.yml file to see detailed installation instructions. Note that the most common plugins are: - [qtp-biom](https://github.com/qiita-spots/qtp-biom) - [qtp-sequencing](https://github.com/qiita-spots/qtp-sequencing) From f50fa22df60b07f76785255742526227e453c41f Mon Sep 17 00:00:00 2001 From: Stefan Janssen Date: Thu, 15 Feb 2024 15:39:09 +0100 Subject: [PATCH 3/6] Patch 5 (#3356) * Update CHANGELOG.md * Update INSTALL.md In the "Configure NGINX and supervisor" section, brackets for links were flipped --------- Co-authored-by: Antonio Gonzalez --- INSTALL.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/INSTALL.md b/INSTALL.md index 071de9705..a2d0152f2 100644 --- a/INSTALL.md +++ b/INSTALL.md @@ -224,15 +224,15 @@ export REDBIOM_HOST=http://my_host.com:7379 ## Configure NGINX and supervisor -(NGINX)[https://www.nginx.com/] is not a requirement for Qiita development but it's highly recommended for deploys as this will allow us -to have multiple workers. Note that we are already installing (NGINX)[https://www.nginx.com/] within the Qiita conda environment; also, -that Qiita comes with an example (NGINX)[https://www.nginx.com/] config file: `qiita_pet/nginx_example.conf`, which is used in the Travis builds. +[NGINX](https://www.nginx.com/) is not a requirement for Qiita development but it's highly recommended for deploys as this will allow us +to have multiple workers. Note that we are already installing [NGINX](https://www.nginx.com/) within the Qiita conda environment; also, +that Qiita comes with an example [NGINX](https://www.nginx.com/) config file: `qiita_pet/nginx_example.conf`, which is used in the Travis builds. -Now, (supervisor)[https://github.com/Supervisor/supervisor] will allow us to start all the workers we want based on its configuration file; and we -need that both the (NGINX)[https://www.nginx.com/] and (supervisor)[https://github.com/Supervisor/supervisor] config files to match. For our Travis +Now, [supervisor](https://github.com/Supervisor/supervisor) will allow us to start all the workers we want based on its configuration file; and we +need that both the [NGINX](https://www.nginx.com/) and [supervisor](https://github.com/Supervisor/supervisor) config files to match. For our Travis testing we are creating 3 workers: 21174 for master and 21175-6 as a regular workers. -If you are using (NGINX)[https://www.nginx.com/] via conda, you are going to need to create the NGINX folder within the environment; thus run: +If you are using [NGINX](https://www.nginx.com/) via conda, you are going to need to create the NGINX folder within the environment; thus run: ```bash mkdir -p ${CONDA_PREFIX}/var/run/nginx/ From 58e15a470919b1183c6caf5af5f36f990f37f9ce Mon Sep 17 00:00:00 2001 From: Stefan Janssen Date: Thu, 15 Feb 2024 15:39:29 +0100 Subject: [PATCH 4/6] Update INSTALL.md (#3358) it might be worth letting the user know, that there is a default admin account that he/she can use. 
Especially useful to see the list of errors --- INSTALL.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/INSTALL.md b/INSTALL.md index a2d0152f2..89b63cabb 100644 --- a/INSTALL.md +++ b/INSTALL.md @@ -256,7 +256,7 @@ Start the qiita server: qiita pet webserver start ``` -If all the above commands executed correctly, you should be able to access Qiita by going in your browser to https://localhost:21174 if you are not using NGINX, or https://localhost:8383 if you are using NGINX, to login use `test@foo.bar` and `password` as the credentials. (In the future, we will have a *single user mode* that will allow you to use a local Qiita server without logging in. You can track progress on this on issue [#920](https://github.com/biocore/qiita/issues/920).) +If all the above commands executed correctly, you should be able to access Qiita by going in your browser to https://localhost:21174 if you are not using NGINX, or https://localhost:8383 if you are using NGINX, to login use `test@foo.bar` and `password` as the credentials. (Login as `admin@foo.bar` with `password` to see admin functionality. In the future, we will have a *single user mode* that will allow you to use a local Qiita server without logging in. You can track progress on this on issue [#920](https://github.com/biocore/qiita/issues/920).) From 1aec6b0014cd9d68e53f9fcc07e1a9da04aceb0c Mon Sep 17 00:00:00 2001 From: Antonio Gonzalez Date: Tue, 20 Feb 2024 11:52:11 -0700 Subject: [PATCH 5/6] add rna_copy_counts (#3351) * add rna_copy_counts * RNA -> Calculate RNA * v != '*' * v != '*' : fix conditional * prep job only display if success * allowing for multiple inputs in workflow * fix error * just one element * rollback add_default_workflow * simplify add_default_workflow --- qiita_db/metadata_template/prep_template.py | 208 +++++++++--------- qiita_db/processing_job.py | 2 +- .../handlers/study_handlers/prep_template.py | 9 +- 3 files changed, 109 insertions(+), 110 deletions(-) diff --git a/qiita_db/metadata_template/prep_template.py b/qiita_db/metadata_template/prep_template.py index f39aaacb7..d05493d3f 100644 --- a/qiita_db/metadata_template/prep_template.py +++ b/qiita_db/metadata_template/prep_template.py @@ -793,6 +793,7 @@ def _get_node_info(workflow, node): def _get_predecessors(workflow, node): # recursive method to get predecessors of a given node pred = [] + for pnode in workflow.graph.predecessors(node): pred = _get_predecessors(workflow, pnode) cxns = {x[0]: x[2] @@ -864,7 +865,8 @@ def _get_predecessors(workflow, node): if wk_params['sample']: df = ST(self.study_id).to_dataframe(samples=list(self)) for k, v in wk_params['sample'].items(): - if k not in df.columns or v not in df[k].unique(): + if k not in df.columns or (v != '*' and v not in + df[k].unique()): reqs_satisfied = False else: total_conditions_satisfied += 1 @@ -872,7 +874,8 @@ def _get_predecessors(workflow, node): if wk_params['prep']: df = self.to_dataframe() for k, v in wk_params['prep'].items(): - if k not in df.columns or v not in df[k].unique(): + if k not in df.columns or (v != '*' and v not in + df[k].unique()): reqs_satisfied = False else: total_conditions_satisfied += 1 @@ -890,117 +893,112 @@ def _get_predecessors(workflow, node): # let's just keep one, let's give it preference to the one with the # most total_conditions_satisfied - workflows = sorted(workflows, key=lambda x: x[0], reverse=True)[:1] + _, wk = sorted(workflows, key=lambda x: x[0], reverse=True)[0] missing_artifacts = dict() - for _, wk in workflows: - 
missing_artifacts[wk] = dict() - for node, degree in wk.graph.out_degree(): - if degree != 0: - continue - mscheme = _get_node_info(wk, node) - if mscheme not in merging_schemes: - missing_artifacts[wk][mscheme] = node - if not missing_artifacts[wk]: - del missing_artifacts[wk] + for node, degree in wk.graph.out_degree(): + if degree != 0: + continue + mscheme = _get_node_info(wk, node) + if mscheme not in merging_schemes: + missing_artifacts[mscheme] = node if not missing_artifacts: # raises option b. raise ValueError('This preparation is complete') # 3. - for wk, wk_data in missing_artifacts.items(): - previous_jobs = dict() - for ma, node in wk_data.items(): - predecessors = _get_predecessors(wk, node) - predecessors.reverse() - cmds_to_create = [] - init_artifacts = None - for i, (pnode, cnode, cxns) in enumerate(predecessors): - cdp = cnode.default_parameter - cdp_cmd = cdp.command - params = cdp.values.copy() - - icxns = {y: x for x, y in cxns.items()} - reqp = {x: icxns[y[1][0]] - for x, y in cdp_cmd.required_parameters.items()} - cmds_to_create.append([cdp_cmd, params, reqp]) - - info = _get_node_info(wk, pnode) - if info in merging_schemes: - if set(merging_schemes[info]) >= set(cxns): - init_artifacts = merging_schemes[info] - break - if init_artifacts is None: - pdp = pnode.default_parameter - pdp_cmd = pdp.command - params = pdp.values.copy() - # verifying that the workflow.artifact_type is included - # in the command input types or raise an error - wkartifact_type = wk.artifact_type - reqp = dict() - for x, y in pdp_cmd.required_parameters.items(): - if wkartifact_type not in y[1]: - raise ValueError(f'{wkartifact_type} is not part ' - 'of this preparation and cannot ' - 'be applied') - reqp[x] = wkartifact_type - - cmds_to_create.append([pdp_cmd, params, reqp]) - - if starting_job is not None: - init_artifacts = { - wkartifact_type: f'{starting_job.id}:'} - else: - init_artifacts = {wkartifact_type: self.artifact.id} - - cmds_to_create.reverse() - current_job = None - loop_starting_job = starting_job - for i, (cmd, params, rp) in enumerate(cmds_to_create): - if loop_starting_job is not None: - previous_job = loop_starting_job - loop_starting_job = None - else: - previous_job = current_job - if previous_job is None: - req_params = dict() - for iname, dname in rp.items(): - if dname not in init_artifacts: - msg = (f'Missing Artifact type: "{dname}" in ' - 'this preparation; this might be due ' - 'to missing steps or not having the ' - 'correct raw data.') - # raises option c. 
- raise ValueError(msg) - req_params[iname] = init_artifacts[dname] - else: - req_params = dict() - connections = dict() - for iname, dname in rp.items(): - req_params[iname] = f'{previous_job.id}{dname}' - connections[dname] = iname - params.update(req_params) - job_params = qdb.software.Parameters.load( - cmd, values_dict=params) - - if params in previous_jobs.values(): - for x, y in previous_jobs.items(): - if params == y: - current_job = x + previous_jobs = dict() + for ma, node in missing_artifacts.items(): + predecessors = _get_predecessors(wk, node) + predecessors.reverse() + cmds_to_create = [] + init_artifacts = None + for i, (pnode, cnode, cxns) in enumerate(predecessors): + cdp = cnode.default_parameter + cdp_cmd = cdp.command + params = cdp.values.copy() + + icxns = {y: x for x, y in cxns.items()} + reqp = {x: icxns[y[1][0]] + for x, y in cdp_cmd.required_parameters.items()} + cmds_to_create.append([cdp_cmd, params, reqp]) + + info = _get_node_info(wk, pnode) + if info in merging_schemes: + if set(merging_schemes[info]) >= set(cxns): + init_artifacts = merging_schemes[info] + break + if init_artifacts is None: + pdp = pnode.default_parameter + pdp_cmd = pdp.command + params = pdp.values.copy() + # verifying that the workflow.artifact_type is included + # in the command input types or raise an error + wkartifact_type = wk.artifact_type + reqp = dict() + for x, y in pdp_cmd.required_parameters.items(): + if wkartifact_type not in y[1]: + raise ValueError(f'{wkartifact_type} is not part ' + 'of this preparation and cannot ' + 'be applied') + reqp[x] = wkartifact_type + + cmds_to_create.append([pdp_cmd, params, reqp]) + + if starting_job is not None: + init_artifacts = { + wkartifact_type: f'{starting_job.id}:'} + else: + init_artifacts = {wkartifact_type: self.artifact.id} + + cmds_to_create.reverse() + current_job = None + loop_starting_job = starting_job + for i, (cmd, params, rp) in enumerate(cmds_to_create): + if loop_starting_job is not None: + previous_job = loop_starting_job + loop_starting_job = None + else: + previous_job = current_job + if previous_job is None: + req_params = dict() + for iname, dname in rp.items(): + if dname not in init_artifacts: + msg = (f'Missing Artifact type: "{dname}" in ' + 'this preparation; this might be due ' + 'to missing steps or not having the ' + 'correct raw data.') + # raises option c. 
+ raise ValueError(msg) + req_params[iname] = init_artifacts[dname] + else: + req_params = dict() + connections = dict() + for iname, dname in rp.items(): + req_params[iname] = f'{previous_job.id}{dname}' + connections[dname] = iname + params.update(req_params) + job_params = qdb.software.Parameters.load( + cmd, values_dict=params) + + if params in previous_jobs.values(): + for x, y in previous_jobs.items(): + if params == y: + current_job = x + else: + if workflow is None: + PW = qdb.processing_job.ProcessingWorkflow + workflow = PW.from_scratch(user, job_params) + current_job = [ + j for j in workflow.graph.nodes()][0] else: - if workflow is None: - PW = qdb.processing_job.ProcessingWorkflow - workflow = PW.from_scratch(user, job_params) - current_job = [ - j for j in workflow.graph.nodes()][0] + if previous_job is None: + current_job = workflow.add( + job_params, req_params=req_params) else: - if previous_job is None: - current_job = workflow.add( - job_params, req_params=req_params) - else: - current_job = workflow.add( - job_params, req_params=req_params, - connections={previous_job: connections}) - previous_jobs[current_job] = params + current_job = workflow.add( + job_params, req_params=req_params, + connections={previous_job: connections}) + previous_jobs[current_job] = params return workflow diff --git a/qiita_db/processing_job.py b/qiita_db/processing_job.py index dcce029a6..a1f7e5baa 100644 --- a/qiita_db/processing_job.py +++ b/qiita_db/processing_job.py @@ -1020,7 +1020,7 @@ def submit(self, parent_job_id=None, dependent_jobs_list=None): # names to know if it should be executed differently and the # plugin should let Qiita know that a specific command should be ran # as job array or not - cnames_to_skip = {'Calculate Cell Counts'} + cnames_to_skip = {'Calculate Cell Counts', 'Calculate RNA Copy Counts'} if 'ENVIRONMENT' in plugin_env_script and cname not in cnames_to_skip: # the job has to be in running state so the plugin can change its` # status diff --git a/qiita_pet/handlers/study_handlers/prep_template.py b/qiita_pet/handlers/study_handlers/prep_template.py index 0af9949e3..167f981bd 100644 --- a/qiita_pet/handlers/study_handlers/prep_template.py +++ b/qiita_pet/handlers/study_handlers/prep_template.py @@ -81,11 +81,12 @@ def get(self): res['creation_job_filename'] = fp['filename'] res['creation_job_filename_body'] = fp['body'] summary = None - if res['creation_job'].outputs: - summary = relpath( + if res['creation_job'].status == 'success': + if res['creation_job'].outputs: # [0] is the id, [1] is the filepath - res['creation_job'].outputs['output'].html_summary_fp[1], - qiita_config.base_data_dir) + _file = res['creation_job'].outputs[ + 'output'].html_summary_fp[1] + summary = relpath(_file, qiita_config.base_data_dir) res['creation_job_artifact_summary'] = summary self.render('study_ajax/prep_summary.html', **res) From 57b84cf1d866b0d7f3cc84693c0e4eca894bb1c4 Mon Sep 17 00:00:00 2001 From: Stefan Janssen Date: Tue, 20 Feb 2024 19:53:56 +0100 Subject: [PATCH 6/6] fix environment_script for private plugins (#3359) * fix environment_script for private plugins I found that patch 54.sql for the test database uses an old conda activate mechanism for travis. We might want to fix this to the latest github action method of choice? 
* Update qiita-ci.yml * Update qiita-ci.yml fix quoting --- .github/workflows/qiita-ci.yml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/.github/workflows/qiita-ci.yml b/.github/workflows/qiita-ci.yml index bbc6c25f5..6c3b06be1 100644 --- a/.github/workflows/qiita-ci.yml +++ b/.github/workflows/qiita-ci.yml @@ -154,6 +154,8 @@ jobs: echo "5. Setting up qiita" conda activate qiita + # adapt environment_script for private qiita plugins from travis to github actions. + sed 's#export PATH="/home/travis/miniconda3/bin:$PATH"; source #source /home/runner/.profile; conda #' -i qiita_db/support_files/patches/54.sql qiita-env make --no-load-ontologies qiita-test-install qiita plugins update
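
For reference, a minimal sketch of what that `sed` line does to an `environment_script` value stored in `qiita_db/support_files/patches/54.sql`. The before/after values shown in the comments are illustrative assumptions (the plugin activation target `qiita` is hypothetical; the real values live in the SQL patch) — only the activation prefix is swapped, the per-plugin part after `source `/`conda ` is left intact:

```bash
# Illustrative before/after of the substitution added in the workflow step above.
# Assumed old Travis-style value stored in the test database patch:
#   export PATH="/home/travis/miniconda3/bin:$PATH"; source activate qiita
# Assumed result after the rewrite, for the GitHub Actions runner:
#   source /home/runner/.profile; conda activate qiita
sed 's#export PATH="/home/travis/miniconda3/bin:$PATH"; source #source /home/runner/.profile; conda #' \
    -i qiita_db/support_files/patches/54.sql
```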