Add relatedness test workflow #29
base: main

Conversation
Pull Request Overview
This PR adds a comprehensive relatedness test workflow that integrates the Somalier, Picard, Conpair, and Peddy tools for quality-control analysis. The workflow can process both BAM and VCF files to assess sample relationships and generate consolidated reports.
- Implements a multi-tool relatedness QC pipeline with Somalier, Picard CrosscheckFingerprints, Conpair, and optional Peddy
- Creates consolidated reporting script that evaluates expected relationships against configurable thresholds
- Provides environment specifications and example configuration for the complete workflow (a hypothetical config sketch follows below)
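For orientation, a minimal sketch of what such a configuration might contain. Every key and value below is an illustrative assumption, not the actual schema of `config/relatedness.yaml`:

```yaml
# Hypothetical example only; the real keys live in config/relatedness.yaml.
reference: resources/GRCh38.fa            # reference FASTA (assumed key)
somalier_sites: resources/sites.vcf.gz    # Somalier sites VCF (assumed key)
samples:
  tumorA: data/tumorA.bam
  normalA: data/normalA.bam
pairs:                                    # expected tumor/normal relationships
  - [tumorA, normalA]
thresholds:
  tn_relatedness: 0.8                     # illustrative value
  picard_mismatch_lod: -5.0               # value referenced in the review below
```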
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| workflow/scripts/relatedness_report.py | Main reporting script that processes outputs from all tools and generates TSV/HTML reports |
| workflow/rules/relatedness_test_day.smk | Snakemake workflow rules defining the complete relatedness testing pipeline |
| workflow/envs/*.yaml | Conda environment specifications for each tool (Somalier, Picard, Conpair, Peddy, reporting) |
| config/relatedness.yaml | Example configuration file showing required parameters and sample definitions |
```snakemake
    wildcards:
        t=[p[0] for p in tn_pairs()],
        n=[p[1] for p in tn_pairs()]


rule conpair_compare_all:
    input:
        expand(
            "results/conpair/{t}__{n}/concordance.tsv",
            t=[p[0] for p in tn_pairs()],
            n=[p[1] for p in tn_pairs()]
        ),
        expand(
            "results/conpair/{t}__{n}/summary.txt",
            t=[p[0] for p in tn_pairs()],
            n=[p[1] for p in tn_pairs()]
        )

use rule conpair_parse as conpair_parse_all with:
    wildcards:
        t=[p[0] for p in tn_pairs()],
        n=[p[1] for p in tn_pairs()]


use rule conpair_compare as conpair_compare_all with:
    wildcards:
        t=[p[0] for p in tn_pairs()],
        n=[p[1] for p in tn_pairs()]
```
Copilot AI commented on Sep 9, 2025:
The `use rule ... with:` syntax should override `input:` rather than `wildcards:` when expanding over multiple wildcard combinations; `wildcards:` is not a valid rule directive, so the current syntax will likely cause Snakemake execution errors.
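A minimal sketch of that fix, assuming `tn_pairs()` yields (tumor, normal) tuples; passing `zip` to `expand` keeps each tumor matched to its own normal instead of taking the cross product of the two lists:

```snakemake
# Hypothetical aggregate rule: collect per-pair Conpair outputs via input:.
rule conpair_parse_all:
    input:
        expand(
            "results/conpair/{t}__{n}/concordance.tsv",
            zip,
            t=[p[0] for p in tn_pairs()],
            n=[p[1] for p in tn_pairs()],
        )
```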
```python
except Exception:
    # Some versions write CSV-like; be lenient
    df = pd.read_csv(f)
df.columns = [c.lower() for c in df.columns]
```
Copilot AI commented on Sep 9, 2025:
Lowercasing all column names unconditionally discards the original spellings; if Conpair outputs use case-sensitive column names that downstream code references, those lookups will break.
```python
df.columns = [c.lower() for c in df.columns]
```
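One way to keep the originals recoverable, as a sketch (the rename-map approach is an assumption, not part of this PR):

```python
# Normalize for internal use, but retain a map from lowercased
# names back to the originals for any case-sensitive consumer.
norm_to_orig = {c.lower(): c for c in df.columns}
df = df.rename(columns={c: c.lower() for c in df.columns})
# e.g. recover the original spelling of a column when needed:
# orig_name = norm_to_orig["concordance"]
```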
| df["sample_a"] = pair[0] | ||
| df["sample_b"] = pair[1] |
Copilot AI commented on Sep 9, 2025:
This code assumes the directory name always splits into exactly two elements on `__`, but doesn't handle cases where the split yields fewer than two, which would cause an `IndexError`.
| df["sample_a"] = pair[0] | |
| df["sample_b"] = pair[1] | |
| if len(pair) >= 2: | |
| df["sample_a"] = pair[0] | |
| df["sample_b"] = pair[1] | |
| else: | |
| # Fallback: assign the whole name to sample_a, empty string to sample_b | |
| df["sample_a"] = pair[0] | |
| df["sample_b"] = "" |
```python
elif rel == "tumor_normal":
    if som_rel is not None and som_rel < THRESH["tn_relatedness"]:
        status = "FAIL"; fail_reasons.append(f"Somalier.relatedness={som_rel:.3f} < {THRESH['tn_relatedness']}")
    if pic is not None and pic["lod"] < 0 and pic["lod"] <= THRESH["picard_mismatch_lod"]:
```
Copilot AI commented on Sep 9, 2025:
The condition `pic["lod"] < 0 and pic["lod"] <= THRESH["picard_mismatch_lod"]` is redundant since `THRESH["picard_mismatch_lod"]` is -5.0: the first check `< 0` is always true whenever the second check `<= -5.0` is true. Simplify to just `pic["lod"] <= THRESH["picard_mismatch_lod"]`.
Suggested change:

```diff
-if pic is not None and pic["lod"] < 0 and pic["lod"] <= THRESH["picard_mismatch_lod"]:
+if pic is not None and pic["lod"] <= THRESH["picard_mismatch_lod"]:
```
Summary
Testing
- `python -m py_compile workflow/scripts/relatedness_report.py`
- `pytest -q`

https://chatgpt.com/codex/tasks/task_e_68ad9dc07cfc83318a7219f999bbe46b