Closed
17 commits
b978938
Merge pull request #1 from MoTrPAC/fix/python312-compatibility
mihirsamdarshi Jul 20, 2025
3257b73
feat: add Google Batch backend support
mihirsamdarshi Apr 8, 2024
8f7beec
fix: remove fully deprecated Genomics/Life Sciences APIs
mihirsamdarshi Jul 20, 2025
daa3a7e
refactor: continue removing Life Sciences API support, update for Bat…
mihirsamdarshi Jul 21, 2025
f048494
docs: update documentation and other scripts to remove Life Sciences …
mihirsamdarshi Jul 21, 2025
dc622a2
feat: add support for customizing Google Batch compute service account
mihirsamdarshi Jul 26, 2025
c9581b4
feat: add GCP compute service account support in tests and update tes…
mihirsamdarshi Jul 26, 2025
8cd04cc
feat: add GCP logging policy support and update Cromwell/Womtool vers…
mihirsamdarshi Jul 26, 2025
34611a5
chore: move pytest configuration file
mihirsamdarshi Jul 26, 2025
0e6dfd1
chore: add `slow` marker to tests and update pytest configuration
mihirsamdarshi Jul 27, 2025
f281698
refactor: complete migration to pyproject.toml/uv
mihirsamdarshi Jul 27, 2025
790cb6e
fix: GCS URL formatting in test and add `slow` marker to test
mihirsamdarshi Jul 27, 2025
3ca12da
refactor: migrate to pyproject.toml, use uv as the package manager
mihirsamdarshi Jul 21, 2025
9999194
docs: update GCP docs to remove Life Sciences API references and refl…
mihirsamdarshi Jul 27, 2025
5ead741
test: add `google_cloud` markers to GCP-related resource analysis tests
mihirsamdarshi Jul 27, 2025
3b5afb9
ci: migrate from CircleCI to GitHub Actions for pytest workflow
mihirsamdarshi Jul 21, 2025
11b9e29
build: update CI/CD configuration to run fast tests on pushes or PRs …
mihirsamdarshi Jul 27, 2025
107 changes: 0 additions & 107 deletions .circleci/config.yml

This file was deleted.

141 changes: 141 additions & 0 deletions .github/workflows/test.yml
@@ -0,0 +1,141 @@
+name: Test
+
+on:
+  push:
+    branches:
+      - master
+      - dev
+  pull_request:
+    branches:
+      - master
+      - dev
+
+jobs:
+  pytest:
+    name: Test (all)
+    if: github.ref == 'refs/heads/master' || (github.event_name == 'pull_request' && github.base_ref == 'master')
+    runs-on: linux-self-hosted
+
+    permissions:
+      id-token: write
+      contents: read
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Configure GCP credentials
+        id: google-auth
+        uses: google-github-actions/auth@v2
+        with:
+          token_format: access_token
+          workload_identity_provider: ${{ secrets.WORKLOAD_IDENTITY_PROVIDER }}
+          service_account: ${{ secrets.SERVICE_ACCOUNT }}
+
+      - name: Set up Cloud SDK
+        uses: google-github-actions/setup-gcloud@v2
+
+      - name: Set up Python 3.12
+        id: setup-python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.12'
+
+      - name: Install uv on Linux or macOS
+        run: curl -sSfL https://astral.sh/uv/install.sh | bash
+
+      - name: Find uv cache and compute cache suffix
+        run: |
+          {
+            echo "UV_CACHE=$(uv cache dir)"
+            echo "HASH_CACHE_SUFFIX=-${{ hashFiles('pyproject.toml') }}"
+          } >> $GITHUB_ENV
+
+      - name: Load uv cache
+        uses: actions/cache@v4
+        with:
+          path: ${{ env.UV_CACHE }}
+          key: |
+            uv-cache-${{ env.HASH_CACHE_SUFFIX }}
+            uv-cache
+
+      - name: Setup a venv
+        run: |
+          uv venv
+          # write venv path to GITHUB_ENV
+          echo "VIRTUAL_ENV=$(pwd)/.venv" >> $GITHUB_ENV
+          echo "$(pwd)/.venv/bin" >> $GITHUB_PATH
+
+      - name: Install dependencies
+        run: uv sync --all-groups --all-extras
+
+      - name: Run pytest
+        env:
+          GCS_ROOT: ${{ secrets.GCS_ROOT }}
+          GOOGLE_PROJECT_ID: ${{ secrets.GOOGLE_PROJECT_ID }}
+          COMPUTE_SERVICE_ACCOUNT: ${{ secrets.GOOGLE_COMPUTE_SERVICE_ACCOUNT }}
+        run: |
+          uv run pytest --ci-prefix ${{ github.run_id }} \
+            --gcs-root ${GCS_ROOT} \
+            --gcp-prj ${GOOGLE_PROJECT_ID} \
+            --gcp-compute-service-account ${COMPUTE_SERVICE_ACCOUNT} \
+            --debug-caper \
+            -vv -s
+
+      # Always clean up
+      - name: Clean up
+        if: always()
+        run: |
+          gsutil -m rm -rf ${GCS_ROOT}/caper_out/${{ github.run_id }} || true
+
+  pytest-dev:
+    name: Test (w/o integration or Google Cloud tests)
+    if: github.ref == 'refs/heads/dev' || (github.event_name == 'pull_request' && github.base_ref == 'dev')
+    runs-on: linux-self-hosted
+
+    permissions:
+      contents: read
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Set up Python 3.12
+        id: setup-python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.12'
+
+      - name: Install uv on Linux or macOS
+        run: curl -sSfL https://astral.sh/uv/install.sh | bash
+
+      - name: Find uv cache and compute cache suffix
+        run: |
+          {
+            echo "UV_CACHE=$(uv cache dir)"
+            echo "HASH_CACHE_SUFFIX=-${{ hashFiles('pyproject.toml') }}"
+          } >> $GITHUB_ENV
+
+      - name: Load uv cache
+        uses: actions/cache@v4
+        with:
+          path: ${{ env.UV_CACHE }}
+          key: |
+            uv-cache-${{ env.HASH_CACHE_SUFFIX }}
+            uv-cache
+
+      - name: Setup a venv
+        run: |
+          uv venv
+          # write venv path to GITHUB_ENV
+          echo "VIRTUAL_ENV=$(pwd)/.venv" >> $GITHUB_ENV
+          echo "$(pwd)/.venv/bin" >> $GITHUB_PATH
+
+      - name: Install dependencies
+        run: uv sync --all-groups --all-extras
+
+      - name: Run pytest
+        run: |
+          uv run pytest -m "not integration and not google_cloud and not slow" \
+            --debug-caper \
+            -vv -s
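The two jobs in this workflow are gated by their `if:` expressions rather than by separate trigger blocks. As a rough illustration of which job those expressions would select for a given event, here is a minimal Python sketch. The function name `jobs_to_run` and its parameters are hypothetical stand-ins for the `github.event_name`, `github.ref`, and `github.base_ref` context values; nothing here is part of the PR.

```python
def jobs_to_run(event_name: str, ref: str, base_ref: str = "") -> list[str]:
    """Return the job ids whose `if:` expression would evaluate to true.

    Mirrors the two conditions in .github/workflows/test.yml:
      pytest:     ref is master, or a pull request targeting master
      pytest-dev: ref is dev, or a pull request targeting dev
    """
    jobs = []
    if ref == "refs/heads/master" or (
        event_name == "pull_request" and base_ref == "master"
    ):
        jobs.append("pytest")
    if ref == "refs/heads/dev" or (
        event_name == "pull_request" and base_ref == "dev"
    ):
        jobs.append("pytest-dev")
    return jobs


# A push to master runs the full suite; a PR into dev runs the fast subset.
print(jobs_to_run("push", "refs/heads/master"))
print(jobs_to_run("pull_request", "refs/pull/7/merge", base_ref="dev"))
```

On `pull_request` events the ref is the merge ref (of the form `refs/pull/<number>/merge`), so in this sketch the `base_ref` comparison is what selects the job for pull requests.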
10 changes: 0 additions & 10 deletions .isort.cfg

This file was deleted.

2 changes: 0 additions & 2 deletions DETAILS.md
@@ -187,8 +187,6 @@ We highly recommend to use a default configuration file described in the section
 **Conf. file**|**Cmd. line**|**Description**
 :-----|:-----|:-----
 gcp-prj|--gcp-prj|Google Cloud project
-use-google-cloud-life-sciences|--use-google-cloud-life-sciences|Use Google Cloud Life Sciences API instead of (deprecated) Genomics API
-gcp-zones|--gcp-zones|Comma-delimited Google Cloud Platform zones to provision worker instances (e.g. us-central1-c,us-west1-b)
 gcp-out-dir, out-gcs-bucket|--gcp-out-dir, --out-gcs-bucket|Output `gs://` directory for GC backend
 gcp-loc-dir, tmp-gcs-bucket|--gcp-loc-dir, --tmp-gcs-bucket|Tmp. directory for localization on GC backend
 gcp-call-caching-dup-strat|--gcp-call-caching-dup-strat|Call-caching duplication strategy. Choose between `copy` and `reference`. `copy` will make a copy for a new workflow, `reference` will make refer to the call-cached output of a previous workflow in `metadata.json`. Defaults to `reference`
2 changes: 1 addition & 1 deletion README.md
@@ -1,4 +1,4 @@
-[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black) [![CircleCI](https://circleci.com/gh/ENCODE-DCC/caper.svg?style=svg)](https://circleci.com/gh/ENCODE-DCC/caper)
+[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black) [![CircleCI](https://circleci.com/gh/ENCODE-DCC/caper.svg?style=svg)](https://circleci.com/gh/ENCODE-DCC/caper) [![GitHub Actions](https://github.com/ENCODE-DCC/caper/actions/workflows/pytest.yml/badge.svg)](https://github.com/ENCODE-DCC/caper/actions/workflows/pytest.yml)
 
 
 ## Introduction
13 changes: 0 additions & 13 deletions bin/caper

This file was deleted.

8 changes: 6 additions & 2 deletions caper/arg_tool.py
@@ -2,8 +2,6 @@
 from argparse import ArgumentParser
 from configparser import ConfigParser, MissingSectionHeaderError
 
-from distutils.util import strtobool
-
 
 def read_from_conf(
     conf_file, conf_section='defaults', conf_key_map=None, no_strip_quote=False
@@ -146,6 +144,12 @@ def update_parsers_defaults_with_conf(
             v = guessed_default
             defaults[k] = v
 
+        def strtobool(value: str) -> bool:
+            value = value.lower()
+            if value in ('y', 'yes', 'on', '1', 'true', 't'):
+                return True
+            return False
+
         if guessed_type:
             if guessed_type is bool and isinstance(v, str):
                 defaults[k] = bool(strtobool(v))
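One behavioral difference is worth noting for reviewers: `distutils.util.strtobool` raised `ValueError` for input it did not recognize, whereas the inline replacement in this hunk maps every unrecognized string to `False`. If the stricter behavior is wanted, a variant along the following lines could be considered. This is only a sketch; the name `strict_strtobool` is hypothetical and not part of the PR, and the whitespace stripping is an addition that `distutils` did not perform.

```python
# Accepted spellings, matching the sets that distutils.util.strtobool recognized.
_TRUE_VALUES = frozenset({"y", "yes", "t", "true", "on", "1"})
_FALSE_VALUES = frozenset({"n", "no", "f", "false", "off", "0"})


def strict_strtobool(value: str) -> bool:
    """Convert a truthy/falsy string to bool and reject anything else.

    Hypothetical helper: unlike the replacement in the diff, a typo such as
    'ture' raises ValueError instead of silently becoming False.
    """
    # Assumption: surrounding whitespace is ignored (distutils did not strip).
    normalized = value.strip().lower()
    if normalized in _TRUE_VALUES:
        return True
    if normalized in _FALSE_VALUES:
        return False
    raise ValueError(f"invalid truth value {value!r}")
```

With this variant, a misspelled boolean in a configuration file would surface as an error at startup rather than quietly disabling the option.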
34 changes: 14 additions & 20 deletions caper/caper_args.py
@@ -99,10 +99,9 @@ def get_parser_and_defaults(conf_file=None):
     parent_all.add_argument(
         '--gcp-service-account-key-json',
         help='Secret key JSON file for Google Cloud Platform service account. '
-        'This service account should have enough permission to '
-        'Storage for client functions and '
-        'Storage/Compute Engine/Genomics API/Life Sciences API '
-        'for server/runner functions.',
+        'This service account should have enough permission to Storage for client '
+        'functions and Storage/Compute Engine/Batch API for server/runner functions. '
+        'We recommend using application default credentials for authentication.',
     )
 
     group_loc = parent_all.add_argument_group(
@@ -323,29 +322,24 @@ def get_parser_and_defaults(conf_file=None):
     group_gc = parent_runner.add_argument_group(
         title='GCP backend arguments for server/runner'
     )
-    group_gc.add_argument('--gcp-prj', help='GC project')
-    group_gc_all.add_argument(
-        '--use-google-cloud-life-sciences',
-        action='store_true',
-        help='Use Google Cloud Life Sciences API (v2beta) instead of '
-        'deprecated Genomics API (v2alpha1).'
-        'Life Sciences API requires only one region specified with'
-        'gcp-region. gcp-zones will be ignored since it is for Genomics API.'
-        'See https://cloud.google.com/life-sciences/docs/concepts/locations '
-        'for supported regions.',
-    )
+    group_gc.add_argument('--gcp-prj', help='Google Cloud project')
     group_gc.add_argument(
         '--gcp-region',
         default=CromwellBackendGcp.DEFAULT_REGION,
-        help='GCP region for Google Cloud Life Sciences API. '
-        'This is used only when --use-google-cloud-life-sciences is defined.',
+        help='GCP region for Google Cloud Batch API. ',
     )
+    group_gc.add_argument(
+        '--gcp-compute-service-account',
+        help='Service account email to use for Google Cloud Batch compute instances. '
+        'This is *not* the service account used to launch the job, but the service account '
+        'used to actually run the job on the Batch VM instances. '
+        'Ensure that this service account has the `roles/batch.agentReporter` role, '
+        'so that VM instances can report their status to Batch.',
+    )
     group_gc_all.add_argument(
         '--gcp-zones',
-        help='Comma-separated GCP zones used for Genomics API. '
+        help='Comma-separated GCP zones used for Running jobs in Batch. '
         '(e.g. us-west1-b,us-central1-b). '
-        'If you use --use-google-cloud-life-sciences then '
-        'define --gcp-region instead.',
     )
     group_gc.add_argument(
         '--gcp-call-caching-dup-strat',
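To see how the Batch-related options from this hunk behave together once parsed, here is a small standalone `argparse` sketch. It is not the Caper parser: `build_gcp_parser` and `split_zones` are hypothetical helpers, the help strings are shortened, and `DEFAULT_REGION` is an assumed placeholder because the value of `CromwellBackendGcp.DEFAULT_REGION` does not appear in the diff. The project id, service account email, and zones in the usage lines are likewise made up.

```python
from argparse import ArgumentParser

# Assumed placeholder. The PR takes the real default from
# CromwellBackendGcp.DEFAULT_REGION, whose value is not shown in the diff.
DEFAULT_REGION = "us-central1"


def build_gcp_parser() -> ArgumentParser:
    """Build a parser exposing the GCP options touched by this hunk."""
    parser = ArgumentParser(add_help=False)
    group = parser.add_argument_group(
        title="GCP backend arguments for server/runner"
    )
    group.add_argument("--gcp-prj", help="Google Cloud project")
    group.add_argument(
        "--gcp-region",
        default=DEFAULT_REGION,
        help="GCP region for Google Cloud Batch API.",
    )
    group.add_argument(
        "--gcp-compute-service-account",
        help="Service account email used by the Batch VM instances.",
    )
    group.add_argument(
        "--gcp-zones",
        help="Comma-separated GCP zones used for running jobs in Batch.",
    )
    return parser


def split_zones(zones):
    """Split a comma-separated zone string into a list (empty if unset)."""
    if not zones:
        return []
    return [zone.strip() for zone in zones.split(",") if zone.strip()]


# Hypothetical values, for illustration only.
args = build_gcp_parser().parse_args(
    [
        "--gcp-prj", "my-project",
        "--gcp-compute-service-account",
        "runner@my-project.iam.gserviceaccount.com",
        "--gcp-zones", "us-west1-b,us-central1-b",
    ]
)
print(args.gcp_region, split_zones(args.gcp_zones))
```

Since `--gcp-region` is not passed in the usage lines, the parsed namespace falls back to the placeholder default, which mirrors how the real option falls back to `CromwellBackendGcp.DEFAULT_REGION`.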