[DRAFT] Docker setup for Beagle Imputation WDL #114

Draft: wants to merge 3 commits into base: develop
62 changes: 62 additions & 0 deletions 3rd-party-tools/beagle/Dockerfile
@@ -0,0 +1,62 @@
# Pin the platform so that images built on ARM-based machines don't break pipelines
FROM --platform="linux/amd64" adoptopenjdk/openjdk8:alpine-slim

ARG BEAGLE_VERSION=01Mar24.d36 \
    BREF3_VERSION=22Jul22.46e \
    BCFTOOLS_VERSION=1.10.2

ENV TERM=xterm-256color \
    TINI_VERSION=v0.19.0

LABEL MAINTAINER="Broad Institute DSDE <dsde-engineering@broadinstitute.org>" \
    BEAGLE_VERSION=${BEAGLE_VERSION} \
    BCFTOOLS_VERSION=${BCFTOOLS_VERSION}

WORKDIR /usr/gitc

# Install dependencies
RUN set -eux; \
    apk add --no-cache \
        autoconf \
        automake \
        bash \
        bzip2-dev \
        curl \
        g++ \
        gcc \
        gsl-dev \
        make \
        musl-dev \
        perl \
        perl-dev \
        tini \
        wget \
        xz-dev \
        zlib-dev \
    ; \
    # Install BCFTools
    wget https://github.com/samtools/bcftools/releases/download/${BCFTOOLS_VERSION}/bcftools-${BCFTOOLS_VERSION}.tar.bz2; \
    tar xf bcftools-${BCFTOOLS_VERSION}.tar.bz2; \
    cd bcftools-${BCFTOOLS_VERSION}; \
    \
    ./configure; \
    make; \
    make install; \
    \
    cd ../; \
    rm -r bcftools-${BCFTOOLS_VERSION}; \
    rm bcftools-${BCFTOOLS_VERSION}.tar.bz2 \
    ; \
    # Download Beagle jars (-f makes curl fail on an HTTP error instead of saving the error page as a jar)
    # beagle runs phasing and imputation
    curl -fL https://faculty.washington.edu/browning/beagle/beagle.${BEAGLE_VERSION}.jar > beagle.${BEAGLE_VERSION}.jar \
    ; \
    # bref3 converts a reference panel from vcf to the bref3 format that Beagle needs
    curl -fL https://faculty.washington.edu/browning/beagle/bref3.${BREF3_VERSION}.jar > bref3.${BREF3_VERSION}.jar \
    ; \
    # Install tini
    wget https://github.com/krallin/tini/releases/download/$TINI_VERSION/tini -O /sbin/tini; \
    chmod +x /sbin/tini;

# Set tini as default entrypoint
ENTRYPOINT ["/sbin/tini", "--" ]
35 changes: 35 additions & 0 deletions 3rd-party-tools/beagle/README.md
@@ -0,0 +1,35 @@
# Imputation Beagle

## Quick reference

Copy and paste to pull this image

#### `us-central1-docker.pkg.dev/morgan-fieldeng-gcp/imputation-beagle-development/imputation-beagle:0.0.1-01Mar24.d36-wip-temp-20240301`

- __What is this image:__ This is a lightweight Alpine-based image for running Beagle in the [ImputationBeagle pipeline](../../../../pipelines/broad/arrays/imputation_beagle/ImputationBeagle.wdl).
- __What is Beagle:__ Beagle is a software package for phasing genotypes and imputing ungenotyped markers. Beagle version 5.4 has improved memory and computational efficiency when analyzing large sequence data sets. See the [Beagle home page](https://faculty.washington.edu/browning/beagle/beagle.html) for more information.
- __How to see Beagle version used in image:__ See the Versioning section below.

## Versioning

The Imputation Beagle image uses the following convention for versioning:

#### `us-central1-docker.pkg.dev/morgan-fieldeng-gcp/imputation-beagle-development/imputation-beagle:<image-version>-<beagle-version>-<manual-timestamp>`
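
As a minimal sketch of how this convention can be read back, the helper below splits an image reference into its three tag components. The function name `parse_image_tag` is hypothetical and not part of this PR. It assumes that the image version and the Beagle version contain no hyphens, and it treats everything after the second hyphen as the timestamp part (which may itself contain hyphens, as in `wip-temp-20240301`).

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the PR): split an image reference of the form
#   <registry path>:<image-version>-<beagle-version>-<manual-timestamp>
# into its three tag components, printed tab-separated.
parse_image_tag() {
    local ref="$1"
    local tag="${ref##*:}"              # everything after the last ':'
    local image_version="${tag%%-*}"    # text before the first '-'
    local rest="${tag#*-}"              # text after the first '-'
    local beagle_version="${rest%%-*}"  # text before the next '-'
    local timestamp="${rest#*-}"        # the remainder; may contain '-'
    printf '%s\t%s\t%s\n' "$image_version" "$beagle_version" "$timestamp"
}
```

Calling `parse_image_tag` with the image reference from the Quick reference section should print the image version, the Beagle version, and the timestamp suffix on one line.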

We keep track of all past versions in [docker_versions](docker_versions.tsv); the last image listed is the version currently used in WARP.
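
Because the current image is simply the last entry in that file, it can be looked up from the command line. The sketch below assumes one image reference per line; the function name `current_image` is hypothetical, and the default path assumes the command is run from the directory that contains `docker_versions.tsv`.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the PR): print the last non-blank line of the
# versions file, which by convention is the image currently in use.
current_image() {
    local versions_file="${1:-docker_versions.tsv}"
    # Drop blank lines first so a trailing empty line does not hide the last entry.
    grep -v '^[[:space:]]*$' "$versions_file" | tail -n 1
}
```

For example, `docker pull "$(current_image)"` would pull whichever image is listed last.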

You can see more information about the image, including the tool versions, by running the following commands:

```bash
$ docker pull us-central1-docker.pkg.dev/morgan-fieldeng-gcp/imputation-beagle-development/imputation-beagle:0.0.1-01Mar24.d36-wip-temp-20240301
$ docker inspect us-central1-docker.pkg.dev/morgan-fieldeng-gcp/imputation-beagle-development/imputation-beagle:0.0.1-01Mar24.d36-wip-temp-20240301
```
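
The full `docker inspect` output is long. As a sketch, a single label can be extracted with the `--format` option. This assumes the image carries the `BEAGLE_VERSION` and `BCFTOOLS_VERSION` labels set by the `LABEL` instruction in the Dockerfile; the function name `image_label` is hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the PR): print one label of a locally
# available image using docker inspect with a Go template.
image_label() {
    local image="$1"
    local label="$2"
    docker inspect --format "{{ index .Config.Labels \"${label}\" }}" "$image"
}
```

For example, `image_label <image> BEAGLE_VERSION` prints the value of that label for an image that has already been pulled.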

## Usage

### Display default menu

```bash
$ docker run --rm -it \
    us-central1-docker.pkg.dev/morgan-fieldeng-gcp/imputation-beagle-development/imputation-beagle:0.0.1-01Mar24.d36-wip-temp-20240301 \
    java -jar /usr/gitc/beagle.01Mar24.d36.jar
```
78 changes: 78 additions & 0 deletions 3rd-party-tools/beagle/docker_build.sh
@@ -0,0 +1,78 @@
#!/bin/bash
set -e

# Update version when changes to Dockerfile are made
DOCKER_IMAGE_VERSION=0.0.1
TIMESTAMP=$(date +"%s")
DIR=$(cd "$(dirname "$0")" && pwd)

# Registries and tags
# GCR_URL="us.gcr.io/broad-gotc-prod/imputation-beagle"

# GAR setup
GAR_REGION="us-central1"
GAR_PROJECT="morgan-fieldeng-gcp"
GAR_REPOSITORY="imputation-beagle-development"
GAR_IMAGE="imputation-beagle"
GAR_URL="${GAR_REGION}-docker.pkg.dev/${GAR_PROJECT}/${GAR_REPOSITORY}/${GAR_IMAGE}"

# Beagle version
BEAGLE_VERSION="01Mar24.d36"

# Necessary tools and help text
TOOLS=(docker gcloud)
HELP="$(basename "$0") [-h|--help] [-b|--beagle] [-t|--tools] -- script to build the Imputation Beagle image and push to GAR

where:
    -h|--help Show help text
    -b|--beagle Version of Beagle to use (default: BEAGLE_VERSION=${BEAGLE_VERSION})
    -t|--tools Show tools needed to run script
"

function main(){
    # Check that every required tool is on the PATH
    local ok=yes
    for t in "${TOOLS[@]}"; do command -v "$t" >/dev/null || ok=no; done
    if [[ $ok == no ]]; then
        echo "Missing one of the following tools: "
        for t in "${TOOLS[@]}"; do echo "$t"; done
        exit 1
    fi

    # Parse command-line options; unrecognized arguments are skipped
    while [[ $# -gt 0 ]]
    do
        key="$1"
        case $key in
            -b|--beagle)
                BEAGLE_VERSION="$2"
                shift
                shift
                ;;
            -h|--help)
                echo "$HELP"
                exit 0
                ;;
            -t|--tools)
                for t in "${TOOLS[@]}"; do echo "$t"; done
                exit 0
                ;;
            *)
                shift
                ;;
        esac
    done

    IMAGE_TAG="$DOCKER_IMAGE_VERSION-$BEAGLE_VERSION-$TIMESTAMP"

    echo "building and pushing GAR image - $GAR_URL:$IMAGE_TAG"

    # TODO: add `--squash` when ready to productionize. https://docs.docker.com/reference/cli/docker/image/build/#squash
    # Add `--no-cache` before "$DIR" to force a full rebuild
    docker build -t "$GAR_URL:$IMAGE_TAG" \
        --build-arg BEAGLE_VERSION="$BEAGLE_VERSION" \
        "$DIR"
    docker push "$GAR_URL:$IMAGE_TAG"

    # Record the pushed image; the last line of the file is the current version
    echo "$GAR_URL:$IMAGE_TAG" >> "$DIR/docker_versions.tsv"
    echo "done"
}

main "$@"
2 changes: 2 additions & 0 deletions 3rd-party-tools/beagle/docker_versions.tsv
@@ -0,0 +1,2 @@
us-central1-docker.pkg.dev/morgan-fieldeng-gcp/imputation-beagle-development/imputation-beagle:0.0.1-22Jul22.46e-wip-temp-20240227
us-central1-docker.pkg.dev/morgan-fieldeng-gcp/imputation-beagle-development/imputation-beagle:0.0.1-01Mar24.d36-wip-temp-20240301