Removed Aspera, added boto3 + FIRE and some refactoring #26

Merged 14 commits on Jul 11, 2024
Changes from 8 commits
4 changes: 1 addition & 3 deletions .github/workflows/test.yml
@@ -10,7 +10,7 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        python-version: [3.8, 3.9, "3.10"]
+        python-version: ["3.9", "3.10", "3.11"]

     steps:
       - uses: actions/checkout@v2
@@ -22,8 +22,6 @@ jobs:
       - name: Install Dependencies
         run: |
           pip install .[test]
-          # Aspera installation #
-          . install-aspera.sh
       - name: 🧪 - Testing
         run: |
           pytest -v
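
Note on the matrix change in this workflow: every version is now quoted. A likely reason (an assumption, the PR does not state it) is that YAML reads an unquoted `3.10` as the float `3.1`, so quoting keeps the entries as strings. The sketch below imitates that behaviour with plain Python float parsing instead of a real YAML parser; `as_yaml_scalar` is a hypothetical helper, not part of the workflow or the fetch tool.

```python
# Illustration only: mimics how a YAML loader treats quoted vs unquoted scalars.
# as_yaml_scalar is a hypothetical helper, not part of the fetch tool.

def as_yaml_scalar(text: str, quoted: bool) -> str:
    """Return the version string a workflow would see for one matrix entry."""
    if quoted:
        return text  # quoted scalars stay strings
    return str(float(text))  # unquoted numerics are parsed as floats


# The old matrix only quoted "3.10"; the new one quotes every entry.
old_matrix = [
    as_yaml_scalar("3.8", quoted=False),
    as_yaml_scalar("3.9", quoted=False),
    as_yaml_scalar("3.10", quoted=True),
]
new_matrix = [as_yaml_scalar(v, quoted=True) for v in ("3.9", "3.10", "3.11")]

# What would happen if 3.10 were left unquoted.
unquoted_310 = as_yaml_scalar("3.10", quoted=False)
print(unquoted_310)  # 3.1
```

Quoting all entries, rather than only the ones that would be misread, keeps the list uniform and avoids the problem recurring when a new version is added.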
26 changes: 14 additions & 12 deletions Containerfile
@@ -1,21 +1,23 @@
-FROM python:3.9-slim
+FROM mambaorg/micromamba:1.5.8

 LABEL maintainer="Microbiome Informatics"
-LABEL version="0.9.0"
-LABEL description="EBI Fetch Tool Docker Image."
+LABEL version="1.0.0"
+LABEL description="EBI Fetch Tool."

-# We need curl to download aspera and ps for nextflow monitoring
-ENV DEBIAN_FRONTEND=noninteractive
+COPY --chown=$MAMBA_USER:$MAMBA_USER conda_environment.yml /tmp/env.yaml

-RUN apt update && apt install -y curl procps && rm -rf /var/lib/apt/lists/*
+RUN micromamba install -y -n base -f /tmp/env.yaml && \
+    micromamba clean --all --yes

-COPY . .
+ARG MAMBA_DOCKERFILE_ACTIVATE=1
+
+ENV PATH="$MAMBA_ROOT_PREFIX/bin:$PATH"

-RUN pip install --no-cache-dir .
+WORKDIR /opt

-# Aspera is an IBM library for data sharing
-RUN ./install-aspera.sh
+COPY . .

-RUN export PATH=$PATH:/aspera-cli/cli/bin
+ENV PATH="/opt/fetchtool:$PATH"
+ENV PYTHONPATH="/opt/:$PYTHONPATH"

-CMD [ fetch-read-tool ]
+ENTRYPOINT ["/usr/local/bin/_entrypoint.sh"]
34 changes: 7 additions & 27 deletions README.md
@@ -35,19 +35,12 @@ pre-commit will run a set of pre-configured tools before allowing you to commit

This repo uses [pytest](https://docs.pytest.org).

It requires the aspera cli installed in the default location (`install-aspera.sh` with no parameters).

To run the test suite:
```bash
pytest
```

## Install fetch tool

### Using Conda

```bash
$ conda create -q -n fetch_tool python=3.8
$ conda create -q -n fetch_tool python=3.9
$ conda activate fetch_tool
```

@@ -63,30 +56,17 @@ Install from the git repo
$ pip install git+ssh://git@github.com/EBI-Metagenomics/fetch_tool.git
```

#### Configuration file
#### Configuration options

The tool has a number of options, with sensible defaults for the most common use cases.

Setup the configuration file, the template [fetchdata-config-template.json](config/fetchdata-config-template.json) for the configuration file.

The required fields are:
- For Aspera
- aspera_bin (the path to ascp, usually in the aspera installation under /cli/bin)
- aspera_cert (the path to the ascp provided cert, usually in the aspera installation under /cli/etc/asperaweb_id_dsa.openssh)
- To pull private ENA data
-
- ena_api_user
- ena_api_password

### Install Aspera

## Install

Run the `install-aspera.sh` command here, it has only one optional parameter (the installation folder).

```bash
./install path/to/installation-i-want
```

Otherwise it will install it in $PWD/aspera-cli

## Fetch read files (amplicon and WGS data)

### Usage
Expand Down Expand Up @@ -122,7 +102,7 @@ optional arguments:
Download amplicon study:

```bash
$ fetch-read-tool -p SRP062869 -c fetchdata-config-local.json -v -d /home/<user>/temp/
$ fetch-read-tool -p SRP062869 -v -d /home/<user>/temp/
```

## Fetch assembly files
Expand Down Expand Up @@ -163,5 +143,5 @@ optional arguments:
Download assembly study:

```bash
$ fetch-assembly-tool -p ERP111288 -c fetchdata-config-local.json -v -d /home/<user>/temp/
$ fetch-assembly-tool -p ERP111288 -v -d /home/<user>/temp/
```
16 changes: 16 additions & 0 deletions conda_environment.yml
@@ -0,0 +1,16 @@
+name: fetchtool
+channels:
+  - bioconda
+  - conda-forge
+  - defaults
+dependencies:
+  - python=3.10
+  - pip=24.0
+  - conda-forge::procps-ng=4.0.4
+  - conda-forge::wget=1.21.4
+  - conda-forge::rsync=3.3.0
+  - conda-forge::pandas=2.2.2
+  - pip:
+    - requests==2.32.3
+    - flufl.lock==8.1.0
+    - boto3==1.34.134
4 changes: 1 addition & 3 deletions config/testing.json
@@ -1,7 +1,5 @@
 {
   "url_max_attempts": 5,
   "ena_api_username": "",
-  "ena_api_password": "",
-  "aspera_bin": "",
-  "aspera_cert": ""
+  "ena_api_password": ""
 }
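
With the Aspera keys gone, a config file only has to carry the remaining options. As a rough sketch of how such a JSON file could be overlaid on built-in defaults: `DEFAULTS`, `load_config`, the default values, and the choice to reject unknown keys are all assumptions made for illustration, not taken from the fetchtool source. Only the key names come from `config/testing.json`.

```python
import json

# Hypothetical defaults: key names mirror config/testing.json,
# the values and this whole module layout are assumptions.
DEFAULTS = {
    "url_max_attempts": 5,
    "ena_api_username": "",
    "ena_api_password": "",
}


def load_config(text: str = "{}") -> dict:
    """Overlay user-supplied JSON on the defaults, rejecting unknown keys."""
    overrides = json.loads(text)
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        # A stale key such as "aspera_bin" fails loudly instead of being ignored.
        raise KeyError(f"unsupported config keys: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


config = load_config('{"url_max_attempts": 3}')
print(config["url_max_attempts"])
```

Rejecting unknown keys is one possible design: it surfaces leftover Aspera settings in old config files early, at the cost of being stricter than silently ignoring them.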
2 changes: 1 addition & 1 deletion fetchtool/__init__.py
@@ -1 +1 @@
-__version__ = "0.9.0"
+__version__ = "1.0.0"