Commit

Merge pull request #26 from EBI-Metagenomics/develop
Removed Aspera, added boto3 + FIRE and some refactoring
mberacochea authored Jul 11, 2024
2 parents f239c89 + 939deda commit d7f6bae
Showing 16 changed files with 280 additions and 256 deletions.
4 changes: 1 addition & 3 deletions .github/workflows/test.yml
@@ -10,7 +10,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: [3.8, 3.9, "3.10"]
python-version: ["3.9", "3.10", "3.11"]

steps:
- uses: actions/checkout@v2
@@ -22,8 +22,6 @@ jobs:
- name: Install Dependencies
run: |
pip install .[test]
# Aspera installation #
. install-aspera.sh
- name: 🧪 - Testing
run: |
pytest -v
25 changes: 13 additions & 12 deletions Containerfile
@@ -1,21 +1,22 @@
FROM python:3.9-slim
FROM mambaorg/micromamba:1.5.8

LABEL maintainer="Microbiome Informatics"
LABEL version="0.9.0"
LABEL description="EBI Fetch Tool Docker Image."
LABEL version="1.0.0"
LABEL description="EBI Fetch Tool."

# We need curl to download aspera and ps for nextflow monitoring
ENV DEBIAN_FRONTEND=noninteractive
COPY --chown=$MAMBA_USER:$MAMBA_USER conda_environment.yml /tmp/env.yaml

RUN apt update && apt install -y curl procps && rm -rf /var/lib/apt/lists/*
RUN micromamba install -y -n base -f /tmp/env.yaml && \
micromamba clean --all --yes

COPY . .
ARG MAMBA_DOCKERFILE_ACTIVATE=1

RUN pip install --no-cache-dir .
ENV PATH="$MAMBA_ROOT_PREFIX/bin:$PATH"

# Aspera is an IBM library for data sharing
RUN ./install-aspera.sh
COPY --chown=$MAMBA_USER:$MAMBA_USER . /opt/fetch-tool-src

RUN export PATH=$PATH:/aspera-cli/cli/bin
WORKDIR /opt/fetch-tool-src

CMD [ fetch-read-tool ]
RUN pip install . --no-cache-dir

ENTRYPOINT ["/usr/local/bin/_entrypoint.sh"]
36 changes: 8 additions & 28 deletions README.md
@@ -35,19 +35,12 @@ pre-commit will run a set of pre-configured tools before allowing you to commit

This repo uses [pytest](https://docs.pytest.org).

It requires the aspera cli installed in the default location (`install-aspera.sh` with no parameters).

To run the test suite:
```bash
pytest
```

## Install fetch tool

### Using Conda

```bash
$ conda create -q -n fetch_tool python=3.8
$ conda create -q -n fetch_tool python=3.10
$ conda activate fetch_tool
```

@@ -60,33 +53,20 @@ $ pip install fetch-tool
Install from the git repo

```bash
$ pip install git+ssh://git@github.com/EBI-Metagenomics/fetch_tool.git
$ pip install https://github.com/EBI-Metagenomics/fetch_tool/archive/master.zip
```

#### Configuration file
#### Configuration options

The tool has a number of options, with sensible defaults for the most common use cases.

Setup the configuration file, the template [fetchdata-config-template.json](config/fetchdata-config-template.json) for the configuration file.

The required fields are:
- For Aspera
- aspera_bin (the path to ascp, usually in the aspera installation under /cli/bin)
- aspera_cert (the path to the ascp provided cert, usually in the aspera installation under /cli/etc/asperaweb_id_dsa.openssh)
- To pull private ENA data
-
- ena_api_user
- ena_api_password

### Install Aspera

## Install

Run the `install-aspera.sh` command here, it has only one optional parameter (the installation folder).

```bash
./install path/to/installation-i-want
```

Otherwise it will install it in $PWD/aspera-cli

## Fetch read files (amplicon and WGS data)

### Usage
@@ -122,7 +102,7 @@ optional arguments:
Download amplicon study:

```bash
$ fetch-read-tool -p SRP062869 -c fetchdata-config-local.json -v -d /home/<user>/temp/
$ fetch-read-tool -p SRP062869 -v -d /home/<user>/temp/
```

## Fetch assembly files
@@ -163,5 +143,5 @@ optional arguments:
Download assembly study:

```bash
$ fetch-assembly-tool -p ERP111288 -c fetchdata-config-local.json -v -d /home/<user>/temp/
$ fetch-assembly-tool -p ERP111288 -v -d /home/<user>/temp/
```
16 changes: 16 additions & 0 deletions conda_environment.yml
@@ -0,0 +1,16 @@
name: fetchtool
channels:
- bioconda
- conda-forge
- defaults
dependencies:
- python=3.10
- pip=24.0
- conda-forge::procps-ng=4.0.4
- conda-forge::wget=1.21.4
- conda-forge::rsync=3.3.0
- conda-forge::pandas=2.2.2
- pip:
- requests==2.32.3
- flufl.lock==8.1.0
- boto3==1.34.134
4 changes: 1 addition & 3 deletions config/testing.json
@@ -1,7 +1,5 @@
{
"url_max_attempts": 5,
"ena_api_username": "",
"ena_api_password": "",
"aspera_bin": "",
"aspera_cert": ""
"ena_api_password": ""
}
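
With the Aspera keys gone, the test config is down to three keys, and the README now says the options have sensible defaults. A minimal sketch of how such a file could be overlaid on built-in defaults is shown below. It is an illustration only, not the fetch tool's actual loader: `DEFAULTS`, `load_config` and `has_ena_credentials` are hypothetical names, and the default values are simply copied from `config/testing.json`.

```python
import json

# Hypothetical defaults: the key names come from config/testing.json,
# the values are placeholders copied from that file.
DEFAULTS = {
    "url_max_attempts": 5,
    "ena_api_username": "",
    "ena_api_password": "",
}


def load_config(path=None):
    """Return a copy of DEFAULTS, overlaid with the JSON file at `path` if given."""
    config = dict(DEFAULTS)
    if path is not None:
        with open(path, encoding="utf-8") as handle:
            config.update(json.load(handle))
    return config


def has_ena_credentials(config):
    """True only when both ENA credential fields are non-empty."""
    return bool(config["ena_api_username"] and config["ena_api_password"])
```

Calling `load_config()` with no argument returns the defaults unchanged, which matches the README examples in this commit that no longer pass `-c`.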
2 changes: 1 addition & 1 deletion fetchtool/__init__.py
@@ -1 +1 @@
__version__ = "0.9.0"
__version__ = "1.0.0"