## Download new Caper>=2.0
New Caper is out. You need to update your Caper to work with the latest ENCODE ChIP-seq pipeline.
```bash
$ pip install caper --upgrade
```
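You can check the installed version afterwards:
```bash
$ caper -v
```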
## Local/HPC users and new Caper>=2.0
There are many changes for local/HPC backends: `local`, `slurm`, `sge`, `pbs` and the newly added `lsf`. Local/HPC users need to reset/initialize Caper's configuration file for their chosen backend: make a backup of your current configuration file `~/.caper/default.conf`, run `caper init`, then edit the configuration file and follow the instructions in it.
```bash
$ cd ~/.caper
$ cp default.conf default.conf.bak
$ caper init [YOUR_BACKEND]
```
To run a pipeline, add one of the following flags to specify the environment in which each task runs: `--conda`, `--singularity` or `--docker`. These flags are not required for cloud backend users (`aws` and `gcp`).
```bash
# for example
$ caper run ... --singularity
```
For Conda users: **RE-INSTALL THE PIPELINE'S CONDA ENVIRONMENT AND DO NOT ACTIVATE A CONDA ENVIRONMENT BEFORE RUNNING PIPELINES**. Caper internally calls `conda run -n ENV_NAME CROMWELL_JOB_SCRIPT`, so just make sure that the pipeline's new Conda environments are correctly installed.
```bash
$ bash scripts/uninstall_conda_env.sh
$ bash scripts/install_conda_env.sh
```
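For illustration only, Caper's internal call then resembles the following sketch; the environment name and job script name here are hypothetical:
```bash
# what Caper runs internally (hypothetical names); you do not run this yourself
$ conda run -n encode-chip-seq-pipeline bash cromwell_job_script.sh
```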
## Introduction
This ChIP-Seq pipeline is based on the ENCODE (phase-3) transcription factor and histone ChIP-seq pipeline specifications (by Anshul Kundaje) in [this Google Doc](https://docs.google.com/document/d/1lG_Rd7fnYgRpSIqrIfuVlAz2dW1VaSQThzk836Db99c/edit#).
## Installation
1) Make sure that you have Python>=3.6. Caper does not work with Python 2. Install Caper and check that its version is >=2.0.
```bash
$ python --version
$ pip install caper
$ caper -v
```
2) If you are upgrading from an old Caper (<2.0.0), make a backup of your Caper configuration file `~/.caper/default.conf`, then reset/initialize it. Read Caper's [README](https://github.com/ENCODE-DCC/caper/blob/master/README.md) carefully to choose a backend for your system and follow the instructions in the configuration file.
```bash
# make a backup of ~/.caper/default.conf if you already have it
$ caper init [YOUR_BACKEND]
# then edit ~/.caper/default.conf
$ vi ~/.caper/default.conf
```
3) Git clone this pipeline.
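A minimal sketch of this step (cloning into your home directory, which matches the path used below):
```bash
$ cd
$ git clone https://github.com/ENCODE-DCC/chip-seq-pipeline2
```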
> **IMPORTANT**: use `~/chip-seq-pipeline2/chip.wdl` as `[WDL]` in Caper's documentation.
4) (Optional for Conda) Install the pipeline's Conda environments if you don't have Singularity or Docker installed on your system. We recommend using Singularity instead of Conda. If you don't have Conda on your system, install [Miniconda3](https://docs.conda.io/en/latest/miniconda.html).
```bash
$ cd chip-seq-pipeline2
# uninstall old environments (<2.0.0)
$ bash scripts/uninstall_conda_env.sh
$ bash scripts/install_conda_env.sh
```
## Test run
The following commands are just examples. Please read [Caper's README](https://github.com/ENCODE-DCC/caper) carefully.
```bash
# Or submit it as a leader job (with long/enough resources) to SLURM (Stanford Sherlock) with Singularity
# It will fail if you directly run the leader job on login nodes
```
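For reference, a sketch of both invocation styles; `${INPUT_JSON}` is a placeholder for your input JSON file/URL, and the `sbatch` partition, memory and time values are assumptions to adapt to your cluster:
```bash
# direct run with Singularity (on a compute node or workstation)
$ caper run chip.wdl -i "${INPUT_JSON}" --singularity

# or submit the same command as a SLURM leader job (resource values are placeholders)
$ sbatch -p [YOUR_PARTITION] --export=ALL --mem 4G -t 4-0 \
    --wrap "caper run chip.wdl -i ${INPUT_JSON} --singularity"
```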
## Running a pipeline on Terra/Anvil (using Dockstore)
Visit our pipeline repo on [Dockstore](https://dockstore.org/workflows/github.com/ENCODE-DCC/chip-seq-pipeline2). Click on `Terra` or `Anvil`. Follow Terra's instructions to create a workspace on Terra and add Terra's billing bot to your Google Cloud account.
Download this [test input JSON for Terra](https://storage.googleapis.com/encode-pipeline-test-samples/encode-chip-seq-pipeline/ENCSR000DYI_subsampled_chr19_only.terra.json), upload it to Terra's UI and then run an analysis.
If you want to use your own input JSON file, make sure that all files in it are on a Google Cloud Storage bucket (`gs://`). URLs will not work.
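For example, you can stage local files on your own bucket with `gsutil`; the bucket name and paths below are placeholders:
```bash
# copy a local FASTQ to a Google Cloud Storage bucket (hypothetical names)
$ gsutil cp rep1_R1.subsampled.fastq.gz gs://YOUR_BUCKET/chip/rep1_R1.subsampled.fastq.gz
```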
## Running a pipeline on DNAnexus (using Dockstore)
Sign up for a new account on [DNAnexus](https://platform.dnanexus.com/) and create a new project on either AWS or Azure. Visit our pipeline repo on [Dockstore](https://dockstore.org/workflows/github.com/ENCODE-DCC/chip-seq-pipeline2). Click on `DNAnexus`. Choose a destination directory in your DNAnexus project, click on `Submit` and visit DNAnexus. This submits a conversion job; you can check its status under `Monitor` in the DNAnexus UI.
Once the conversion is done, download the test input JSON file for your chosen platform (AWS or Azure). You cannot use these input JSON files directly on DNAnexus. Go to the destination directory on DNAnexus and click on the converted workflow `chip`. You will see input file boxes on the left-hand side of the task graph. Expand them and define FASTQs (`fastq_repX_R1`) and `genome_tsv` as in the downloaded input JSON file. Click on the `common` task box and define other non-file pipeline parameters.
## Running a pipeline on DNAnexus (using our pre-built workflows)
See [this](docs/tutorial_dx_web.md) for details.
## Input JSON file
An input JSON file specifies all input parameters and files required for running this pipeline.
## Running and sharing on Truwl

You can run this pipeline on [truwl.com](https://truwl.com/). This provides a web interface that allows you to define inputs and parameters and run jobs.
If you do not run the pipeline on Truwl, you can still share your use-case/job on the platform by getting in touch at [info@truwl.com](mailto:info@truwl.com) and providing your inputs.json file.
## How to organize outputs
Install [Croo](https://github.com/ENCODE-DCC/croo#installation). **You can skip this installation if you have installed the pipeline's Conda environment and activated it.** Make sure that you have Python 3 (>3.4.1) installed on your system. Find a `metadata.json` in Caper's output directory.
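A minimal sketch of installing and running Croo on a pipeline's `metadata.json`:
```bash
$ pip install croo
# organize pipeline outputs described in Cromwell's metadata JSON
$ croo metadata.json
```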