Commit a6f765a

Merge pull request #12 from wtsi-hgi/dev2
Merge latest development into master
2 parents cb28aba + 8dba116 commit a6f765a

17 files changed: 849 additions, 494 deletions

CHANGELOG

Lines changed: 6 additions & 0 deletions
@@ -1,3 +1,9 @@
+v1.4
+- Added no_docker command line option, changed docker_container_name to docker_image_name
+- Fixed behaviour of array and tag types
+- Rewrote gen_cwl_arg to be clearer
+- Reduced the dependence on the command line for the tests
+
 v1.3
 - Changed the docker container to be broad institute's official docker container, not wtsi-hgi own container

README.md

Lines changed: 26 additions & 24 deletions
@@ -4,7 +4,7 @@ Generates [CWL](http://www.commonwl.org/v1.0/) files from the [GATK documentatio

 ## Installation

-First, install the module
+First, install the module
 ```bash
 git clone https://github.com/wtsi-hgi/gatk-cwl-generator
 cd gatk-cwl-generator
@@ -13,38 +13,44 @@ python setup.py install

 You may also want to install [cwltool](https://github.com/common-workflow-language/cwltool) to run the generated CWL files

+## Requirements
+
+- Python 3
+- Docker and node.js for the tests
+
 ## Usage

 ```
-usage: gatkcwlgenerator [-h] [--version GATKVERSION] [--out OUTPUTDIR]
-                        [--include INCLUDE_FILE] [--dev]
-                        [--docker_container_name DOCKER_CONTAINER_NAME]
-                        [--gatk_location GATK_LOCATION]
+usage: gatk_cwl_generator [-h] [--version VERSION] [--out OUTPUT_DIR]
+                          [--include INCLUDE] [--dev] [--no_docker]
+                          [--docker_image_name DOCKER_IMAGE_NAME]
+                          [--gatk_command GATK_COMMAND]

 Generates CWL files from the GATK documentation

 optional arguments:
   -h, --help            show this help message and exit
-  --version GATKVERSION, -v GATKVERSION
+  --version VERSION, -v VERSION
                         Sets the version of GATK to parse documentation for.
-                        Default is 3.5
-  --out OUTPUTDIR, -o OUTPUTDIR
+                        Default is 3.5-0
+  --out OUTPUT_DIR, -o OUTPUT_DIR
                         Sets the output directory for generated files. Default
                         is ./gatk_cmdline_tools/<VERSION>/
-  --include INCLUDE_FILE
-                        Only generate this file (note, CommandLinkGATK has to
+  --include INCLUDE     Only generate this file (note, CommandLinkGATK has to
                         be generated for v3.x)
   --dev                 Enable network caching and overwriting of the
                         generated files (for development purposes). Requires
                         requests_cache to be installed
-  --docker_container_name DOCKER_CONTAINER_NAME, -c DOCKER_CONTAINER_NAME
-                        Docker container name for generated cwl files. Default
-                        is 'broadinstitute/gatk3:<VERSION>' for version 3.x and
+  --no_docker           Make the generated CWL files not use docker
+                        containers. Default is False.
+  --docker_image_name DOCKER_IMAGE_NAME, -c DOCKER_IMAGE_NAME
+                        Docker image name for generated cwl files. Default is
+                        'broadinstitute/gatk3:<VERSION>' for version 3.x and
                         'broadinstitute/gatk:<VERSION>' for 4.x
-  --gatk_location GATK_LOCATION, -l GATK_LOCATION
-                        Location of the gatk jar file. Default is
-                        '/usr/GenomeAnalysisTK.jar' for gatk 3.x and
-                        '/gatk/gatk.jar' for gatk 4.x
+  --gatk_command GATK_COMMAND, -l GATK_COMMAND
+                        Command to launch GATK. Default is 'java -jar
+                        /usr/GenomeAnalysisTK.jar' for gatk 3.x and 'java -jar
+                        /gatk/gatk.jar' for gatk 4.x
 ```

 This has been tested on versions 3.5-3.8 and generates files for version 4 (though some parameters are unknown and default to outputting a string).
@@ -82,14 +88,10 @@ The generated CWL files can also be found in the [releases](https://github.com/w

 ## Tests

-First install the test requirements
-```
-pip install -r test_requirements.txt
-```
-Then add example data to `cwl-example-data` such that `examples/HaplotypeCaller_inputs.yml` will run, then:
-
+Install the tests requirements, then run the tests. Note: docker must be installed in order to run the tests (the cwl files are tested during the tests):
 ```bash
-py.test -v gatkcwlgenerator/tests/test.py
+pip install -r test_requirements.txt
+pytest gatkcwlgenerator
 ```

 You can also run the tests in parallel with `-n` to improve performance
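
Reviewer note: the renamed and added options in the usage text could be declared roughly as below. This is a minimal sketch, not the parser from this commit. `build_parser` and `resolve_defaults` are hypothetical names, and the rule "a version string starting with 3 means GATK 3.x" is an assumption made only to reproduce the defaults quoted in the help text.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the options in the updated usage text."""
    parser = argparse.ArgumentParser(
        prog="gatk_cwl_generator",
        description="Generates CWL files from the GATK documentation",
    )
    parser.add_argument("--version", "-v", default="3.5-0",
                        help="GATK version to parse documentation for")
    parser.add_argument("--no_docker", action="store_true",
                        help="Make the generated CWL files not use docker containers")
    # None means "derive the default from the GATK version" (see resolve_defaults).
    parser.add_argument("--docker_image_name", "-c", default=None)
    parser.add_argument("--gatk_command", "-l", default=None)
    return parser


def resolve_defaults(args):
    """Fill in the version-dependent defaults described in the help text."""
    # Assumption: a version string starting with "3" denotes GATK 3.x.
    is_gatk3 = args.version.startswith("3")
    if args.docker_image_name is None:
        image = "broadinstitute/gatk3" if is_gatk3 else "broadinstitute/gatk"
        args.docker_image_name = "{}:{}".format(image, args.version)
    if args.gatk_command is None:
        jar = "/usr/GenomeAnalysisTK.jar" if is_gatk3 else "/gatk/gatk.jar"
        args.gatk_command = "java -jar " + jar
    return args


# Example: parse an explicit argument list instead of sys.argv.
example = resolve_defaults(build_parser().parse_args(["-v", "3.8-0", "--no_docker"]))
print(vars(example))
```

Keeping the two version-dependent defaults as `None` until after parsing lets an explicit `-c` or `-l` value win without special-casing.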

build.sh

Lines changed: 2 additions & 2 deletions
@@ -10,7 +10,7 @@ VERSIONS=( 3.5-0 3.6-0 3.7-0 3.8-0 4.beta.6 )
 tarbase="gatk-cwl-generator-${generator_version}-gatk_cmdline_tools"

 tmpdir=$(mktemp -d)
-python_bin=$(which python)
+python_bin=$(which python3)
 echo "Using ${python_bin} to generate temporary virtualenv ${tmpdir}/venv"
 set -x
 ${python_bin} -m virtualenv "${tmpdir}/venv"
@@ -33,7 +33,7 @@ for ver in ${VERSIONS[@]}
 do
     echo "Generating CWL for GATK version ${ver}"
     set -x
-    PYTHONPATH=. python gatkcwlgenerator -v ${ver} -o "${builddir}/${ver}" "$@"
+    PYTHONPATH=. python -m gatkcwlgenerator -v ${ver} -o "${builddir}/${ver}" "$@"
     set +x
 done

examples/HaplotypeCaller_inputs.yml

Lines changed: 4 additions & 4 deletions
@@ -1,18 +1,18 @@
 reference_sequence:
   class: File
   #path: /path/to/fasta/ref/file
-  path: ../cwl-example-data/chr22_cwl_test.fa
+  path: ../cwl-example-data/chr22_cwl_test.fa
 refIndex:
   class: File
   #path: /path/to/index/file
-  path: ../cwl-example-data/chr22_cwl_test.fa.fai
+  path: ../cwl-example-data/chr22_cwl_test.fa.fai
 refDict:
   class: File
   #path: /path/to/dict/file
-  path: ../cwl-example-data/chr22_cwl_test.fa.dict
+  path: ../cwl-example-data/chr22_cwl_test.fa.dict
 input_file: #must be BAM or CRAM
   class: File
-  #path: /path/to/input/file
+  #path: /path/to/input/file
   path: ../cwl-example-data/chr22_cwl_test.cram
 out: out.gvcf.gz
 intervals: [chr22:10591400-10591600]

gatkcwlgenerator/VERSION

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+1.4.0

gatkcwlgenerator/__init__.py

Lines changed: 6 additions & 1 deletion
@@ -1,3 +1,8 @@
 from .main import *
 import gatkcwlgenerator.json2cwl
-import gatkcwlgenerator.gen_cwl_arg
+import gatkcwlgenerator.gen_cwl_arg
+
+import os.path as path
+
+with open(path.join(path.dirname(__file__), "VERSION"), "r") as _version_file:
+    __version__ = _version_file.read()
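
Reviewer note: `read()` returns the file contents verbatim, so if the `VERSION` file ends with a newline, `__version__` will carry it too. The diff does not show whether the committed file has a trailing newline, so the sketch below only illustrates a defensive variant. `read_version` is a hypothetical helper, and the temporary directory stands in for the package directory.

```python
import os.path as path
import tempfile


def read_version(package_dir):
    """Read a VERSION file and drop surrounding whitespace (hypothetical helper)."""
    with open(path.join(package_dir, "VERSION"), "r") as version_file:
        return version_file.read().strip()


# Demo: a throwaway directory stands in for the package directory.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(path.join(demo_dir, "VERSION"), "w") as handle:
        handle.write("1.4.0\n")  # assumed content, with a trailing newline
    demo_version = read_version(demo_dir)

print(repr(demo_version))  # '1.4.0'
```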

gatkcwlgenerator/__main__.py

Lines changed: 2 additions & 2 deletions
@@ -1,3 +1,3 @@
-import main
+from .main import cmdline_main

-main.main()
+cmdline_main()
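
Reviewer note: the switch to a relative import here is presumably why `build.sh` now invokes `python -m gatkcwlgenerator`. In Python 3, a package's `__main__.py` can use `from .main import ...` only when the package is run as a module. The sketch below reproduces this with a throwaway package; `demo_pkg` and `run_both_ways` are made-up names, not part of this repository.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_both_ways():
    """Run a tiny package with `-m` and by directory path; return both results."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"  # hypothetical package name
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "main.py").write_text("def cmdline_main():\n    print('ok')\n")
        (pkg / "__main__.py").write_text(
            "from .main import cmdline_main\n\ncmdline_main()\n"
        )
        # Run as a module: the relative import has a parent package.
        as_module = subprocess.run(
            [sys.executable, "-m", "demo_pkg"],
            cwd=tmp, capture_output=True, text=True,
        )
        # Run the directory as a script: __main__.py has no parent package.
        as_path = subprocess.run(
            [sys.executable, "demo_pkg"],
            cwd=tmp, capture_output=True, text=True,
        )
        return as_module, as_path


as_module, as_path = run_both_ways()
print("python -m demo_pkg ->", as_module.returncode)
print("python demo_pkg    ->", as_path.returncode)
```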

gatkcwlgenerator/cwl_ast.py

Lines changed: 121 additions & 0 deletions
@@ -0,0 +1,121 @@
+"""
+Classes to make an AST for CWL types
+"""
+import abc
+
+class CWLType:
+    __metaclass__ = abc.ABCMeta
+
+    @abc.abstractmethod
+    def get_cwl_object(self):
+        pass
+
+    def is_array_type(self):
+        return False
+
+    @property
+    def children(self):
+        if hasattr(self, "inner_type"):
+            return [getattr(self, "inner_type")]
+        else:
+            raise AttributeError(repr(self) + " has no children")
+
+    def find_node(self, predicate):
+        """
+        Traverses the AST to find a node that satisfies the given predicate.
+        If no node is found, returns None
+        """
+        if predicate(self):
+            return self
+        else:
+            try:
+                return next(filter(None, (child.find_node(predicate) for child in self.children)))
+            except (AttributeError, StopIteration):
+                return None
+
+
+class CWLBasicType(CWLType):
+    def __init__(self, name):
+        self.name = name
+
+    def get_cwl_object(self):
+        return self.name
+
+
+class CWLArrayType(CWLType):
+    def __init__(self, inner_type):
+        self.inner_type = inner_type
+        self._input_binding = None
+
+    def is_array_type(self):
+        return True
+
+    def add_input_binding(self, inputBinding):
+        self._input_binding = inputBinding
+
+    def get_cwl_object(self):
+        inner_cwl_object = self.inner_type.get_cwl_object()
+
+        if isinstance(inner_cwl_object, str) and self._input_binding is None:
+            return inner_cwl_object + "[]"
+        else:
+            cwl_object = {
+                "type": "array",
+                "items": self.inner_type.get_cwl_object()
+            }
+
+            if self._input_binding is not None:
+                cwl_object["inputBinding"] = self._input_binding
+
+            return cwl_object
+
+
+class CWLUnionType(CWLType):
+    def __init__(self, *items):
+        self.items = items
+
+    def get_cwl_object(self):
+        cwl_object = []
+
+        for item in self.items:
+            if isinstance(item, CWLUnionType):
+                cwl_object.extend(item.get_cwl_object())
+            else:
+                cwl_object.append(item.get_cwl_object())
+
+        return cwl_object
+
+    @property
+    def children(self):
+        return self.items
+
+
+class CWLEnumType(CWLType):
+    def __init__(self, symbols):
+        self.symbols = symbols
+
+    def get_cwl_object(self):
+        return {
+            "type": "enum",
+            "symbols": self.symbols
+        }
+
+class CWLOptionalType(CWLType):
+    def __init__(self, inner_type):
+        self.inner_type = inner_type
+
+    def is_array_type(self):
+        return self.inner_type.is_array_type()
+
+    def get_cwl_object(self):
+        inner_cwl_object = self.inner_type.get_cwl_object()
+
+        if isinstance(inner_cwl_object, str):
+            return inner_cwl_object + "?"
+        elif isinstance(inner_cwl_object, list):
+            return ["null"] + inner_cwl_object
+        else:
+            return [
+                "null",
+                inner_cwl_object
+            ]
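
Reviewer note: the condensed sketch below shows how these node classes compose. It is a trimmed re-statement for illustration, not the file from this commit: `children`, `find_node`, the union and enum types are left out, and the `--tag` prefix is a made-up binding. One deliberate difference: it declares the base class with `abc.ABC`, because in Python 3 (which the README now requires) a `__metaclass__` class attribute is ignored, so `@abc.abstractmethod` is not enforced in the committed form.

```python
import abc


class CWLType(abc.ABC):  # Python 3 spelling of an abstract base class
    @abc.abstractmethod
    def get_cwl_object(self):
        """Return the plain-Python form of this CWL type."""


class CWLBasicType(CWLType):
    def __init__(self, name):
        self.name = name

    def get_cwl_object(self):
        return self.name


class CWLArrayType(CWLType):
    def __init__(self, inner_type):
        self.inner_type = inner_type
        self._input_binding = None

    def add_input_binding(self, input_binding):
        self._input_binding = input_binding

    def get_cwl_object(self):
        inner = self.inner_type.get_cwl_object()
        # Use the "type[]" shorthand only when nothing else must be attached.
        if isinstance(inner, str) and self._input_binding is None:
            return inner + "[]"
        cwl_object = {"type": "array", "items": inner}
        if self._input_binding is not None:
            cwl_object["inputBinding"] = self._input_binding
        return cwl_object


class CWLOptionalType(CWLType):
    def __init__(self, inner_type):
        self.inner_type = inner_type

    def get_cwl_object(self):
        inner = self.inner_type.get_cwl_object()
        if isinstance(inner, str):
            return inner + "?"
        if isinstance(inner, list):
            return ["null"] + inner
        return ["null", inner]


# An optional array of strings collapses to the shorthand form.
plain = CWLOptionalType(CWLArrayType(CWLBasicType("string")))
print(plain.get_cwl_object())  # string[]?

# With an input binding (hypothetical prefix), the long form is required.
tagged_array = CWLArrayType(CWLBasicType("string"))
tagged_array.add_input_binding({"prefix": "--tag"})
tagged = CWLOptionalType(tagged_array)
print(tagged.get_cwl_object())
# ['null', {'type': 'array', 'items': 'string', 'inputBinding': {'prefix': '--tag'}}]

# With abc.ABC the abstract base can no longer be instantiated directly.
try:
    CWLType()
except TypeError as error:
    print("abstract:", error)
```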
