A pipeline combining odgi, component_segmentation, and Schematize, runnable as a Docker image or through CWL.
Docker must be installed before running the pipeline.
git clone https://github.com/graph-genome/pipeline
cd pipeline
docker build -t pipeline .
pip install arvados-cwl-runner
# For local execution:
cwltool --cachedir $PWD/cache --parallel graph-genome-previz.cwl example_plain.yml
# Or, with arvados-cwl-runner:
arvados-cwl-runner graph-genome-previz.cwl example_arvados.yml
Suppose that the input file is "data.gfa".
cp /path/to/your/data.gfa .
docker run -ti --rm --publish=3000:3000 --volume=`pwd`:/usr/src/app/data pipeline data/data.gfa
# Use the -w argument to change the bin width.
docker run -ti --rm --publish=3000:3000 --volume=`pwd`:/usr/src/app/data pipeline data/data.gfa -w 10000
# Use the -s argument to change the sort option.
docker run -ti --rm --publish=3000:3000 --volume=`pwd`:/usr/src/app/data pipeline data/data.gfa -w 10000 -s Sn
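The commands above mount the current directory at /usr/src/app/data, so a file copied to ./data.gfa is passed to the pipeline as data/data.gfa. As a minimal sketch of that mapping, the helper below (container_gfa_path is a hypothetical name, not part of the pipeline) checks that the file exists in the current directory and prints the path to pass to the container.

```shell
# Hypothetical helper, not part of the pipeline: map a GFA file name in the
# current directory to the path the container sees.
# Assumes the current directory is mounted at /usr/src/app/data, as in the
# docker run commands above.
container_gfa_path() (
  file="$1"   # file name in the current directory, e.g. data.gfa
  if [ ! -f "./$file" ]; then
    echo "error: $file not found in $(pwd)" >&2
    exit 1    # leaves only the subshell, so the function returns 1
  fi
  printf 'data/%s\n' "$file"
)
```

For example, after copying data.gfa into the current directory, `container_gfa_path data.gfa` prints the first argument to give to the image.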
Open http://localhost:3000/ in a browser. The production build of Schematize is running there.
The Pathindex server runs in the same container as Schematize, on port 3010. Users need to specify the host of the server.
# Publish port 3000 for the Schematize server and port 3010 for the odgi server.
# --port: the host's port that exposes the odgi server (the host side of the 3010 mapping).
# --host: the host name that exposes the odgi server.
docker run -ti --rm \
--publish=3000:3000 \
--publish=3010:3010 \
--volume=`pwd`:/usr/src/app/data pipeline data/data.gfa -w 10000 -s Sn \
--port 3010 \
--host localhost
To expose the odgi server at example.com:3020 instead, run:
# Publish port 3000 for the Schematize server and map host port 3020 to the odgi server's port 3010.
# --port: the host's port that exposes the odgi server (the host side of the 3010 mapping).
# --host: the host name that exposes the odgi server.
docker run -ti --rm \
--publish=3000:3000 \
--publish=3020:3010 \
--volume=`pwd`:/usr/src/app/data pipeline data/data.gfa -w 10000 -s Sn \
--port 3020 \
--host "example.com"
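The host port in the odgi --publish mapping and the value of --port must agree. One way to keep them in sync is to derive both from a single variable. The sketch below only prints the command rather than running it. odgi_run_cmd is a hypothetical helper name, and the image tag pipeline and the container-side port 3010 are taken from the commands above.

```shell
# Hypothetical helper, not part of the pipeline: print (do not run) a docker
# command in which the published host port and --port come from one value.
odgi_run_cmd() (
  gfa="$1"    # path as seen inside the container, e.g. data/data.gfa
  host="$2"   # host name that exposes the odgi server
  port="$3"   # host port that exposes the odgi server
  printf 'docker run -ti --rm --publish=3000:3000 --publish=%s:3010 --volume=%s:/usr/src/app/data pipeline %s --port %s --host %s\n' \
    "$port" "$PWD" "$gfa" "$port" "$host"
)

# Print the command for example.com:3020; review it, then paste it to run.
odgi_run_cmd data/data.gfa example.com 3020
```

Printing first keeps the sketch safe to try on a machine without the image built. Extra options such as -w or -s can be appended to the printed command by hand.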
The following odgi / Schematize options can be changed:
- gfa name (first argument, mandatory)
- -w: the bin width on odgi (optional, default: 1000)
- -s: the sort option on odgi sort (optional, default: bSnSnS)
- -t: the threads option on odgi (optional, default: 12)
- -c: the cells-per-file option on component_segmentation (optional)
- -i: the host of odgi index (optional, default: localhost)
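To make the defaults listed above explicit, the following sketch assembles the pipeline arguments from environment variables and falls back to the documented default when a variable is unset. The function name pipeline_args and the variable names (BIN_WIDTH, SORT_OPT, THREADS, INDEX_HOST, CELLS_PER_FILE) are hypothetical and chosen only for illustration.

```shell
# Hypothetical helper, not part of the pipeline: print the argument list,
# using the documented defaults for any variable that is unset.
pipeline_args() (
  gfa="$1"   # mandatory first argument, e.g. data/data.gfa
  args="$gfa -w ${BIN_WIDTH:-1000} -s ${SORT_OPT:-bSnSnS} -t ${THREADS:-12} -i ${INDEX_HOST:-localhost}"
  # -c has no documented default, so it is added only when CELLS_PER_FILE is set.
  if [ -n "${CELLS_PER_FILE:-}" ]; then
    args="$args -c $CELLS_PER_FILE"
  fi
  printf '%s\n' "$args"
)

pipeline_args data/data.gfa
# With none of the variables set, prints:
# data/data.gfa -w 1000 -s bSnSnS -t 12 -i localhost
```

The printed string can be appended to a docker run command after the image name, for example after `--volume=`pwd`:/usr/src/app/data pipeline`.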
To print the full list of arguments, run:
docker run -ti --rm --publish=3000:3000 --volume=`pwd`:/usr/src/app/data pipeline -h
git clone https://github.com/graph-genome/component_segmentation # For debugging component_segmentation
git clone https://github.com/graph-genome/Schematize # For debugging Schematize
docker run -d --publish=3000:3000 --publish=3010:3010 --volume=`pwd`:/usr/src/app/data --volume=`pwd`/Schematize:/usr/src/app/Schematize --volume=`pwd`/component_segmentation:/usr/src/app/component_segmentation pipeline data/data.gfa -w 1000 -s s -c 10000
The pipeline then runs using the cloned component_segmentation and Schematize. The Docker container fails, but the output JSON file is stored in the Schematize directory, so running yarn start in the Schematize directory works.