Be creative - create/import workflows #19

Open
bgruening opened this issue Dec 19, 2016 · 35 comments

@bgruening
Owner

We have a variety of tools and visualisations - we should create example workflows for them and cover use-cases.

Can everyone come up with at least one workflow?

@dyusuf dyusuf self-assigned this Dec 21, 2016
@bgruening
Owner Author

You can use the existing Docker image to create a workflow, export it as a *.ga file, and submit it to this repo. It should be as easy as that.

@s-will has already mentioned having a few ideas as well.
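
Before submitting an exported file, it can help to check that it parses and lists the expected steps. The sketch below assumes the usual layout of a *.ga export as a JSON document with an `a_galaxy_workflow` marker, a `name`, and a `steps` mapping keyed by stringified integers; the function name `summarize_workflow` and the toy workflow are hypothetical and only serve as illustration.

```python
import json


def summarize_workflow(ga_text):
    """Return (workflow name, list of step names) from the JSON text of a .ga file."""
    workflow = json.loads(ga_text)
    # Assumption: exported Galaxy workflows carry this marker; reject anything else.
    if workflow.get("a_galaxy_workflow") != "true":
        raise ValueError("not a Galaxy workflow export")
    steps = workflow.get("steps", {})
    # Step keys are stringified integers, so sort them numerically.
    names = [steps[key].get("name", "?") for key in sorted(steps, key=int)]
    return workflow.get("name", ""), names


# Toy stand-in for the contents of a real export (hypothetical data).
example = json.dumps({
    "a_galaxy_workflow": "true",
    "name": "Toy workflow",
    "steps": {"0": {"name": "Input dataset"}, "1": {"name": "TopHat"}},
})
print(summarize_workflow(example))
```

For a real file, the text would come from something like `open("my_workflow.ga").read()`, where the file name is again a placeholder.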

@dyusuf
Collaborator

dyusuf commented Dec 24, 2016

@bgruening @s-will I am thinking of using the RNA-seq workflow that we used in our courses.

@bgruening
Owner Author

Sounds good. Hopefully all tools are integrated. Can you also create one with one of your tools?

@bgruening
Owner Author

@dyusuf: @jfallmann is working on tours for the main training material https://github.com/bgruening/training-material/blob/master/RNA-Seq/tutorials/ref_based.md
I guess it would be nice to have a workflow for this one, so that everything is coherent.

@jfallmann
Collaborator

@bgruening @dyusuf I think that's a good idea. The tour should be easy to adapt to other datasets later if needed.

@dyusuf
Collaborator

dyusuf commented Dec 27, 2016

@bgruening @jfallmann sure, I will do so.

@dyusuf
Collaborator

dyusuf commented Dec 27, 2016

@bgruening

I am trying to use a subworkflow. After adding one, no options are available for it, even though it works fine as an independent workflow.

Any hints?

@dyusuf
Collaborator

dyusuf commented Dec 27, 2016

@bgruening the subworkflow only works when added with "copy and insert individual steps".

@bgruening
Owner Author

@dyusuf let's use the copy and insert for now. I think this is an upstream bug.
@s-will are you working on a small example ViennaRNA workflow?

@dyusuf
Collaborator

dyusuf commented Dec 29, 2016

@bgruening for mapping, how do we handle large data such as genomes?

Usually, a Galaxy server has built-in indices...

@bgruening
Owner Author

We could include the reference genomes as well ... but this would blow up the image. Can we give a link to the fasta reference file (UCSC link) to download as well?

@dyusuf
Collaborator

dyusuf commented Dec 30, 2016

@bgruening

When I use tophat for mapping, the outputs are empty and no errors are reported.

With the same datasets, the Freiburg server generates non-empty outputs.

Could you run a simple mapping with any dataset to see whether the issue can be reproduced?

PS: the reference is not built-in but taken from the history.

@dyusuf
Collaborator

dyusuf commented Dec 30, 2016

@bgruening

Please look at the following two examples. It seems tophat may have an issue on the workbench.

workbench, empty outputs: http://192.52.2.98/u/admin/h/unnamed-history

freiburg, outputs with sizes: http://galaxy.uni-freiburg.de/u/dilmurat-yusuf/h/testtophat

@bgruening
Owner Author

I guess this is because tophat needs more memory and we need to change the job_conf.xml

@dyusuf
Collaborator

dyusuf commented Dec 30, 2016

@bgruening let me know when it is fixed so I can test the workflow.

@dyusuf
Collaborator

dyusuf commented Jan 16, 2017

@bgruening I want to integrate dexseq into the workflow, but installing the Galaxy tool invokes the old dependency resolver, which has some problems with the dependencies. I am wondering whether there is a version that relies on the conda package.

@bgruening
Owner Author

@dyusuf I think we have a conda package for it, but I never found the time to update the wrapper :(

@dyusuf dyusuf closed this as completed Jan 16, 2017
@dyusuf dyusuf reopened this Jan 16, 2017
@dyusuf
Collaborator

dyusuf commented Jan 16, 2017

@bgruening I am trying to update the dexseq wrappers to use the available conda packages.

So far, it has gone well with dexseq.xml.

For dexseq_count.xml, an environment variable $DEXSEQ_ROOT is required to access two Python scripts in the DEXSeq package.

https://github.com/galaxyproject/tools-iuc/blob/643293a896a2ccfac10fe995f48c7f01c1a89a7f/tools/dexseq/dexseq_count.xml#L19

With the old resolver, the variable can be set with set_environment, which seems to be ignored by the conda resolver.

Do you know how to resolve this issue?

@bgruening
Owner Author

I guess the correct solution is to put the python scripts into PREFIX/bin in the conda package?

@dyusuf
Collaborator

dyusuf commented Jan 17, 2017

@bgruening yes, for sure, that can be done at the level of the conda package. I was just wondering whether there is a solution on the Galaxy side so that I don't need to go to bioconda to fix the recipe :) Well, it seems to be the only solution.

By the way, I found that the bioconda recipe is for version 1.18.4 while the conda package is 1.16.6. Do you know why version 1.18.4 was not built?

https://github.com/bioconda/bioconda-recipes/blob/master/recipes/bioconductor-dexseq/meta.yaml

https://anaconda.org/bioconda/bioconductor-dexseq/files

@bgruening
Owner Author

Probably, just a lack of time on my side :(
There is a hacky way on the Galaxy side, involving which or find ;)
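
One possible reading of the `which`-based workaround is sketched here in Python: look up one of the package's scripts on the search path and derive its directory from the result, instead of relying on $DEXSEQ_ROOT. This only works if the scripts are executable and on PATH, which is what placing them in PREFIX/bin would achieve. The helper name `find_script_dir` is hypothetical, and the script name `dexseq_count.py` is an assumption about what the package ships.

```python
import os
import shutil


def find_script_dir(script_name, path=None):
    """Return the directory holding script_name on the search path, or None if absent."""
    # shutil.which mirrors the shell's `which`: it only finds executable files.
    located = shutil.which(script_name, path=path)
    if located is None:
        return None
    # Resolve symlinks so the result points at the real install location.
    return os.path.dirname(os.path.realpath(located))


# Assumed script name; prints None when the script is not on PATH.
print(find_script_dir("dexseq_count.py"))
```

Passing `path` explicitly lets the lookup be restricted to a single environment's bin directory rather than the whole PATH.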

@bgruening
Owner Author

Thanks Dili!

@dyusuf
Collaborator

dyusuf commented Jan 18, 2017

@bgruening regarding the tophat failure, the following error might be the culprit:

/export/galaxy-central/database/job_working_directory/000/36/conda-env/bin/tophat2: /export/tool_deps/_conda/pkgs/tophat-2.1.0-py35_0/bin/tophat: /opt/anaconda1anaconda2anaconda3/bin/python: bad interpreter: No such file or directory

This might be an issue with the conda package.

@dyusuf
Collaborator

dyusuf commented Jan 18, 2017

@bgruening I ran planemo test on the Galaxy tophat2 tool (on my Ubuntu machine) and there was no error with the conda dependencies. Is it possible that something is wrong with the Docker setup in relation to conda?

@bgruening
Owner Author

According to @dpryan79 tophat never worked on python3, so the shebang is probably python2 in this script, which is not available? bioconda/bioconda-recipes#3513
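
A "bad interpreter" error means the interpreter named on the script's first line does not exist on the machine. A minimal sketch for checking this on any installed script follows; the helper names are hypothetical, and the check only covers shebangs that name an absolute interpreter path, as the one in the error above does.

```python
import os


def shebang_interpreter(script_path):
    """Return the interpreter path named on a script's #! line, or None if there is none."""
    with open(script_path, "rb") as handle:
        first_line = handle.readline().decode("utf-8", errors="replace").strip()
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    return parts[0] if parts else None


def has_broken_shebang(script_path):
    """True when the script names an interpreter that is missing on this machine."""
    interpreter = shebang_interpreter(script_path)
    return interpreter is not None and not os.path.exists(interpreter)
```

Running `has_broken_shebang` on the tophat script from the error message would show whether the interpreter path is still missing after a package update.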

@dyusuf
Collaborator

dyusuf commented Jan 19, 2017

@bgruening so what is the plan to make tophat work? A workbench without this aligner does not sound promising.

@bgruening
Owner Author

We should fix it ;)
Can you try again? Hopefully it is fixed :)

@bgruening
Owner Author

@s-will can you get one workflow/tour up and running?

@bgruening
Owner Author

@yhoogstrate is any small workflow possible? Maybe with a nice description, so that we get a vis after your WF?
@TorHou can you also please add one.
@bagnacan can the Rostock group please also add one or two ... preferably with the tools already included.

@yhoogstrate
Contributor

Are we talking about RNA 2D structure, and what is the deadline? I won't be able to find time this weekend.

@bgruening
Owner Author

Yes, 2D vis, or the dotplot, but having one workflow attached to a vis would be nice.
Deadline ... mh, I assume the reviews will be sent out in one week at the latest.

@dyusuf
Collaborator

dyusuf commented Mar 8, 2017

@jfallmann Galaxy-Workflow-AREsite2_CLIP_analysis.ga needs to be added to the Dockerfile and import_workflows.py, otherwise the integration won't happen.

@jfallmann
Collaborator

@dyusuf Will do as soon as I get the container running for testing again ;)

@TKlingstrom

@bebatut was kind enough to create the Ref based workflow for me a while ago, which Björn mentioned in an early entry to this thread. I have added a link to it below and will test it on my RNA Workbench docker image (clean except for the update of HTseq-count mentioned here). I will add an update here as soon as the workflow is finished.

The workflow: https://galaxy.uni-freiburg.de/u/tomas/w/imported-ref-based-rna-seq-tutorial-full-workflow

Installation related issues:

  • For some reason uploading the file did not work (server error) but importing it directly from Freiburg worked.
  • Currently the RNA workbench docker lacks MultiQC

@TKlingstrom

TKlingstrom commented Mar 12, 2018

Two more issues for running it on the Galaxy RNA workbench:

  1. The RNA Ref workflow on Freiburg cannot be imported by URL, but uploading it as a file worked fine. It also failed to import to Usegalaxy.org.

Failed to open URL: https://galaxy.uni-freiburg.de/u/tomas/w/imported-ref-based-rna-seq-tutorial-full-workflow/json . Exception: No connection adapters were found for

  2. There are no HISAT2 reference genomes available, which I thought should be available when running the docker in --privileged mode.
    docker run -it -d -p 8083:80 --privileged quay.io/bgruening/galaxy-rna-workbench

But it seems to be running well with a reference genome uploaded to history and the workflow modified to run a later HTseq version and the conditional for reference genome edited in the workflow editor. I hope this may be of help after your meeting in Freiburg if you decide to include the workflow later.
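
The "No connection adapters were found for" message quoted above has the form of the error the Python requests library raises when a URL does not start with a scheme it can handle, for example because of a missing `https://` or stray characters in front of it. Whether that is what happened here cannot be told from the truncated message. As a sketch under that assumption, a URL can be cleaned and checked before it is handed to an importer; the function name `normalize_workflow_url` is hypothetical.

```python
from urllib.parse import urlsplit


def normalize_workflow_url(raw_url):
    """Strip surrounding whitespace and require an http(s) URL with a host."""
    cleaned = raw_url.strip()
    parts = urlsplit(cleaned)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("not a usable http(s) URL: %r" % raw_url)
    return cleaned


# A pasted URL with accidental surrounding whitespace is cleaned and accepted.
print(normalize_workflow_url(
    " https://galaxy.uni-freiburg.de/u/tomas/w/imported-ref-based-rna-seq-tutorial-full-workflow/json "
))
```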
