Empty files as output #11

mokrobial · 2021-09-29T23:37:22Z

Apologies if I have missed a setup step. I installed successfully with conda and was able to run the test data fq without issue. However, when I run Haploflow on my own fastq, the output is empty. I've tried several different files and the result is always the same: 0 vertices. Do reads require some kind of pre-processing first?

Log:
Building deBruijnGraph...
Building deBruijnGraph took 0.485469 seconds.
deBruijnGraph has 0 vertices
Building unitig graph from deBruijn graph...
Getting connected components
Getting CCs took 1.4e-05 seconds
Calculating coverage distribution
Calculating coverage distribution took 3.4e-05 seconds
Unitig graph successfully build in 0.000138 seconds.
Unitig graph has 0 vertices
Assembling...
Cleaning graph
Assembly complete
Assembly took 0.00033 seconds
The complete assembly process took 0.485904 seconds.

AlphaSquad · 2021-09-30T10:04:36Z

So the test data works fine, but your own data does not? Odd.
Could you provide your read file or a snippet of it? Also, what value (if any) did you provide for k, and how long are your reads?

mokrobial · 2021-10-22T22:18:23Z

I didn't set --k initially. I just tried again with it set to 39 and still get empty folders. Read length is 2x150.

Github won't let me include a zip file so I've put one here:
https://drive.google.com/drive/folders/1SFkD2dDKU1GLpdtEcPoqvLcvGoEYY-fY?usp=sharing

Thanks much!

AlphaSquad · 2021-11-12T14:49:27Z

Hi, sorry that it took so long. I have tested the read files you provided and found that for most of the files all contigs were shorter than 500 bp. Haploflow does not report contigs shorter than 500 bp by default, so no contigs were reported.
This can happen for two reasons. Either there are too many strains in the sample, in which case Haploflow cannot distinguish them by their coverage and avoids misassemblies by breaking contigs apart. Or there is no clear signal in the data, because no genome is covered more than, let's say, 4x, or because there are too many errors.
Haploflow reports all contigs if the filter option is set to 0, but that probably does not make too much sense.
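
For anyone who wants to check the "no clear signal" case on their own reads before running the assembler, here is a minimal sketch. It is not part of Haploflow; the function names and the `genome_length` parameter are made up for illustration, and it assumes an uncompressed FASTQ with exactly four lines per record. It counts reads that are shorter than k (those contribute no k-mers) and gives a rough mean coverage for an expected genome size.

```python
from typing import Dict, Iterator


def read_lengths(fastq_path: str) -> Iterator[int]:
    """Yield the length of each read in an uncompressed, 4-line-per-record FASTQ."""
    with open(fastq_path, "r") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # second line of each record is the sequence
                yield len(line.strip())


def summarize_fastq(fastq_path: str, k: int, genome_length: int) -> Dict[str, float]:
    """Summarize read count, read lengths, and a rough mean coverage.

    genome_length is the user's own estimate of the target genome size
    (an assumption for this sketch, not something Haploflow asks for).
    """
    lengths = list(read_lengths(fastq_path))
    total_bases = sum(lengths)
    return {
        "reads": len(lengths),
        "min_length": min(lengths, default=0),
        "max_length": max(lengths, default=0),
        # Reads shorter than k cannot contribute any k-mer to the graph.
        "reads_shorter_than_k": sum(1 for n in lengths if n < k),
        "mean_coverage": total_bases / genome_length if genome_length else 0.0,
    }
```

If `reads` is 0, the file was probably not read as intended (wrong path, compressed input, or a different format). If `mean_coverage` is only a few x, short contigs below the 500 bp filter are the likely outcome described above.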

Ruchank1 · 2021-11-15T23:18:31Z

Hi, I am getting the same issue (empty folders, 0 vertices) with the test data file as well. Can you please help me with that?
Thank you.

AlphaSquad · 2021-11-16T22:39:48Z

Could you post the command you used and the output you received?

Ruchank1 · 2021-11-17T11:02:41Z

Sure.
The command: haploflow --read-file .../forward.fastq --out test --log test/log
The output was empty subfolders in a folder named test, and the log file looked like this:
Building deBruijnGraph...
Building deBruijnGraph took 0.00039 seconds.
deBruijnGraph has 0 vertices
Building unitig graph from deBruijn graph...
Getting connected components
Getting CCs took 2.7e-05 seconds
Calculating coverage distribution
Calculating coverage distribution took 6.1e-05 seconds
Unitig graph successfully build in 0.000286 seconds.
Unitig graph has 0 vertices
Assembling...
Cleaning graph
Assembly complete
Assembly took 0.000669 seconds
The complete assembly process took 0.001141 seconds.

The number of vertices is 0.

AlphaSquad · 2021-11-17T12:07:51Z

Haploflow should use a meaningful default value for k, but it seems this is not working right now. Please retry with an explicit value for k, e.g. --k 41

Ruchank1 · 2021-11-17T12:42:48Z

I tried running the command with setting the k value, but it still shows 0 vertices.

AlphaSquad · 2021-11-17T14:38:26Z

Could you post your forward.fastq? I am asking because the toy data set is named HIV_3_toy.fq.

Ruchank1 · 2021-11-17T23:17:28Z

Hi, I actually tried with the HIV_3_toy.fq dataset as well and got the same output. So I can't really figure out what is happening.

AlphaSquad · 2021-11-18T01:05:47Z

It is odd. The only explanation I have is that Haploflow tries to read a non-existing file. Could you maybe try absolute paths for all files?

Ruchank1 · 2021-11-19T13:01:37Z

Yes, I tried giving absolute paths as well. I am still getting empty files as output. I installed Haploflow using conda, is there a possibility that I missed out on some step?

Ruchank1 · 2021-11-25T11:16:10Z

Hi, I tried it on a linux machine as well but it still gives 0 vertices as output. I cannot really locate the problem.

AlphaSquad · 2021-12-01T10:38:18Z

Hm okay, Haploflow was only tested on UNIX systems, but it is strange that it is not working on a Linux machine either. Unfortunately I am not really sure what to do here, since I cannot reproduce this problem.
I will, however, add a check for missing files. It may take a moment until this change is done and available on conda, and if no file is missing it will not solve your problem either.
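
Until such a check exists in Haploflow itself, a small pre-flight test can be run by hand. The sketch below is only an illustration and not Haploflow code; `check_read_file` is a hypothetical helper. It reports a missing file, an empty file, a gzip-compressed file (recognized by the gzip magic bytes), or a file whose first record does not start with `@` as a FASTQ header would.

```python
import os
from typing import List


def check_read_file(path: str) -> List[str]:
    """Return a list of problems found with a read file; empty list means none found."""
    if not os.path.isfile(path):
        return ["file does not exist"]
    if os.path.getsize(path) == 0:
        return ["file is empty"]

    with open(path, "rb") as handle:
        first_bytes = handle.read(2)

    problems = []
    if first_bytes == b"\x1f\x8b":  # gzip magic number
        problems.append("file looks gzip-compressed")
    elif not first_bytes.startswith(b"@"):  # FASTQ headers begin with '@'
        problems.append("first record does not start with '@'")
    return problems
```

Calling `check_read_file` on the same absolute path that is passed to `--read-file` should return an empty list before the assembler is started.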

reesea22 · 2023-01-26T22:24:11Z

I have been getting empty files as output for my data as well. When I attempt to run the toy dataset through haploflow I get the following error:
$ haploflow --read-file Haploflow/HIV_3_toy.fq.gz --out test --log test/log
terminate called after throwing an instance of 'std::out_of_range'
what(): vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)
Aborted (core dumped)

AlphaSquad · 2023-01-27T09:09:10Z

Are you also using the conda version/install? If yes, can you try to unzip the read file first?
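
For reference, unzipping can be done with `gunzip` on the command line, or with a few lines of Python as sketched here. The helper name `decompress_fastq` and the file names in the usage note are only examples.

```python
import gzip
import shutil


def decompress_fastq(gz_path: str, out_path: str) -> str:
    """Write the decompressed contents of gz_path to out_path and return out_path."""
    with gzip.open(gz_path, "rb") as compressed, open(out_path, "wb") as plain:
        # Stream in chunks so large read files do not need to fit in memory.
        shutil.copyfileobj(compressed, plain)
    return out_path
```

For example, `decompress_fastq("HIV_3_toy.fq.gz", "HIV_3_toy.fq")` would produce a plain-text file that can then be passed to `--read-file`.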

adelizamae · 2023-07-27T02:46:31Z

Hi, I also don't have output files. I'm not sure what I'm doing wrong. :(

I ran:
haploflow --read-file sample.fastq --k 41 --out test/ --log test/log/

But there is no output file except the Cov.tsv

AlphaSquad · 2023-07-27T08:53:42Z

Hi, I am sorry that Haploflow is not working out of the box for you. Unfortunately I will need a little bit more information to give you any feedback (since the command looks ok): Are you using the conda version or did you build Haploflow yourself? What do the log/Cov.tsv files say? How big is your sample.fastq and how long are the reads?

adelizamae · 2023-08-02T02:12:36Z

Hi, I tried both the conda version and my own build.
It turns out there are no contigs longer than 500 bp, which is why there is no output in my case.

I have SARS-CoV-2 long read sequences (produced by using ONT) and I would like to know what parameters I can use to do de novo assembly.

I uploaded my sample fastq to this Google Drive folder:
https://drive.google.com/drive/folders/1__4TscNV_LJyRbgzjGN-s3ehcB52S5zI?usp=sharing
I'm just starting to learn bioinformatics, so your help is greatly appreciated!

AlphaSquad added the enhancement label Dec 1, 2021

AlphaSquad self-assigned this Dec 1, 2021
