Aves use case scalability challenge #8

nfranz · 2015-12-23T21:02:15Z

For the current Aves use case, we have single, working input datasets that extend from the root (Class) to the Order level, and also to the Family level. At present, however, we apparently cannot scale to the species level with a single input file using the Euler/X default reasoners. We therefore need to partition the root-to-species file into two complementary datasets, each of which is consistent and "solvable", provisionally called:

(1) 2015-Pala_Neoa_Grade_Species_Complete.txt and
(2) 2015-Acci_Aust_Clade_Species_Complete.txt

Originally there were three species-level partitions (each of these also runs to completion):

(A) 2015-Pala_Gall_Grade-Species-Complete.txt => 6 kb
(B) 2015-Neoaves-Part-Species-Complete.txt => 22 kb
(C) 2015-Acci_Aust_Clade-Species-Complete.txt => 23 kb

(2) and (C) above are identical.
(1) above is a merge of (A) and (B), with 174 x 409 concepts and 71,166 MIR. Running (1) on my laptop with "euler2 align" took 10.5 hours but was successful. However, running a merge of (B) and (C) above, called

(3) 2015-Neoaves-All-Species-Cannot-Process.txt

produced an "inconsistent/repair" output, I believe also after more than 8-10 hours (overnight). Assuming that the (3) merge is actually consistent (it should be), this might mean that our scalability limits currently lie in the complexity range between (1) and (3).
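As a quick check on the numbers for (1): the 71,166 MIR figure is exactly the product 174 x 409. The sketch below assumes that 174 and 409 are the concept counts of the two input taxonomies and that one MIR is reported per cross-taxonomy concept pair; the function name `mir_count` is made up for illustration.

```python
def mir_count(n_concepts_t1: int, n_concepts_t2: int) -> int:
    """Number of cross-taxonomy concept pairs, assuming one MIR per pair."""
    return n_concepts_t1 * n_concepts_t2


# Concept counts reported above for merge (1)
pairs = mir_count(174, 409)
print(pairs)  # 71166
```

Under that assumption, the number of relations to resolve grows with the product of the two taxonomy sizes, so adding species to either tree raises the MIR count by the full size of the other tree.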

The input files mentioned above, and the successful 10.5-hour run of (1), are in the following Dropbox folder:

Dropbox/Euler-Runs/BirdPhylogenies/Scalability-Challenge

Issues:

(i) Can others replicate these results?
(ii) Can we overcome the challenge of scaling to the level of complexity of (3), either with conventional or with custom reasoners?
(iii) Note that "no coverage" is used 85 times in (3), to account for differential species-level sampling across the two input trees.
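For (i), anyone replicating the runs may want to record wall-clock time in a comparable way. The following is only a minimal sketch: `timed_run` is a hypothetical helper, and passing the input file as the sole argument to "euler2 align" is an assumption based on the command named above. It also does not assume that the exit code distinguishes a consistent result from an "inconsistent/repair" one, so the Euler/X output still needs to be inspected by hand.

```python
import subprocess
import time
from typing import List, Tuple


def timed_run(cmd: List[str]) -> Tuple[int, float]:
    """Run cmd to completion; return (exit code, wall-clock seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(cmd)
    elapsed = time.perf_counter() - start
    return completed.returncode, elapsed


# Hypothetical usage (assumes euler2 is on PATH and the file is in the
# current directory); left commented out because a run may take many hours:
# code, seconds = timed_run(
#     ["euler2", "align", "2015-Pala_Neoa_Grade_Species_Complete.txt"]
# )
# print(f"exit={code} hours={seconds / 3600:.1f}")
```

Reporting the elapsed hours alongside the machine specification would make the results for (1) and (3) easier to compare across laptops.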

nfranz added the bug, enhancement, help wanted, and question labels and removed the help wanted and question labels on Dec 23, 2015
