CAFE5 run is taking forever #100

zeyak · 2022-08-10T08:09:17Z

I've been running CAFE5 on around 12,000 orthogroups from 4 species using 32 CPUs. It's been a week and it is still searching for lambda values. I haven't set any values for lambda or gamma, because I wanted to see a vanilla CAFE run first. Is that why it is taking forever? Or is there a way to estimate the running time? The docs say that 200 lambda values will be run for 10,000 orthogroups, and I have only gotten to about 70 lambda values in a week.

Here is how I ran the code:

Thanks in advance!

Answered by hahnlab-user, Aug 14, 2022

benfulton (Maintainer) · 2022-08-10T12:56:22Z

No, it shouldn't take a week. It's clearly locked up somewhere, most likely because of a thread race condition. Try it with just 8 threads to minimize that possibility. In that case I would allow an hour per 1,000 groups, no more.
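
As a rough sketch of that rule of thumb (an hour per 1,000 groups, treated as an upper bound), the budget for the run sizes mentioned in this thread can be worked out as below. The function name `max_expected_hours` is made up for illustration and is not part of CAFE5.

```python
def max_expected_hours(n_groups, hours_per_thousand=1.0):
    """Upper bound on run time from the 'an hour per 1000 groups' rule of thumb."""
    return n_groups / 1000 * hours_per_thousand


# Run sizes mentioned in this thread.
for n in (12_000, 14_000):
    print(f"{n} orthogroups: at most about {max_expected_hours(n):.0f} hours")
```

A run that goes well beyond this budget is more likely stuck than slow.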

zeyak (Author) · 2022-08-12T13:42:38Z

I see, but it still didn't help in my case. I'm suspicious of my tree file, since I used the direct output from OrthoFinder without making it ultrametric. I suppose CAFE5 can handle that, right?

My tree file looks like this:

(kbiala:0.476202,((spiro:0.575389,(HIN:0.33093,trepo:0.374071)0.867084:0.201903)0.809226:0.198319,(muris:0.401597,wb:0.413952)0.97889:0.327973)1:0.476202);

Or should I fix the lambda value to 1 with the -l parameter?

hahnlab-user (Maintainer) · 2022-08-12T14:02:26Z

Your tree should be ultrametric, though the program will not throw an error if it's not.
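
Since the program stays silent about this, it can help to check the tree yourself before starting a run. Below is a minimal sketch in plain Python that sums branch lengths from the root to every tip of a Newick string and reports whether they agree. The function names and the relative tolerance of 1e-4 are assumptions chosen for illustration; this is not how CAFE5 itself inspects the tree.

```python
def tip_depths(newick):
    """Return {tip name: root-to-tip distance} for a Newick string."""
    text = "".join(newick.split())  # drop all whitespace

    def parse(i):
        # Parse one node starting at text[i]; return (tip distances measured
        # from this node's parent, index of the first unread character).
        tips = None
        if text[i] == "(":
            tips = {}
            i += 1
            while True:
                child, i = parse(i)
                tips.update(child)
                if text[i] == ",":
                    i += 1
                elif text[i] == ")":
                    i += 1
                    break
                else:
                    raise ValueError(f"unexpected character at position {i}")
        # Read the optional "label:length" that follows a tip or a ")".
        j = i
        while j < len(text) and text[j] not in "(),;":
            j += 1
        name, _, length = text[i:j].partition(":")
        length = float(length) if length else 0.0
        if tips is None:  # a tip, not an internal node
            tips = {name: 0.0}
        return {tip: d + length for tip, d in tips.items()}, j

    depths, _ = parse(0)
    return depths


def is_ultrametric(newick, rel_tol=1e-4):
    """True if all root-to-tip distances agree within rel_tol (an assumed tolerance)."""
    depths = tip_depths(newick).values()
    return max(depths) - min(depths) <= rel_tol * max(depths)


# The OrthoFinder tree posted earlier in this thread.
tree = "(kbiala:0.476202,((spiro:0.575389,(HIN:0.33093,trepo:0.374071)0.867084:0.201903)0.809226:0.198319,(muris:0.401597,wb:0.413952)0.97889:0.327973)1:0.476202);"
for tip, depth in sorted(tip_depths(tree).items()):
    print(f"{tip}\t{depth:.6f}")
print("ultrametric:", is_ultrametric(tree))
```

The tolerance only absorbs rounding in the printed branch lengths; a tree whose tips sit at clearly different depths, like the one above, fails the check.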

zeyak (Author) · 2022-08-13T07:57:20Z

Okay, I've made the tree ultrametric and used only 8 cores, but CAFE has still been running for 17 hours on 14,000 orthogroups.

Now my ultrametric tree looks like this:

(kbiala:1.2505,((spiro:0.625248,(HIN:0.312624,trepo:0.312624)1:0.312624)1:0.312624,(muris:0.468936,wb:0.468936)1:0.468936)1:0.312624);

benfulton (Maintainer) · 2022-08-13T22:36:04Z

Roughly how many lambdas has it generated, and how different are the lambdas near the end? If they aren't that different, there are parameters that would have it stop without trying to get the high precision that is the default.

Alternatively, if you're happy with the lambdas it's creating, you can simply generate a reconstruction by specifying the lambda in the command line.
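
One way to judge "how different the lambdas near the end are" is to compare the spread of the last few values with their mean. The sketch below assumes you have copied the lambda values from the log into a list by hand. The window size, the tolerance, and both example lists are made-up illustrations, not CAFE5 output and not CAFE5's own stopping rule.

```python
def has_settled(lambdas, window=5, rel_tol=1e-2):
    """True if the last `window` values vary by less than rel_tol relative to their mean."""
    if len(lambdas) < window:
        return False
    recent = lambdas[-window:]
    mean = sum(recent) / window
    if mean == 0:
        return False
    return (max(recent) - min(recent)) / abs(mean) < rel_tol


# Hypothetical sequences, for illustration only.
settling = [0.01, 0.004, 0.0021, 0.00201, 0.002005, 0.002004, 0.002003, 0.002004]
wandering = [0.01, 0.0005, 0.02, 0.003, 0.04, 0.0008]

print(has_settled(settling))
print(has_settled(wandering))
```

If the values have settled, passing one of the recent values with -l skips the search entirely; if they keep wandering, the data rather than the precision setting is the more likely problem.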

zeyak (Author) · 2022-08-14T09:34:57Z

So far it has generated around 45 lambda values and doesn't seem to be approaching any particular value. Below are the lambda values near the end:

Why do you think that is? One of my species has a highly duplicated genome, which might be the reason. The CAFE tutorial mentions that the data should be filtered with a script:

"Gene families that have large gene copy number variance can cause parameter estimates to be non-informative. You can remove gene families with large variance from your dataset, but we found that putting aside the gene families in which one or more species have ≥ 100 gene copies does the trick."

Is this filtering already built into CAFE5, or do I have to run the script beforehand?

Also, I'm not sure which lambda value I should choose. Isn't that supposed to be calculated by the software? Or can I just choose a lambda value by trial and error?

hahnlab-user (Maintainer) · 2022-08-14T17:02:55Z

Yes, the fact that you're getting "inf" likelihoods from the very beginning means that no likelihood can be calculated. You may have to further filter your data, before running CAFE.
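
The filtering rule quoted from the tutorial earlier in the thread (set aside families in which one or more species have ≥ 100 copies) could be sketched as follows. This is not the tutorial's script. It assumes a tab-delimited count table with a header line, a description column, a family ID column, and then one count column per species; the function name and the example rows are hypothetical.

```python
def split_by_max_count(lines, max_copies=100):
    """Split a count table into (kept, set_aside) lists of lines.

    Assumed layout: header line, then rows of
    description <TAB> family ID <TAB> one integer count per species.
    """
    header, *rows = [ln.rstrip("\n") for ln in lines if ln.strip()]
    kept, set_aside = [header], [header]
    for row in rows:
        counts = [int(c) for c in row.split("\t")[2:]]
        # A family is set aside if any species reaches the threshold.
        (set_aside if max(counts) >= max_copies else kept).append(row)
    return kept, set_aside


# Hypothetical rows, for illustration only.
table = [
    "Desc\tFamily ID\tHIN\tspiro\ttrepo",
    "(null)\tOG0000001\t3\t150\t2",
    "(null)\tOG0000002\t1\t2\t1",
]
kept, set_aside = split_by_max_count(table)
print("\n".join(kept))
```

Writing `kept` to a new file and passing that file with -i keeps the large families out of the lambda search while leaving them available in `set_aside` for a later look.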

iaunicorn · 2023-04-07T09:16:32Z

I have the same problem. My run has been going for a week and still hasn't finished.
My command is:

CAFE5/bin/cafe5 -i ./gene_families_filter.txt -t ./ultrametric_cafe.txt -l 0.0001 -c 10

Is there a problem with it?

9 replies

iaunicorn Apr 7, 2023

By the way, if I run my files with the cafe5 installed through Conda, I do get results, but not the file named Base_report.cafe.

benfulton Apr 7, 2023
Maintainer

Please mail them to me at befulton at iu.edu.

We haven't updated Conda yet. That should happen in the next few weeks.

iaunicorn Apr 7, 2023

Thank you so much. Please check your email.

benfulton Apr 7, 2023
Maintainer

From looking at your data, your tree appears to be very large - you should try splitting the tree into subtrees and running them each separately.

There is a bug in CAFE here, but it involves trying to calculate P-Values where the -lnL of your lambda is infinite. I don't think you can get any useful data from the reconstructions or P-Values in that situation. I'm going to have CAFE stop attempting to calculate those values.

iaunicorn Apr 13, 2023

Thanks a lot. I will try running it on subtrees.
