The process of molecular production is too slow. #20

Open
JackAILab opened this issue Feb 21, 2023 · 12 comments

@JackAILab

image

Molecule generation is too slow: it takes about 2-4 hours to generate a single molecule. As the figure shows, the diffusion process for one molecule runs for 5,000 time steps, and each step takes about 12-24 s, which means that generating one molecule takes 2-4 hours. Is there any way to speed this up, or is something wrong with my server? How long does it take you to generate molecules?

I look forward to your reply. Thank you!

@MinkaiXu
Owner

MinkaiXu commented Apr 5, 2023

Hi,
Sorry for the late reply. A single step taking 12-24 s is abnormal. I suspect the GPU is not being used correctly.
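A quick way to test the GPU hypothesis is to ask PyTorch which device it can see. This is only a minimal sketch: the helper name `describe_device` is made up for illustration, and it assumes the project runs on PyTorch with CUDA.

```python
import importlib.util


def describe_device():
    """Return a short string naming the device PyTorch would use."""
    # Hypothetical helper for illustration; not part of the GeoDiff code base.
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    if not torch.cuda.is_available():
        # Sampling would silently fall back to the CPU in this case.
        return "cpu (CUDA is not available to PyTorch)"
    return f"cuda:0 ({torch.cuda.get_device_name(0)})"


print(describe_device())
```

If this reports the CPU, the driver, the CUDA build of PyTorch, or the device setting in the run configuration would be the first things to check. For a model that is already loaded (here a hypothetical variable `model`), `next(model.parameters()).device` shows where its weights actually live.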

@JackAILab
Author

JackAILab commented Apr 5, 2023 via email

@JackAILab
Author

JackAILab commented Apr 5, 2023

Dear author, I am also having trouble getting the corresponding 3D structures from the generated SMILES files (.pkl files). The README seems to say that these files need to be processed with the sampling function in the training file, but I find this difficult to do. Is there a function or any advice for visualization that you can provide?

Thank you for your help.

WechatIMG1164

@MinkaiXu
Owner

MinkaiXu commented Apr 5, 2023

Hi,

For visualization, we simply use PyMOL, a program that renders molecular structures.
Sorry, it is a full software package, so we can't provide complete guidance on using it.
We suggest referring to the official documentation for full details (https://pymol.org/dokuwiki/).

Thanks!

@JackAILab
Author

But PyMOL seems unable to handle binary files such as .pkl. Or is there a loader function that I missed?

Just to confirm: I only need PyMOL to visualize the .pkl file generated from the test file as the corresponding 3D structure, right?

Thank you very much for your patient answer!

@MinkaiXu
Owner

MinkaiXu commented Apr 5, 2023

Oh, it is not exactly the same as the test file. In the generated test file the molecules are stored as tensors, so you need to reformat them (or a few of the interested ones) to rdkit.mol format.
They can still be saved in .pkl format, though, and PyMOL can read that directly (pickled ChemPy models with a ".pkl" extension can also be read directly; see https://pymolwiki.org/index.php/Load).

@JackAILab
Author

OK! Thank you very much for your detailed reply!
So the key question now is: how can I "reformat them (or a few of the interested ones) to rdkit.mol format"?

Do I need to use the sampling function in the model?

I followed the solution in this issue (DeepGraphLearning/ConfGF#1) to convert .pkl files to .sdf files, but I found that it requires the sampling functions. It does not seem easy to convert the .pkl file directly to rdkit.mol format and then open it in PyMOL.

@MinkaiXu
Owner

MinkaiXu commented Apr 5, 2023

If you just want to visualize data in the test set, this should be unnecessary.
I suggest loading the .pkl and taking a look; I remember there is an rdmol for each molecule. You can just take it.
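As a rough sketch of what "load the .pkl and take a look" could mean in practice, the snippet below unpickles a file and lists the field names of its first few entries. The helper names `field_names` and `summarize_pickle` are made up for illustration. Unpickling the real files also requires the packages that define the stored objects (presumably torch_geometric and rdkit) to be installed.

```python
import pickle


def field_names(item):
    """List the field names of a dict-like or attribute-based object."""
    keys = getattr(item, "keys", None)
    if keys is not None:
        # `keys` may be a method (dict) or a plain attribute, depending on the type.
        return sorted(str(k) for k in (keys() if callable(keys) else keys))
    return sorted(vars(item)) if hasattr(item, "__dict__") else []


def summarize_pickle(path, max_items=3):
    """Load a pickle and describe its top-level structure."""
    # Only unpickle files you trust: pickle can execute code while loading.
    with open(path, "rb") as f:
        obj = pickle.load(f)
    summary = {"type": type(obj).__name__, "length": None, "fields": []}
    if isinstance(obj, (list, tuple)):
        summary["length"] = len(obj)
        summary["fields"] = [field_names(item) for item in obj[:max_items]]
    else:
        summary["fields"] = [field_names(obj)]
    return summary
```

Calling `summarize_pickle` on the test-set file and on the generated file should show whether each entry carries an `rdmol` field and under which name the coordinates are stored.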

@JackAILab
Author

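If you just want to do visualization (of data in the test set), this should be necessary. I suggest you load the .pkl and take a look; I remember there is an rdmol for each molecule. You can just take it.

I want to visualize the molecules generated by GeoDiff. Of course, I also need to visualize the molecules in the test set.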

@MinkaiXu
Owner

MinkaiXu commented Apr 5, 2023

Sorry, there is a typo in my previous message (necessary->unnecessary)...

Just load both and take a look; there should be an rdmol in the pyg_data in both cases.
For GeoDiff-generated data, remember to set the positions to the generated positions.
https://github.com/MinkaiXu/GeoDiff/blob/main/utils/chem.py#L48-L56

You can take a closer look at all the .pkl files. You can also look at the sampling code to better understand what we saved for the generated mols. Sorry, it's hard for me to remember all the engineering details...
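A minimal sketch of that step with RDKit might look like the following. It is not the repository's own helper: the function names are made up, and the field name `pos_gen` in the commented usage is an assumption that should be checked against the actual .pkl contents. The sketch copies the rdmol, gives it a single conformer built from the generated coordinates, and writes an SDF file, a format PyMOL can open.

```python
def with_generated_positions(rdmol, positions):
    """Return a copy of `rdmol` whose only conformer uses `positions`.

    `positions` is any sequence of N (x, y, z) triples, one per atom
    (nested lists, a NumPy array, or a tensor all work).
    """
    # Imported lazily so this file can be imported without RDKit installed.
    from rdkit import Chem
    from rdkit.Geometry import Point3D

    coords = [[float(v) for v in row] for row in positions]
    if len(coords) != rdmol.GetNumAtoms():
        raise ValueError(
            f"expected {rdmol.GetNumAtoms()} positions, got {len(coords)}"
        )

    mol = Chem.Mol(rdmol)  # copy, so the input molecule is left untouched
    conf = Chem.Conformer(mol.GetNumAtoms())
    for i, (x, y, z) in enumerate(coords):
        conf.SetAtomPosition(i, Point3D(x, y, z))
    mol.RemoveAllConformers()
    mol.AddConformer(conf, assignId=True)
    return mol


def write_sdf(mol, path):
    """Write one molecule to an SDF file."""
    from rdkit import Chem

    writer = Chem.SDWriter(path)
    writer.write(mol)
    writer.close()


# Hypothetical usage, assuming `data` is one entry of the generated .pkl
# and that `pos_gen` is the name of its generated-coordinate field:
# n = data.rdmol.GetNumAtoms()
# mol = with_generated_positions(data.rdmol, data.pos_gen[:n])
# write_sdf(mol, "generated_0.sdf")
```

If the file stores several generated samples for one molecule in a single array, slice out the rows for one sample before calling the helper, as the commented usage does for the first sample.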

@JackAILab
Author

Thanks again for your patient reply!

I will try this and report back with my solution in due course.

@den-run-ai

@MinkaiXu how much time should each step take?

@JackAILab how did you resolve the slow computation?
