Dump the glypy.Glycan to linear code #20
Comments
This looks like a breakdown in how Note that LinearCode-derived structures are necessarily not canonicalized w.r.t. the same structure as parsed from GlycoCT or WURCS, and if you intend to mix the two formats, you should explicitly call

[1] Banin, E., Neuberger, Y., Altshuler, Y., Halevi, A., Inbar, O., Nir, D., & Dukler, A. (2002). A Novel Linear Code Nomenclature for Complex Carbohydrates. Trends in Glycoscience and Glycotechnology, 14(77), 127–137. https://doi.org/10.4052/tigg.14.127
Thanks! I totally agree that LinearCode-derived structures might not be canonicalized and that we should be careful when using them. Since my data analysis only deals with glycans built from common monosaccharides, the extreme case doesn't bother me. The reason I use LinearCode is that, in my case, if two glycans are the same, their linear codes are the same. A plain str1 == str2 comparison is then a fast way to check for identical structures within a set of glycans. Do you have a faster way to check whether two glycans have the same structure? Thanks!
The same uniqueness applies to any comparison of canonicalized structures and formats. The I'm not sure I'll have time to fix the LinearCode serialization issue this week.
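To illustrate the string-comparison idea discussed here, below is a minimal, library-free sketch. It does not use glypy: the nested-tuple tree, canonical_key and group_identical are all hypothetical names invented for this example. The point is only that once children are emitted in a fixed order, two trees built in different child order serialize to the same string, so equality checks and grouping reduce to plain string operations.

```python
from collections import defaultdict

# A toy glycan is a (label, children) tuple; the label carries residue and linkage.
# This is an illustrative stand-in, not glypy's data model.

def canonical_key(node):
    """Serialize a toy tree with children in sorted order, so that
    the order in which branches were attached does not matter."""
    label, children = node
    child_keys = sorted(canonical_key(child) for child in children)
    return label + "[" + ",".join(child_keys) + "]"

def group_identical(named_trees):
    """Group names whose trees share the same canonical string."""
    groups = defaultdict(list)
    for name, tree in named_trees.items():
        groups[canonical_key(tree)].append(name)
    return list(groups.values())

# Same structure, branches listed in opposite order.
a = ("Mb4", [("Ma3", []), ("Ma6", [])])
b = ("Mb4", [("Ma6", []), ("Ma3", [])])
# A different structure with one branch missing.
c = ("Mb4", [("Ma3", [])])

groups = group_identical({"a": a, "b": b, "c": c})
```

Here a and b end up in the same group and c in its own. The same pattern should carry over to any serialization that is guaranteed to be canonical.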
Do you mean when we get the GlycoCT from the It's totally okay. There is no push to fix the LinearCode serialization. It all depends on your schedule.
Yes, The

[1] Herget, S., Ranzinger, R., Maass, K., & Lieth, C.-W. V. D. (2008). GlycoCT-a unifying sequence format for carbohydrates. Carbohydrate Research, 343(12), 2162–2171. https://doi.org/10.1016/j.carres.2008.03.011
Hey Joshua,
I am afraid that the linearcode.dumps() function may be inconsistent with the Linear Code rules.
For example, if we have a glycan as below:
We should get 'Ma3(Ma3(Ma6)Ma6)Mb4GNb4GN?'
But dumps() returns 'Ma6(Ma3)Ma6(Ma3)Mb4GNb4GN?'.

I believe it only needs a slight modification. When the code traverses the glycan, it needs to sort glycan.index in descending order so that the monosaccharide with the highest linkage index goes first.
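The ordering I have in mind can be sketched with a small stand-alone toy serializer. This is not glypy's implementation and not the full Linear Code precedence rules: the Node class, its fields and to_linear are hypothetical, and the sketch assumes that branches are ordered by linkage position only. At each branch point, the child with the lowest linkage position continues the unbracketed chain, and the others are wrapped in parentheses, with the highest position ending up closest to the parent.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    code: str                       # residue code plus anomer, e.g. "Ma", "Mb", "GNb"
    position: Optional[int] = None  # linkage position on the parent; None at the reducing end
    children: List["Node"] = field(default_factory=list)

def to_linear(node: Node) -> str:
    """Serialize a toy tree, ordering branches by linkage position."""
    # Lowest position first: it becomes the leftmost, unbracketed chain.
    kids = sorted(node.children, key=lambda child: child.position)
    parts = []
    for i, kid in enumerate(kids):
        text = to_linear(kid)
        parts.append(text if i == 0 else "(" + text + ")")
    # Unknown linkage (the reducing end here) is written as "?".
    pos = "?" if node.position is None else str(node.position)
    return "".join(parts) + node.code + pos

# The structure from this issue: a core mannose with an a3 arm and
# an a6 arm, where the a6 arm itself carries a3 and a6 mannoses.
arm6 = Node("Ma", 6, [Node("Ma", 6), Node("Ma", 3)])
core = Node("Mb", 4, [arm6, Node("Ma", 3)])
glycan = Node("GN", None, [Node("GNb", 4, [core])])

result = to_linear(glycan)
```

With this ordering, result is the string I expected, 'Ma3(Ma3(Ma6)Ma6)Mb4GNb4GN?', regardless of the order in which the children were attached.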