[Proposal] Add a high-level declarative syntax for diagrams #20

Open
anko9801 wants to merge 30 commits into Typsium:master from anko9801:master

@anko9801 commented Sep 8, 2025

This pull request introduces a proposal for a new molecule module, intended to offer a more declarative and concise way to write chemical diagrams.

The goal is to provide a syntax that is more intuitive and concise than established tools such as ChemFig, so that writing a structure the natural way yields an IUPAC-preferred diagram.

This module is designed to integrate with the core Alchemist functions, enabling a hybrid workflow. Core Alchemist offers maximum flexibility through direct cetz integration; the new module trades some of that flexibility for conciseness in the most common use cases.

Proposed Features

  • Concise Molecule Generation: A new #molecule command is introduced that can take a simple string (e.g., #molecule("CH3-CH2-OH")) and parse it into a diagram.
  • Automatic IUPAC-Preferred Orientation: The parser is designed to automatically orient chemical structures according to IUPAC recommendations to produce aesthetically pleasing and standardized diagrams.
  • Labeling: A labeling system (:label for points, ::label for lines, and a label: "..." argument) is included to allow for the creation of more complex structures with non-sequential bonds or mechanism arrows.

Example Usage

#skeletize(molecule("NH2-CH(-CH3)-C(=O)-OH"))

#skeletize(molecule((
  "CH2:a-CH2-CH2",
  "CH2-CH2-CH2:b",
  ":a=:b"
)))

#skeletize({
  molecule(
    "E-C(=O(lewis: (dots(0), dots(180))))-O:to(lewis: (dots(-45), dots(-135)))-::from-H:H + B:base <=> ",
    "[R-C(=O(lewis: (dots(0), dots(180))))-O(lewis: (dots(0), dots(-90), dots(90))) <-> ",
    "R-C(-O(lewis: (dots(0), dots(90), dots(180))))-O(lewis: (dots(0), dots(-90), dots(90))) <->",
    "R-C(-O(lewis: (dots(0), dots(90), dots(180))))-O(lewis: (dots(-135), dots(45)))]",
    "+ BH",
    ":base(lewis: (dots(180)))",
  )

  arrow("->", from: "from", to: "to.north", style: (paint: red))
  arrow("->", from: "base.west", to: "H.east", style: (paint: red))
})
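
The labelling used in the second example can be pictured as a two-pass process: first parse each chain and remember which atom every `:label` names, then resolve label-only fragments such as `:a=:b` into extra bonds. The Python sketch below only illustrates that idea and is not the parser from this PR. It assumes linear chains, the three bond symbols `-`, `=`, `#`, no branches, rings, or options, and it treats each string as an independent chain (how the module joins separate strings is not stated here). The names `parse_chain` and `build_graph` are hypothetical.

```python
import re

BONDS = "-=#"  # subset of the bond symbols listed in the PR (assumption)

def parse_chain(text):
    """Split e.g. 'CH2:a-CH2' into [(atom, label_or_None), ...] and bond symbols."""
    atoms, bonds = [], []
    for token in re.split(r"([-=#])", text):
        if token != "" and token in BONDS:
            bonds.append(token)
        else:
            # 'CH2:a' -> name 'CH2', label 'a'; ':a' -> name '', label 'a'
            name, _, label = token.partition(":")
            atoms.append((name, label or None))
    return atoms, bonds

def build_graph(fragments):
    nodes = []    # atom names, indexed by node id
    edges = []    # (from_id, to_id, bond_symbol)
    labels = {}   # label -> node id
    pending = []  # label-only fragments, resolved in the second pass

    # Pass 1: create nodes and chain bonds, record labels.
    for fragment in fragments:
        atoms, bonds = parse_chain(fragment)
        if all(name == "" for name, _ in atoms):
            pending.append((atoms, bonds))
            continue
        ids = []
        for name, label in atoms:
            nodes.append(name)
            ids.append(len(nodes) - 1)
            if label is not None:
                labels[label] = ids[-1]
        edges += list(zip(ids, ids[1:], bonds))

    # Pass 2: connect already-labelled atoms (KeyError on an unknown label).
    for atoms, bonds in pending:
        ids = [labels[label] for _, label in atoms]
        edges += list(zip(ids, ids[1:], bonds))
    return nodes, edges

nodes, edges = build_graph(["CH2:a-CH2-CH2", "CH2-CH2-CH2:b", ":a=:b"])
print(nodes)
print(edges)
```

Deferring the label-only fragments to a second pass means a label may be referenced before the string that defines it, at the cost of keeping the unresolved fragments in memory until the end.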

What's implemented

  • Transform system (input -> Node Graph -> Alchemist structure)
    • parser combinator
    • IUPAC-compliant angle calculation
    • Connecting points
    • Resolving labels
  • Atoms (CH3 -> $C$ $H_3$)
    • Charge (NH3+, COO-)
    • Isotope (^13C, ^2H)
  • Bonds (- = # > < :> <: |> <|)
  • Rings and substituents (@6(-=-=-(-CH3)=)-CH3)
  • Option (O(lewis: (dots(0), dots(180))) -(angle: 60deg))
  • Labels (O:O1 =::bond)
  • Errors and validation
  • Edge case tests
  • Optimize
  • Manual entry
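
The atom handling in the list above (CH3 becoming $C$ $H_3$, plus charges and isotopes) amounts to splitting an atom group into element symbols with counts, an optional isotope prefix, and an optional trailing charge. A rough Python sketch with regular expressions follows. It assumes only the notations shown in the list (`^13C`, `NH3+`, `COO-`); `parse_atom`, `render`, and the dictionary layout are illustrative choices, not the PR's implementation, and `render` deliberately ignores isotope and charge.

```python
import re

# Optional '^<digits>' isotope, one or more element symbols with counts,
# then an optional single '+' or '-' charge.
ATOM = re.compile(r"^(?:\^(\d+))?((?:[A-Z][a-z]?\d*)+)([+-]?)$")
PART = re.compile(r"([A-Z][a-z]?)(\d*)")

def parse_atom(text):
    m = ATOM.match(text)
    if m is None:
        raise ValueError(f"not an atom group: {text!r}")
    isotope, body, charge = m.groups()
    parts = [(sym, int(n) if n else 1) for sym, n in PART.findall(body)]
    return {
        "isotope": int(isotope) if isotope else None,
        "parts": parts,
        "charge": charge or None,
    }

def render(parts):
    """Format element/count pairs in the '$C$ $H_3$' style used in the list."""
    return " ".join(f"${sym}$" if n == 1 else f"${sym}_{n}$" for sym, n in parts)

print(render(parse_atom("CH3")["parts"]))
print(parse_atom("NH3+"))
print(parse_atom("^13C"))
```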

References

  1. Brecher, J. Graphical Representation Standards for Chemical Structure Diagrams (IUPAC Recommendations 2008). Pure and Applied Chemistry 2008, 80 (2), 277–410. DOI: 10.1351/pac200880020277

This is an initial proposal, and I would be very grateful for any feedback, suggestions, or critiques to help improve it. Thank you for your consideration.

Collaborator

I have a question: what is the maximum length of a molecule before the maximum recursion depth is exceeded? I haven't tried it yet; I'm just wondering whether this is enough for examples more complicated than the ones shown in the PR.

Author

@anko9801 commented Sep 13, 2025

That's a great point. I've optimized the parser to reduce its stack usage from a depth of 13 to 5 per nesting level, so that it stays under Typst's hardcoded recursion limit of 80. This allows for up to 11 nesting levels. I stopped there because any further optimization would come at the cost of code readability and extensibility.
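
As a back-of-the-envelope check of these numbers: 5 per level times 11 levels is 55, which leaves 25 of the 80 unaccounted for. One possible reading, which is an assumption and not something stated in this thread, is that a fixed overhead is consumed outside the nested parsing. The Python sketch below computes which overhead values would be consistent with the figures quoted above; `max_nesting` and `overhead` are hypothetical names.

```python
def max_nesting(limit, per_level, overhead):
    """Nesting levels that fit under `limit`, given a fixed `overhead` (assumed model)."""
    return (limit - overhead) // per_level

LIMIT = 80  # recursion limit quoted in the comment above

# Overheads for which 5 per level gives exactly 11 levels.
consistent = [b for b in range(LIMIT) if max_nesting(LIMIT, 5, b) == 11]
print(consistent)

# Under the same assumed overheads, the pre-optimization cost of 13 per level.
print({b: max_nesting(LIMIT, 13, b) for b in consistent})
```

Under this model, the earlier cost of 13 per level would have allowed only about 4 nesting levels, which gives a sense of what the optimization bought.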

@Robotechnic
Collaborator

I'll take time to look at it more in depth. It seems like awesome work. Thanks!
Before merging, it would be a good idea to add tests and manual entries.

Also, maybe skeletize should be able to accept a single string directly: instead of writing #skeletize(molecule(...)), we could have something like #skeletize(...). I have to think about it.

Collaborator

The tests are made with tytanic. It generates the reference images required for the tests.
