add documentation
klausweinbauer committed Aug 22, 2024
1 parent e28d185 commit ee60c80
Showing 12 changed files with 292 additions and 29 deletions.
30 changes: 30 additions & 0 deletions doc/figure_scripts/multiple_anchor_example.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
import matplotlib.pyplot as plt
from fgutils.proxy import MolProxy, ProxyGroup, ProxyGraph, Parser
from fgutils.vis import plot_graph

pattern = "C{g}(C)C{g}(C)(C)C"
g_2 = ProxyGroup("g", ProxyGraph("c1ccccc1", anchor=[1, 3]))
g_3 = ProxyGroup("g", ProxyGraph("c1ccccc1", anchor=[1, 3, 4]))
g_4 = ProxyGroup("g", ProxyGraph("c1ccccc1", anchor=[1, 3, 4, 5]))

parser = Parser()
proxy1 = MolProxy(pattern, g_2)
proxy2 = MolProxy(pattern, g_3)
proxy3 = MolProxy(pattern, g_4)

parent_graph = parser(pattern)
mol1 = next(proxy1)
mol2 = next(proxy2)
mol3 = next(proxy3)

fig, ax = plt.subplots(2, 2, dpi=200, figsize=(16, 9))
plot_graph(
    parent_graph, ax[0, 0], show_labels=True, show_edge_labels=False, title="parent"
)
plot_graph(mol1, ax[0, 1], show_edge_labels=False, title="2 anchor nodes")
plot_graph(mol2, ax[1, 0], show_edge_labels=False, title="3 anchor nodes")
plot_graph(mol3, ax[1, 1], show_edge_labels=False, title="4 anchor nodes")
plt.savefig(
    "doc/figures/multiple_anchor_example.png", bbox_inches="tight", transparent=True
)
plt.show()
5 changes: 5 additions & 0 deletions doc/figure_scripts/print_fg_tree.py
@@ -0,0 +1,5 @@
from fgutils.fgconfig import print_tree, FGConfigProvider

provider = FGConfigProvider()
tree = provider.get_tree()
# print_tree writes the tree to stdout and returns None, so nothing is assigned.
print_tree(tree)
Binary file added doc/figures/multiple_anchor_example.png
57 changes: 57 additions & 0 deletions doc/functional_groups.rst
@@ -0,0 +1,57 @@
=================
Functional Groups
=================

FGUtils provides the class :py:class:`~fgutils.query.FGQuery` to query a
molecule's functional groups.

Functional group tree
=====================

.. code-block::

    Functional Group                    Parents                  Pattern
    --------------------------------------------------------------------------
    ether                               [ROOT]                   ROR
    ├── ketal                           [ether]                  RC(OR)(OR)R
    │   ├── acetal                      [ketal]                  RC(OC)(OC)H
    │   └── hemiketal                   [ketal, alcohol]         RC(OH)(OR)R
    │       └── hemiacetal              [hemiketal]              RC(OC)(OH)H
    ├── ester                           [ketone, ether]          RC(=O)OR
    │   ├── anhydride                   [ester]                  RC(=O)OC(=O)R
    │   ├── peroxy_acid                 [ester, peroxide]        RC(=O)OOH
    │   ├── carbamate                   [ester, amide]           ROC(=O)N(R)R
    │   └── carboxylic_acid             [ester, alcohol]         RC(=O)OH
    ├── alcohol                         [ether]                  COH
    │   ├── hemiketal                   [ketal, alcohol]         RC(OH)(OR)R
    │   │   └── hemiacetal              [hemiketal]              RC(OC)(OH)H
    │   ├── carboxylic_acid             [ester, alcohol]         RC(=O)OH
    │   ├── enol                        [alcohol]                C=COH
    │   ├── primary_alcohol             [alcohol]                CCOH
    │   │   └── secondary_alcohol       [primary_alcohol]        C(C)(C)OH
    │   │       └── tertiary_alcohol    [secondary_alcohol]      C(C)(C)(C)OH
    │   └── phenol                      [alcohol]                C:COH
    └── peroxide                        [ether]                  ROOR
        └── peroxy_acid                 [ester, peroxide]        RC(=O)OOH
    thioether                           [ROOT]                   RSR
    └── thioester                       [ketone, thioether]      RC(=O)SR
    amine                               [ROOT]                   RN(R)R
    ├── amide                           [ketone, amine]          RC(=O)N(R)R
    │   └── carbamate                   [ester, amide]           ROC(=O)N(R)R
    └── anilin                          [amine]                  C:CN(R)R
    carbonyl                            [ROOT]                   C(=O)
    ├── ketene                          [carbonyl]               RC(R)=C=O
    └── ketone                          [carbonyl]               RC(=O)R
        ├── amide                       [ketone, amine]          RC(=O)N(R)R
        │   └── carbamate               [ester, amide]           ROC(=O)N(R)R
        ├── thioester                   [ketone, thioether]      RC(=O)SR
        ├── ester                       [ketone, ether]          RC(=O)OR
        │   ├── anhydride               [ester]                  RC(=O)OC(=O)R
        │   ├── peroxy_acid             [ester, peroxide]        RC(=O)OOH
        │   ├── carbamate               [ester, amide]           ROC(=O)N(R)R
        │   └── carboxylic_acid         [ester, alcohol]         RC(=O)OH
        ├── acyl_chloride               [ketone]                 RC(=O)Cl
        └── aldehyde                    [ketone]                 RC(=O)H
    nitrose                             [ROOT]                   RN=O
    └── nitro                           [nitrose]                RN(=O)O
    nitrile                             [ROOT]                   RC#N
57 changes: 51 additions & 6 deletions doc/pattern_syntax.rst → doc/graph_syntax.rst
@@ -1,6 +1,6 @@
==============
Pattern Syntax
==============
============
Graph Syntax
============

FGUtils has its own graph description language. The syntax is closely related
to the SMILES format for molecules and reactions. It is kind of an extension
@@ -26,9 +26,11 @@ obtained as follows::
Besides parsing common SMILES it is possible to generate molecule-like graphs
with more abstract nodes, i.e., arbitrary node labels. Arbitrary node labels
are surrounded by ``{}`` (e.g. ``{label}``). This abstract labeling can be used
to substitute nodes with specific patterns. This can be done by using a
:py:class:`~fgutils.proxy.Proxy`. Propyl acetate can be created by replacing
the labeled node with the propyl group::
to substitute nodes with specific patterns. In this context, the labels are
group names of :py:class:`~fgutils.proxy.ProxyGroup` objects. A ProxyGroup
defines a set of subgraphs that can be substituted for the labeled node. The
substitution is performed by a :py:class:`~fgutils.proxy.Proxy`. Propyl acetate
can be created by replacing the labeled node with the propyl group::

    import matplotlib.pyplot as plt
    from fgutils import Parser
@@ -57,6 +59,49 @@ the labeled node with the propyl group::
A node can have more than one label. This can be done by separating the
labels with a comma, e.g.: ``{label_1,label_2}``.
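To make the labeling syntax concrete, the following sketch extracts the labels of every ``{}`` node from a pattern string with a regular expression. It is not FGUtils's parser: ``node_labels`` is a hypothetical helper, the pattern strings are illustrative, and the sketch assumes that labels contain no braces and are separated by commas.

```python
import re

# Hypothetical helper, not part of FGUtils: matches one "{...}" node and
# captures the text between the braces.
LABEL_NODE = re.compile(r"\{([^{}]*)\}")


def node_labels(pattern: str) -> list[list[str]]:
    """Return one list of labels per labeled node, in order of appearance."""
    return [
        [label.strip() for label in match.group(1).split(",")]
        for match in LABEL_NODE.finditer(pattern)
    ]


# A node with two labels followed by a node with a single label.
print(node_labels("C{label_1,label_2}C{g}"))
# A plain SMILES string has no labeled nodes.
print(node_labels("CCO"))
```

Each inner list corresponds to one labeled node, so a node with several labels yields several candidate group names.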

In the example above the ProxyGroup has only one subgraph pattern. In general,
a ProxyGroup is a collection of several possible subgraphs from which one is
selected when a new sample is instantiated (currently only random selection is
implemented). By default a pattern has one anchor at index 0. If you need more
control over how a subgraph is inserted into a parent graph you can instantiate
the :py:class:`~fgutils.proxy.ProxyGraph` class. For a ProxyGraph you can
provide a list of anchor node indices. The insertion of the subgraph into the
parent depends on the number of anchor nodes in the subgraph and the number of
edges to the labeled node in the parent. The first edge in the parent connects
to the first anchor node in the subgraph and so forth. The following example
demonstrates the insertion with multiple anchor nodes::

    import matplotlib.pyplot as plt
    from fgutils.proxy import MolProxy, ProxyGroup, ProxyGraph, Parser
    from fgutils.vis import plot_graph

    pattern = "C{g}(C)C{g}(C)(C)C"
    g_2 = ProxyGroup("g", ProxyGraph("c1ccccc1", anchor=[1, 3]))
    g_3 = ProxyGroup("g", ProxyGraph("c1ccccc1", anchor=[1, 3, 4]))
    g_4 = ProxyGroup("g", ProxyGraph("c1ccccc1", anchor=[1, 3, 4, 5]))

    parser = Parser()
    proxy1 = MolProxy(pattern, g_2)
    proxy2 = MolProxy(pattern, g_3)
    proxy3 = MolProxy(pattern, g_4)

    parent_graph = parser(pattern)
    mol1 = next(proxy1)
    mol2 = next(proxy2)
    mol3 = next(proxy3)

    fig, ax = plt.subplots(2, 2, dpi=200, figsize=(16, 9))
    plot_graph(
        parent_graph, ax[0, 0], show_labels=True, show_edge_labels=False, title="parent"
    )
    plot_graph(mol1, ax[0, 1], show_edge_labels=False, title="2 anchor nodes")
    plot_graph(mol2, ax[1, 0], show_edge_labels=False, title="3 anchor nodes")
    plot_graph(mol3, ax[1, 1], show_edge_labels=False, title="4 anchor nodes")
    plt.show()

.. image:: figures/multiple_anchor_example.png
   :width: 1000
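The anchor rule described above (the first parent edge connects to the first anchor, and so forth) can be illustrated on plain edge lists. This is a minimal sketch and not the FGUtils implementation: ``insert_subgraph`` is a hypothetical function, graphs are reduced to lists of integer node pairs, and raising an error when the labeled node has more edges than anchors is an assumption made only to keep the sketch well defined.

```python
def insert_subgraph(parent_edges, node, sub_edges, anchors, offset):
    """Replace ``node`` in the parent by a subgraph (illustrative only).

    Subgraph node ``i`` is renumbered to ``offset + i``. The i-th parent edge
    that touches ``node`` is re-attached to the i-th anchor of the subgraph.
    """
    incident = [e for e in parent_edges if node in e]
    if len(incident) > len(anchors):
        # Assumption of this sketch; the behavior of FGUtils may differ.
        raise ValueError("more edges at the labeled node than anchors")
    # Keep all parent edges that do not involve the replaced node.
    result = [e for e in parent_edges if node not in e]
    for (u, v), anchor in zip(incident, anchors):
        neighbor = v if u == node else u
        result.append((neighbor, offset + anchor))
    # Add the renumbered subgraph edges.
    result.extend((offset + u, offset + v) for u, v in sub_edges)
    return result


# Parent chain 0 - 1 - 2, where node 1 plays the role of the labeled node.
parent = [(0, 1), (1, 2)]
# Six-membered ring 0-1-2-3-4-5-0 as the subgraph.
ring = [(i, (i + 1) % 6) for i in range(6)]

edges = insert_subgraph(parent, node=1, sub_edges=ring, anchors=[1, 3], offset=3)
print(edges)
```

With ``anchors=[1, 3]`` the former neighbors 0 and 2 end up attached to ring positions 1 and 3, which are nodes 4 and 6 after renumbering.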

Another extension to the SMILES notation is the encoding of bond changes. This
feature is required to model reaction mechanisms as ITS graph. Changing bonds
are surrounded by ``<>`` (e.g. ``<1, 2>`` for the formation of a double bond
3 changes: 2 additions & 1 deletion doc/index.rst
@@ -9,7 +9,8 @@ Welcome to FGUtils's documentation!
.. toctree::
   :maxdepth: 1

   pattern_syntax
   functional_groups
   graph_syntax
   references


6 changes: 6 additions & 0 deletions doc/references.rst
@@ -2,6 +2,12 @@
References
==========

query
=====

.. automodule:: fgutils.query
   :members:

parse
=====

69 changes: 52 additions & 17 deletions fgutils/fgconfig.py
@@ -180,26 +180,61 @@ def search_parents(
return None if len(parents) == 0 else list(parents)


def print_tree(roots: list[FGTreeNode]):
def _print(node: FGTreeNode, indent=0):
print(
"{}{:<{width}}{:<40} {}".format(
indent * " ",
node.fgconfig.name,
"[Parents: {}]".format(
", ".join([p.fgconfig.name for p in node.parents])
if len(node.parents) > 0
else "ROOT"
),
node.fgconfig.pattern_str,
width=30 - indent,
)
def tree2str(roots: list[FGTreeNode]):
sym = {"branch": "├── ", "skip": "│   ", "end": "└── ", "empty": "    "}

def _print(node: FGTreeNode, indent, is_last=False, width=(35, 30)):
result = "{}{}{:<{width1}}{:<{width2}}{}\n".format(
indent,
sym["end"] if is_last else sym["branch"],
node.fgconfig.name,
"[{}]".format(
", ".join([p.fgconfig.name for p in node.parents])
if len(node.parents) > 0
else "ROOT"
),
node.fgconfig.pattern_str,
width1=width[0] - len(indent) - len(sym["branch"]),
width2=width[1],
)
for child in node.children:
_print(child, indent + 2)

for i, child in enumerate(node.children):
_is_last = i == len(node.children) - 1
if is_last:
_indent = indent + sym["empty"]
else:
_indent = indent + "{}".format(sym["skip"])
result += _print(child, _indent, _is_last, width=width)
return result

in_const = len(sym["skip"])
width = (40, 25)
tree_str = "{}{:<{width1}}{:<{width2}}{:<}\n".format(
" " * in_const,
"Functional Group",
"Parents",
"Pattern",
width1=width[0] - in_const,
width2=width[1],
)
for root in roots:
_print(root)
tree_str += _print(root, "", width=width)

result = []
max_l = 0
for line in tree_str.split("\n"):
_line = line[4:]
if len(_line) > max_l:
max_l = len(_line)
result.append(_line)

result.insert(1, "{}".format("-" * max_l))

return "\n".join(result)


def print_tree(roots: list[FGTreeNode]):
print(tree2str(roots))
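The recursive rendering idea behind ``tree2str`` can be shown in isolation. The following is a simplified sketch under stated assumptions, not the code from this commit: ``Node`` is a hypothetical stand-in for ``FGTreeNode`` that only carries a name and children, and the parent and pattern columns are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a tree node; not FGUtils's FGTreeNode."""

    name: str
    children: list["Node"] = field(default_factory=list)


def render(node: Node, indent: str = "", is_last: bool = True) -> list[str]:
    """Return one line per node, drawing the hierarchy with box characters."""
    lines = [indent + ("└── " if is_last else "├── ") + node.name]
    # Below a last child nothing continues, so pad with spaces instead of "│".
    child_indent = indent + ("    " if is_last else "│   ")
    for i, child in enumerate(node.children):
        lines += render(child, child_indent, i == len(node.children) - 1)
    return lines


tree = Node("ether", [Node("ketal", [Node("acetal")]), Node("peroxide")])
print("\n".join(render(tree)))
```

Returning a list of lines instead of printing directly keeps the renderer easy to test and leaves column alignment or header insertion to a later step, which is also why separating string construction from printing is convenient.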


def build_config_tree_from_list(
5 changes: 3 additions & 2 deletions fgutils/parse.py
@@ -36,8 +36,9 @@ class Parser:
Example for parsing acetic acid::
parser = Parser()
g = parser("CC(O)=O") # Returns graph with 4 nodes and 3 edges
>>> parser = Parser()
>>> g = parser("CC(O)=O")
Graph with 4 nodes and 3 edges
:param use_multigraph:
44 changes: 43 additions & 1 deletion fgutils/proxy.py
@@ -39,11 +39,18 @@ def __str__(self):

class ProxyGroup:
"""
A ProxyGroup is a collection of patterns that can be substituted for a
labeled node in a graph. A node labeled with the group name is replaced by
one of the patterns in the group.
:param name: The name of the group.
:param graphs: (optional) A list of subgraphs in this group.
:param pattern:
(optional) A list of graph descriptions. The patterns are converted to
ProxyGraphs with one anchor at index 0. If you need more control over
how the subgraphs are inserted use the ``graphs`` argument.
"""

def __init__(
Expand Down Expand Up @@ -121,6 +128,21 @@ def _has_group_nodes(g: nx.Graph, groups: dict[str, ProxyGroup]) -> bool:
def insert_groups(
core: nx.Graph, groups: dict[str, ProxyGroup], parser: Parser
) -> nx.Graph:
"""
Replace labeled nodes in the core graph with groups. For each labeled node
one label is chosen at random and replaced by the identically named group.
This function does not resolve labels recursively. If a group itself
contains labeled nodes, they remain part of the resulting graph.
:param core: The parent graph with labeled nodes.
:param groups: A dictionary that maps group names to the groups that
replace the labeled nodes in the parent.
:param parser: The parser that is used to convert subgraph patterns into
graphs.
:returns: Returns the core graph with replaced nodes.
"""
_core = core.copy()
idx_offset = len(core.nodes)
for anchor, d in _core.nodes(data=True):
@@ -141,6 +163,20 @@ def insert_groups(
def build_graph(
pattern: str, parser: Parser, groups: dict[str, ProxyGroup] = {}
) -> nx.Graph:
"""
Construct a graph from a pattern and replace all labeled nodes by the
structure defined in the given groups. This function resolves recursive
labeling. The resulting graph has no labeled nodes as long as a group is
given for each label.
:param pattern: The graph description for the parent graph.
:param parser: The parser to use to convert patterns into structures.
:param groups: A dictionary that maps group names to the groups that
replace the labeled nodes in the parent.
:returns: Returns the resulting graph with replaced nodes.
"""
core = parser.parse(pattern)
while _has_group_nodes(core, groups):
core = insert_groups(core, groups, parser)
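The difference between a single substitution pass (as documented for ``insert_groups``) and repeating the pass until no replaceable label remains (as ``build_graph`` does with its ``while`` loop) can be sketched on plain strings. This is an illustration only: ``substitute_once`` and ``resolve`` are hypothetical names, the groups are simple lists of strings, and string splicing stands in for the graph insertion that FGUtils performs.

```python
import random
import re

# Matches a single-label node such as "{ring}".
LABEL = re.compile(r"\{(\w+)\}")


def substitute_once(pattern, groups, rng):
    """Replace each labeled node whose label names a group, in one pass."""

    def pick(match):
        options = groups.get(match.group(1))
        # Labels without a matching group are left untouched.
        return rng.choice(options) if options else match.group(0)

    return LABEL.sub(pick, pattern)


def resolve(pattern, groups, rng):
    """Repeat the single pass until no replaceable label is left."""
    while any(groups.get(m.group(1)) for m in LABEL.finditer(pattern)):
        pattern = substitute_once(pattern, groups, rng)
    return pattern


groups = {"ring": ["c1ccccc1{sub}"], "sub": ["O", "N"]}
rng = random.Random(0)
print(substitute_once("C{ring}", groups, rng))  # still contains {sub}
print(resolve("C{ring}", groups, rng))  # no label left
```

As with any fixed-point loop of this kind, a group that refers back to itself would never terminate, so such a sketch assumes the group definitions are acyclic.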
@@ -216,6 +252,9 @@ def __next__(self):


class ReactionProxy(Proxy):
"""
Proxy to generate reactions.
"""
def __init__(
self,
core: str,
@@ -230,6 +269,9 @@ def generate(self):


class MolProxy(Proxy):
"""
Proxy to generate molecules.
"""
def __init__(
self,
core: str,