Extend documentation #10

Merged
merged 6 commits into from
Nov 15, 2024
Changes from 5 commits
5 changes: 5 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
.python-version
.venv/
.github/
docs/_build/
specification/annotation-schema.md
55 changes: 55 additions & 0 deletions CONTRIBUTING.rst
@@ -0,0 +1,55 @@
############
Contributing
############

This document briefly describes how to contribute to
`mzPAF <https://github.com/hupo-psi/mzPAF>`_.



Before you begin
################

If you have an idea for a feature, a use case to add, or an approach for a bug fix,
you are welcome to share it with the community by opening a
thread in `GitHub Issues <https://github.com/hupo-psi/mzPAF/issues>`_.



Documentation local setup
#########################

To work on the documentation and get a live preview, install the requirements
and run ``sphinx-autobuild``:

.. code-block:: sh

    pip install -r ./docs/requirements.txt
    sphinx-autobuild ./docs/ ./docs/_build/

Then browse to http://localhost:8000 to watch the live preview.



How to contribute
#################

- Fork `mzPAF <https://github.com/hupo-psi/mzPAF>`_ on GitHub to
  make your changes.
- Commit and push your changes to your
  `fork <https://help.github.com/articles/pushing-to-a-remote/>`_.
- Ensure that the tests and documentation (both Python docstrings and files in
  ``/docs/``) have been updated according to your changes. Python
  docstrings are formatted in the
  `numpydoc style <https://numpydoc.readthedocs.io/en/latest/format.html>`_.
- Open a
  `pull request <https://help.github.com/articles/creating-a-pull-request/>`_
  with these changes. Your pull request message should ideally include:

  - A description of why the changes should be made.
  - A description of the implementation of the changes.
  - A description of how to test the changes.

- The pull request should pass all the continuous integration tests which are
  automatically run by
  `GitHub Actions <https://github.com/hupo-psi/mzPAF/actions>`_.
116 changes: 108 additions & 8 deletions README.md
@@ -1,18 +1,118 @@
# mzPAF Peak Annotation Format

The mzPAF proposed standard is a specification for a fragment ion peak annotation format for mass spectra, focused on peptides. This provides for a standardized format for describing the origin of fragment ions to be used in spectral libraries, other formats that aim to describe fragment ions, and software tools that annotate fragment ions.
## About

The main home page for mzPAF is at the PSI web site: [https://psidev.info/mzPAF](https://psidev.info/mzPAF)
mzPAF is a specification for a fragment ion peak annotation format for mass spectra, focused on
peptides. This provides for a standardized format for describing the origin of fragment ions to be
used in spectral libraries, other formats that aim to describe fragment ions, and software tools
that annotate fragment ions.

# Status
- Official mzPAF homepage: [psidev.info/mzPAF](https://psidev.info/mzPAF)
- mzPAF documentation: [mzpaf.readthedocs.io](https://mzpaf.readthedocs.io)

Updated: 2024-10-15
## Status

The specification has been resubmitted to the PSI Document Process and is undergoing final community review. It is anticipated to become a formal PSI standard near the end of 2024.
_Updated: 2024-10-15_

The specification has been resubmitted to the PSI Document Process and is undergoing final
community review. It is anticipated to become a formal PSI standard near the end of 2024.

# Available Materials
- The current DRAFT specification: [mzPAF_specification_v1.0-draft15.pdf](https://github.com/HUPO-PSI/mzPAF/blob/main/specification/mzPAF_specification_v1.0-draft15.pdf?raw=true)
- Example annotated spectra: [Examples](https://github.com/HUPO-PSI/mzPAF/tree/main/examples)
- The GitHub repo associated with mzPAF: [https://github.com/HUPO-PSI/mzPAF](https://github.com/HUPO-PSI/mzPAF)
- The GitHub repo assocated with the related mzSpecLib standard: [https://github.com/HUPO-PSI/mzSpecLib](https://github.com/HUPO-PSI/mzSpecLib)

## In short

- An mzPAF annotation is a single, case-sensitive string of characters with no length limit
- Multiple possible explanations are comma-separated
- Deltas of observed – theoretical _m/z_ values are prefixed with a slash (`/`)
- Confidence values of annotations are prefixed with an asterisk (`*`)

The basic format of each annotation is:

```
annotation1/delta,annotation2/delta,...
```

or:

```
annotation1/delta*confidence,annotation2/delta*confidence,...
```

For example:

```
b2-H2O/3.2ppm,b4-H2O^2/3.2ppm
```

or:

```
b2-H2O/3.2ppm*0.75,b4-H2O^2/3.2ppm*0.25
```
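
As a rough illustration of the layout above, the following Python sketch splits an mzPAF string into its comma-separated explanations and pulls out the delta and confidence of each. The function name `split_annotations` is hypothetical and not part of any mzPAF package. The splitting is deliberately naive: it assumes that `,`, `/`, and `*` do not occur inside braces or brackets, which does not hold for every valid mzPAF string (SMILES strings, for instance, may contain `/`).

```python
def split_annotations(mzpaf_string):
    """Naively split an mzPAF string into (annotation, delta, confidence) tuples.

    Hypothetical helper for illustration only; it ignores braces and brackets.
    """
    results = []
    for part in mzpaf_string.split(","):
        delta = None
        confidence = None
        # The confidence, if present, follows the last asterisk.
        if "*" in part:
            part, confidence_text = part.rsplit("*", 1)
            confidence = float(confidence_text)
        # The delta, if present, follows the last slash.
        if "/" in part:
            part, delta = part.rsplit("/", 1)
        results.append((part, delta, confidence))
    return results


for item in split_annotations("b2-H2O/3.2ppm*0.75,b4-H2O^2/3.2ppm*0.25"):
    print(item)
```

For real use, a parser that follows the full specification is the safer choice.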

mzPAF supports:

- Annotations of multiple analytes: `1@y12/0.13,2@b9-NH3/0.23`
- Mass deltas in ppm instead of _m/z_ unit: `y1/-1.4ppm`
- Confidence levels per annotation: `y1/-1.4ppm*0.75`
- Advanced ion notation: `[ion type](neutral loss)(isotope)(adduct type)(charge)`, e.g.: `y4-H2O+2i[M+H+Na]^2`:
- Ion types:
- Peptide ion series (a, b, c, x, y, z): `y4`
- Unknown ions: `?`
- Immonium ions: `IY`
- Internal fragment ions: `m3:6`
- Intact precursor ions: `p^2`
- A set of reference ions: `r[TMT127N]`
- Named compounds: `_{Urocanic Acid}`
- Chemical formulas: `f{C16H22O}`
- SMILES: `s{CN=C=O}[M+H]`
- Embedded ProForma annotations: `0@b2{LC[Carbamidomethyl]}`
- Neutral gains and losses: `y2+CO-H2O`
- Isotopes: `y2+2i`
- Adduct types: `y2[M+H]`
- Charge states: `^2`
- Multiple peaks per annotation: `&y7/-0.001` and `y7/0.000*0.95`
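
To make the delta notation concrete, here is a minimal Python sketch that computes an observed minus theoretical delta either in _m/z_ units or in ppm and formats it for use after the slash. The function names and the number of decimals are assumptions chosen for illustration; the specification, not this sketch, defines the permitted formatting.

```python
def delta_mz(observed_mz, theoretical_mz):
    """Delta in m/z units: observed minus theoretical."""
    return observed_mz - theoretical_mz


def delta_ppm(observed_mz, theoretical_mz):
    """Delta in parts per million, relative to the theoretical m/z."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def format_delta(observed_mz, theoretical_mz, use_ppm=True):
    """Format a delta string such as '1.0ppm' or '0.0005' (hypothetical helper)."""
    if use_ppm:
        return f"{delta_ppm(observed_mz, theoretical_mz):.1f}ppm"
    return f"{delta_mz(observed_mz, theoretical_mz):.4f}"


# An ion observed at m/z 500.0005 with a theoretical m/z of 500.0
print("y4/" + format_delta(500.0005, 500.0))
print("y4/" + format_delta(500.0005, 500.0, use_ppm=False))
```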

Read the
[full DRAFT specification](https://github.com/HUPO-PSI/mzPAF/blob/main/specification/mzPAF_specification_v1.0-draft14.docx?raw=true)
for more details and examples.

## Getting started

### mzPAF in Python

The [mzPAF Python package](https://mzpaf.readthedocs.io/en/latest/implementations/python/) can
parse mzPAF strings into their components, convert to the JSON representation, or serialize back
to an mzPAF string.

```python
>>> import mzpaf
>>> annotations = mzpaf.parse_annotation("b2-H2O/3.2ppm*0.75,b4-H2O^2/3.2ppm*0.25")
>>> print(annotations[0].to_json())
{'neutral_losses': ['-H2O'], 'isotope': 0, 'adducts': [], 'charge': 1, 'analyte_reference': None, 'mass_error': {'value': 3.2, 'unit': 'ppm'}, 'confidence': 0.75, 'molecule_description': {'series_label': 'peptide', 'series': 'b', 'position': 2, 'sequence': None}}
>>> print(annotations[0].serialize())
b2-H2O/3.2ppm*0.75
```
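
The JSON representation printed above is a plain Python dictionary, so it can be passed through the standard library `json` module for exchange with other programs. The sketch below copies that dictionary by hand instead of importing `mzpaf`, so it runs without the package installed; treat it as an illustration, not as part of the package API.

```python
import json

# Dictionary copied from the to_json() output shown above.
annotation = {
    "neutral_losses": ["-H2O"],
    "isotope": 0,
    "adducts": [],
    "charge": 1,
    "analyte_reference": None,
    "mass_error": {"value": 3.2, "unit": "ppm"},
    "confidence": 0.75,
    "molecule_description": {
        "series_label": "peptide",
        "series": "b",
        "position": 2,
        "sequence": None,
    },
}

# Serialize to JSON text (None becomes null) and read it back.
json_text = json.dumps(annotation, indent=2)
round_tripped = json.loads(json_text)
print(json_text)
```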

Learn more at the
[package documentation](https://mzpaf.readthedocs.io/en/latest/implementations/python/).

### mzPAF regular expressions

The mzPAF specification includes regular expressions for parsing mzPAF strings. These can be used
in any programming language that supports regular expressions.

Learn more at the
[mzPAF regex documentation](https://mzpaf.readthedocs.io/en/latest/implementations/regex/).
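
To give a feel for the approach, the following Python sketch uses a heavily simplified pattern that only covers peptide ion series with optional neutral gains or losses, charge, delta, and confidence. It is not the regular expression from the specification: the pattern and the `parse_simple` helper are assumptions made for illustration, and the sketch does not distinguish isotopes from neutral losses or handle analyte references, adducts, or the other ion types.

```python
import re

# Simplified, hypothetical pattern; the official expressions are far more complete.
SIMPLE_ION = re.compile(
    r"^(?P<series>[abcxyz])(?P<position>\d+)"
    r"(?P<neutral_losses>(?:[+-][A-Za-z0-9]+)*)"
    r"(?:\^(?P<charge>\d+))?"
    r"(?:/(?P<delta>-?\d+(?:\.\d+)?)(?P<unit>ppm)?)?"
    r"(?:\*(?P<confidence>\d*\.?\d+))?$"
)


def parse_simple(annotation):
    """Return the named groups as a dict, or None if the pattern does not match."""
    match = SIMPLE_ION.match(annotation)
    return match.groupdict() if match else None


print(parse_simple("b4-H2O^2/3.2ppm*0.25"))
print(parse_simple("y1/-1.4ppm"))
```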

### mzPAF Lark grammar

mzPAF has also been defined as a
[Lark grammar](https://mzpaf.readthedocs.io/en/latest/implementations/lark/).

### Links

- The mzPAF GitHub repo: [github.com/HUPO-PSI/mzPAF](https://github.com/HUPO-PSI/mzPAF)
- The GitHub repo for the related mzSpecLib standard: [github.com/HUPO-PSI/mzSpecLib](https://github.com/HUPO-PSI/mzSpecLib)
- HUPO-PSI homepage: [psidev.info](https://www.psidev.info/)
1 change: 1 addition & 0 deletions docs/.readthedocs.yaml
@@ -19,3 +19,4 @@ python:
      path: implementations/python
      extra_requirements:
        - docs
    - requirements: docs/requirements.txt
1 change: 1 addition & 0 deletions docs/_static/img/lark-railroad-diagram.svg
RalfG marked this conversation as resolved.
51 changes: 50 additions & 1 deletion docs/conf.py
@@ -1,5 +1,53 @@
"""Configuration file for the Sphinx documentation builder."""

# Scripts
import json
import shutil
from pathlib import Path

import jsonschema2md
import pandas as pd


def get_jsonschema_docs(input_json, output_markdown):
    """Generate markdown documentation from a JSON schema."""
    parser = jsonschema2md.Parser()
    with open(input_json, encoding="utf-8") as f_in:
        output_md = parser.parse_schema(json.load(f_in))

    with open(output_markdown, "w", encoding="utf-8") as f_out:
        f_out.writelines(output_md)


def get_reference_molecules_md(input_json, output_markdown):
    """Generate a markdown table of reference molecules."""
    df = pd.read_json(input_json).T
    buf = df.to_markdown().replace(' nan ', ' ')
    with open(output_markdown, 'wt') as fh:
        fh.write(buf)


get_jsonschema_docs(
    "../specification/annotation-schema.json",
    "../specification/annotation-schema.md"
)
get_jsonschema_docs(
    "../specification/reference_data/reference_molecule_schema.json",
    "../specification/reference_data/reference_molecule_schema.md"
)

get_reference_molecules_md(
    "../specification/reference_data/reference_molecules.json",
    "../specification/reference_data/reference_molecules.md"
)

if not Path("_static/img/lark-railroad-diagram.svg").exists():
    shutil.copy(
        "../specification/grammars/schema_images/Annotation.svg",
        "_static/img/lark-railroad-diagram.svg"
    )


# Project information
project = "mzPAF"
author = "HUPO-PSI"
@@ -16,7 +64,7 @@
    "sphinx_click.ext",
    "myst_parser",
]
source_suffix = [".rst"]
source_suffix = [".rst", ".md"]
master_doc = "index"
exclude_patterns = ["_build"]

@@ -46,6 +94,7 @@
    "python": ("https://docs.python.org/3", None),
    "psims": ("https://mobiusklein.github.io/psims/docs/build/html/", None),
    "pyteomics": ("https://pyteomics.readthedocs.io/en/stable/", None),
    "mzspeclib": ("https://mzspeclib.readthedocs.io/en/latest/", None),
}


1 change: 1 addition & 0 deletions docs/contributing.rst
@@ -0,0 +1 @@
.. include:: ../CONTRIBUTING.rst
36 changes: 36 additions & 0 deletions docs/implementations/json/index.rst
@@ -0,0 +1,36 @@
###########
JSON Schema
###########

About
=====

Instead of a single string, an mzPAF annotation can alternatively be expressed as a JSON
object. This format is better suited to inter-program communication, especially through web
APIs. You can find the JSON schema for mzPAF on GitHub via the following link:

https://raw.githubusercontent.com/HUPO-PSI/mzPAF/main/specification/annotation-schema.json

Replace ``main`` in the URL with the desired version tag to access the schema for a particular
version.

Examples
========

.. literalinclude:: ../../../specification/annotation-example-1.json
    :language: json

.. literalinclude:: ../../../specification/annotation-example-2.json
    :language: json

.. literalinclude:: ../../../specification/annotation-example-3.json
    :language: json



Full schema documentation
=========================

.. include:: ../../../specification/annotation-schema.md
    :parser: myst_parser.sphinx_
    :start-line: 4
17 changes: 17 additions & 0 deletions docs/implementations/lark/index.rst
@@ -0,0 +1,17 @@
############
Lark grammar
############


About
=====

[todo]


Railroad diagram
================

.. figure:: ../../_static/img/lark-railroad-diagram.svg
    :alt: Lark grammar

2 changes: 2 additions & 0 deletions docs/implementations/python/api.rst
@@ -7,6 +7,8 @@ Python API
    :imported-members:


.. manually documented as parse_annotation is undocumented
Contributor

I'll add a cross-reference to the manual documentation here:

:meth:`AnnotationStringParser.__call__`

The autodoc sees that it is a function and pulls the signature from AnnotationStringParser.__call__, but nothing else.

Contributor Author

Not entirely following here, but feel free to update


.. autofunction:: parse_annotation

    Parse a string into one or more :class:`IonAnnotationBase` instances.
11 changes: 10 additions & 1 deletion docs/implementations/python/index.rst
@@ -2,10 +2,19 @@
Python implementation
#####################

About
=====

.. include:: ../../../implementations/python/README.md
    :parser: myst_parser.sphinx_


Full API documentation
======================

.. toctree::
    :caption: Contents
    :maxdepth: 2
    :glob:

    *

25 changes: 25 additions & 0 deletions docs/implementations/regex/index.rst
@@ -0,0 +1,25 @@
###################
Regular expressions
###################

mzPAF has been defined in several regular expression dialects.

.. tip::

    Regex101.com is a great tool to test regular expressions. Try out the mzPAF regex there:
    `regex101.com/r/gDPlJu/1 <https://regex101.com/r/gDPlJu/1>`_.

Python
======

.. literalinclude:: ../../../specification/grammars/regex_sre.py
    :language: python
    :linenos:


Javascript ECMA
===============

.. literalinclude:: ../../../specification/grammars/regex_ecma.js
    :language: javascript
    :linenos:
6 changes: 3 additions & 3 deletions docs/index.rst
@@ -1,13 +1,13 @@
.. include:: ../README.md
    :parser: myst_parser.sphinx_


.. toctree::
    :caption: About
    :hidden:
    :includehidden:
    :glob:

    Home <self>
    implementations/index
    specification/index
    Specification <specification/index>
    Implementations <implementations/index>
    Contributing <contributing>