
Commit

add badges to README and bump version
Y0dler authored Dec 4, 2023
1 parent 0b2e8d9 commit 0b3ce3f
Showing 2 changed files with 8 additions and 4 deletions.
10 changes: 7 additions & 3 deletions README.md
@@ -1,3 +1,7 @@
[![PyPI version](https://img.shields.io/pypi/v/peak-performance)](https://pypi.org/project/peak-performance/)
[![pipeline](https://github.com/JuBiotech/peak-performance/workflows/pipeline/badge.svg)](https://github.com/JuBiotech/peak-performance/actions)
[![coverage](https://codecov.io/gh/JuBiotech/peak-performance/branch/main/graph/badge.svg)](https://app.codecov.io/gh/JuBiotech/peak-performance)

# How to use PeakPerformance
For installation instructions, see `Installation.md`.
For instructions regarding the use of PeakPerformance, check out the example notebook(s) under `notebooks`, the complementary example data under `example`, and the following introductory explanations.
@@ -11,18 +15,18 @@ from pathlib import Path
time_series = np.array([np.array(time), np.array(intensity)])
np.save(Path(r"example_path/time_series.npy"), time_series)
```
The naming convention of raw data files is "<acquisition name>_<precursor ion m/z or experiment number>_<product ion m/z start>_<product ion m/z end>.npy". There should be no underscores within the named sections such as `acquisition name`. Essentially, the raw data names include the acquisition and mass trace, thus yielding a recognizable and unique name for each isotopomer/fragment/metabolite/sample.
The naming convention of raw data files is `<acquisition name>_<precursor ion m/z or experiment number>_<product ion m/z start>_<product ion m/z end>.npy`. There should be no underscores within the named sections such as `acquisition name`. Essentially, the raw data names include the acquisition and mass trace, thus yielding a recognizable and unique name for each isotopomer/fragment/metabolite/sample.
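As a minimal sketch of this convention (the acquisition name, m/z values, and dummy signal below are made up purely for illustration), saving a correctly named raw data file could look like this:
```python
import numpy as np
from pathlib import Path

# Hypothetical example values -- replace them with your own acquisition and mass trace.
acquisition = "A2t4R3"   # acquisition name, no underscores within the name itself
precursor = 118          # precursor ion m/z (or experiment number)
product_start = 71.9     # product ion m/z start
product_end = 72.1       # product ion m/z end

# Dummy time/intensity data standing in for an exported mass trace.
time = np.linspace(0.0, 5.0, 300)
intensity = 1e5 * np.exp(-0.5 * ((time - 2.5) / 0.1) ** 2)

time_series = np.array([time, intensity])
filename = f"{acquisition}_{precursor}_{product_start}_{product_end}.npy"
Path("example_path").mkdir(exist_ok=True)
np.save(Path("example_path") / filename, time_series)  # -> example_path/A2t4R3_118_71.9_72.1.npy
```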

## Model selection
When it comes to selecting models, PeakPerformance has a function performing an automated selection process by analyzing one acquisition per mass trace with all implemented models. Subsequently, all models are ranked based on an information criterion (either Pareto-smoothed importance sampling leave-one-out cross-validation or the widely applicable information criterion). For this process to work as intended, you need to specify acquisitions with representative peaks for each mass trace (see example notebook 1). If, e.g., most peaks of an analyte show a skewed shape, select an acquisition where this is the case. For double peaks, select an acquisition where the peaks are as distinct and comparable in height as possible.
Since model selection is a computationally demanding and time-consuming process, it is suggested that you specify the model type yourself (see example notebook 1) whenever possible.
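For orientation only: the ranking step is conceptually the same as comparing fitted PyMC models with ArviZ. The sketch below is generic PyMC/ArviZ code with placeholder data and models, not PeakPerformance's internal API:
```python
import arviz as az
import numpy as np
import pymc as pm

# Placeholder data and candidate models -- PeakPerformance defines its own peak models.
y = np.random.default_rng(1).normal(loc=1.0, scale=0.5, size=50)

idatas = {}
for name in ("normal", "student_t"):
    with pm.Model():
        mu = pm.Normal("mu", 0, 10)
        sigma = pm.HalfNormal("sigma", 1)
        if name == "normal":
            pm.Normal("obs", mu=mu, sigma=sigma, observed=y)
        else:
            pm.StudentT("obs", nu=5, mu=mu, sigma=sigma, observed=y)
        idatas[name] = pm.sample(idata_kwargs={"log_likelihood": True}, progressbar=False)

# Rank the candidates by PSIS-LOO; pass ic="waic" to rank by WAIC instead.
print(az.compare(idatas, ic="loo"))
```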

## Troubleshooting
### A batch run broke and I want to restart it.
If an error occurred in the middle of a batch run, you can use the `pipeline_restart` function in the `pipeline` module to create a new batch which analyzes only those samples that have not been analyzed previously.
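The README does not spell out the call signature of `pipeline_restart`, so the safest first step is to inspect it; the commented call below only guesses at plausible arguments and is not the documented interface:
```python
from peak_performance import pipeline

# Look up the actual signature and docstring before wiring up a restart.
help(pipeline.pipeline_restart)

# A restart would then point at the raw data and the interrupted batch's results
# directory, roughly like this (argument names are guesses for illustration only):
# pipeline.pipeline_restart(path_raw_data="./example", path_results="./example/batch_1")
```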

### The model parameters don't converge and/or the fit does not describe the raw data well.
Due to the vast number of LC-MS methods out there, it is probably not possible to formulate suitable prior probability distributions (priors) for all of them. Therefore, one of the first steps should be to check in `models.py` whether the model priors make sense for your application and, if they don't, change them according to your data. Also, make sure the time series containing the signal to be analyzed features the peak or double peak (preferably in the center), a) no other peaks, and b) an area around the peak for estimating the baseline (a window size of roughly 5 times the peak width should be fine).
Also check the separate file `How to adapt PeakPerformance to your data`.
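As an illustration of what adjusting a prior amounts to (generic PyMC code, not a copy of the distributions actually defined in `models.py`), widening a too-narrow prior is a one-line change of its scale parameter:
```python
import pymc as pm

# Generic sketch: if a baseline prior is too narrow for high-background data,
# increase its scale so the fit is not pulled toward implausible baseline values.
with pm.Model():
    # before: baseline = pm.Normal("baseline_intercept", mu=0.0, sigma=1.0)
    baseline = pm.Normal("baseline_intercept", mu=0.0, sigma=50.0)
    noise = pm.HalfNormal("noise", sigma=100.0)
```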

# How to contribute
If you encounter bugs while using PeakPerformance, please bring them to our attention by opening an issue. When doing so, describe the problem in detail and add screenshots/code snippets and whatever other helpful material you can provide.
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

[project]
name = "peak_performance"
version = "0.6.3"
version = "0.6.4"
authors = [
{name = "Jochen Nießer", email = "j.niesser@fz-juelich.de"},
{name = "Michael Osthege", email = "m.osthege@fz-juelich.de"},
