nielshulstaert committed Feb 13, 2020
2 parents c8a1b13 + b6897a7 commit 862e14d
Showing 20 changed files with 328 additions and 89 deletions.
17 changes: 13 additions & 4 deletions .github/workflows/publish.yml
@@ -6,35 +6,44 @@ on:
- 'v*'

jobs:
deploy:
publish:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v1

- name: Set up Python
uses: actions/setup-python@v1
with:
python-version: '3.7'

- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install setuptools wheel twine
- name: Copy models to GUI directory
run: |
cp -r deeplc/mods deeplc_gui
- name: Zip GUI directory
uses: thedoctor0/zip-release@master
with:
filename: 'deeplc_gui.zip'
exclusions: '/*src/*'
path: 'deeplc_gui/*'
- name: GitHub Release

- name: Create GitHub Release
uses: docker://antonyurchenko/git-release:v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
DRAFT_RELEASE: "true"
DRAFT_RELEASE: "false"
PRE_RELEASE: "false"
CHANGELOG_FILE: "CHANGELOG.md"
with:
args: |
deeplc_gui.zip
- name: Build and publish
- name: Build and publish to PyPI
env:
TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
2 changes: 1 addition & 1 deletion .github/workflows/python_package_test.yml
@@ -3,7 +3,7 @@ name: Python package test
on: [push, pull_request]

jobs:
build:
test:
runs-on: ${{ matrix.os }}
strategy:
max-parallel: 4
23 changes: 22 additions & 1 deletion CHANGELOG.md
@@ -5,7 +5,28 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to
[Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]
## [0.1.11] - 2020-02-13
- Fixes in GUI

## [0.1.10] - 2020-02-10
- Include fewer models in package to meet PyPI 60MB size limitation

## [0.1.9] - 2020-02-09
- Bugfix: Pass custom activation function

## [0.1.8] - 2020-02-07
- Fixed support for averaging predictions of groups of models (ensemble) when no models were passed
- New models for ensemble

## [0.1.7] - 2020-02-07
- Support for averaging predictions of groups of models (ensemble)

## [0.1.6] - 2020-01-21
- Fix the latest release

## [0.1.5] - 2020-01-21
- Spaces in paths to files and installation allowed
- References to other CompOmics tools removed in GUI

## [0.1.5] - 2020-02-13
- Fixes in GUI
27 changes: 11 additions & 16 deletions CONTRIBUTING.md
@@ -1,17 +1,17 @@
# Contributing

This document briefly describes how to contribute to
[DeepLC](https://github.com/HUPO-PSI/SpectralLibraryFormat).
[DeepLC](https://github.com/compomics/DeepLC).

## Before you begin

If you have an idea for a feature, use case to add or an approach for a bugfix,
it is best to communicate with the community by creating an issue in
[GitHub issues](https://github.com/HUPO-PSI/SpectralLibraryFormat/issues).
[GitHub issues](https://github.com/compomics/DeepLC/issues).

## How to contribute

- Fork [DeepLC](https://github.com/HUPO-PSI/SpectralLibraryFormat) on GitHub to
- Fork [DeepLC](https://github.com/compomics/DeepLC) on GitHub to
make your changes.
- Commit and push your changes to your
[fork](https://help.github.com/articles/pushing-to-a-remote/).
@@ -28,28 +28,23 @@ with these changes. You pull request message ideally should include:

## Development workflow

- Main development happens on the `master` branch.

- When a new version is ready to be published:

1. Merge into the `releases` branch.
2. Change the version number in `setup.py` using
1. Change the version number in `setup.py` using
[semantic versioning](https://semver.org/).
3. Update the changelog (if not already done) in `CHANGELOG.md` according to
2. Update the changelog (if not already done) in `CHANGELOG.md` according to
[Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
4. Set a new tag with the version number, e.g. `git tag 0.1.1.dev1`.
5. Push to GitHub, with the tag: `git push; git push --tags`.
6. Update the version and sha256 checksum in the bioconda recipe using
`conda skeleton pypi deeplc` in the
[bioconda-recipes](https://github.com/bioconda/bioconda-recipes) repository.
3. Set a new tag with the version number, e.g. `git tag v0.1.5`.
4. Push to GitHub, with the tag: `git push; git push --tags`.

- When new commits are pushed to the `releases` branch, the following GitHub
Actions are triggered:
- When a new tag is pushed to (or made on) GitHub that matches `v*`, the
following GitHub Actions are triggered:

1. The Python package is built and published to PyPI.
2. A zip archive is made of the `./deeplc_gui/` directory, excluding
`./deeplc_gui/src` with
[Zip Release](https://github.com/marketplace/actions/zip-release).
3. A GitHub release is made with the zipped GUI files as asset and the new
3. A GitHub release is made with the zipped GUI files as assets and the new
changes listed in `CHANGELOG.md` with
[Git Release](https://github.com/marketplace/actions/git-release).
4. After some time, the bioconda package should get updated automatically.
115 changes: 115 additions & 0 deletions README.md
@@ -21,6 +21,7 @@ DeepLC: Retention time prediction for (modified) peptides using Deep Learning.
- [Python module](#python-module)
- [Input files](#input-files)
- [Prediction models](#prediction-models)
- [Q&A](#qa)

---

@@ -145,3 +146,117 @@ settings:
By default, DeepLC selects the best model based on the calibration dataset. If
no calibration is performed, the first default model is selected. Always keep
note of the used models and the DeepLC version.

## Q&A

**__Q: So DeepLC is able to predict the retention time for any modification?__**

Yes, DeepLC can predict the retention time of any modification. However, if the
modification is **very** different from the peptides the model has seen during
training, the accuracy might not be satisfactory. For example, if the model
has never seen a phosphorus atom before, the accuracy of the prediction is going to
be low.

**__Q: Installation fails. Why?__**

Please make sure to install DeepLC in a path that does not contain spaces. Run
the latest LTS version of Ubuntu or Windows 10. Make sure you have enough disk
space available; TensorFlow needs a surprising amount of it. If
you are still not able to install DeepLC, please feel free to contact us:

Robbin.Bouwmeester@ugent.be and Ralf.Gabriels@ugent.be

**__Q: I have a special use case that is not supported. Can you help?__**

Of course, please feel free to contact us:

Robbin.Bouwmeester@ugent.be and Ralf.Gabriels@ugent.be

**__Q: DeepLC runs out of memory. What can I do?__**

You can try to reduce the batch size. DeepLC should be able to run if the batch size is low
enough, even on machines with only 4 GB of RAM.

**__Q: I have a graphics card, but DeepLC is not using the GPU. Why?__**

For now DeepLC defaults to the CPU instead of the GPU. Clearly, because you want
to use the GPU, you are a power user :-). If you want to make the most of that expensive
GPU, you need to change or remove the following line (at the top) in __deeplc.py__:

```
# Set to force CPU calculations
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
```

Also change the same line in the function __reset_keras()__:

```
# Set to force CPU calculations
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
```

Either remove the line or change it to the following, where the number is the index of the
GPU to use (`0` is the first GPU; separate several indices with commas):

```
# Make only the first GPU (index 0) visible
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
```
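
As a rough illustration of what this environment variable does, the sketch below wraps the assignment in a small helper. The name `select_device` is hypothetical and not part of DeepLC. It assumes that `CUDA_VISIBLE_DEVICES` is set before TensorFlow is imported, that `'-1'` hides all GPUs, and that any other value is the index of the GPU to expose.

```python
import os


def select_device(gpu_index=None):
    """Set CUDA_VISIBLE_DEVICES (hypothetical helper, not part of DeepLC).

    Call this before importing TensorFlow, otherwise it has no effect.
    """
    # '-1' hides every GPU, so calculations fall back to the CPU;
    # any other value is the index of the single GPU to expose.
    value = "-1" if gpu_index is None else str(gpu_index)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


select_device(0)  # expose only the first GPU
```

Calling `select_device()` without an argument restores the CPU-only behaviour that DeepLC uses by default.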

**__Q: What modification name should I use?__**

The names from unimod are used. The PSI-MS name is used by default, but the Interim name
is used as a fall-back if the PSI-MS name is not available. Please also see __unimod_to_formula.csv__
in the folder __unimod/__ for the naming of specific modifications.

**__Q: I have a modification that is not in unimod. How can I add the modification?__**

In the folder __unimod/__ there is the file __unimod_to_formula.csv__ that can be used to
add modifications. In the CSV file add a name (**that is unique and not present yet**) and
the change in atomic composition. For example:

```
Met->Hse,O,H(-2) C(-1) S(-1)
```

Make sure to use negative signs for the atoms subtracted.
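
To check a new row before adding it, a small parser along the following lines may help. This is only a sketch: `parse_composition` and `parse_row` are hypothetical names, the token format (`Atom` or `Atom(count)`, separated by spaces) is inferred from the example row above, and summing the columns after the name is an assumption about how they combine, not a description of how DeepLC reads the file.

```python
import re

# One token is an atom symbol, optionally followed by a signed count: "O", "H(-2)".
TOKEN = re.compile(r"(\d*[A-Za-z]+)(?:\((-?\d+)\))?")


def parse_composition(text):
    """Turn 'H(-2) C(-1) S(-1)' into {'H': -2, 'C': -1, 'S': -1}."""
    counts = {}
    for token in text.split():
        match = TOKEN.fullmatch(token)
        if match is None:
            raise ValueError(f"Unrecognised atom token: {token!r}")
        atom, count = match.groups()
        # A bare atom symbol without parentheses counts as +1.
        counts[atom] = counts.get(atom, 0) + int(count or 1)
    return counts


def parse_row(line):
    """Split a CSV row into its name and the summed atomic change (assumed layout)."""
    name, *fields = line.strip().split(",")
    total = {}
    for field in fields:
        for atom, n in parse_composition(field).items():
            total[atom] = total.get(atom, 0) + n
    return name, total


name, change = parse_row("Met->Hse,O,H(-2) C(-1) S(-1)")
```

A row with a malformed token, such as a missing closing parenthesis, raises a `ValueError` instead of being silently accepted.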

**__Q: Help, all my predictions are between [0,10]. Why?__**

It is likely you did not use calibration. No problem, but the retention times used for
training were normalized to the range [0,10]. This means that you probably need to adjust the
retention times yourself after analysis or use a calibration set as the input.
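
If you adjust the retention times yourself, one simple option is a straight-line fit between predictions and a few observed retention times. The sketch below is an ordinary least-squares fit written for illustration; it is not DeepLC's own calibration routine, and the function names and example values are made up.

```python
def fit_linear(predicted, observed):
    """Least-squares slope and intercept mapping predicted to observed times."""
    n = len(predicted)
    mean_x = sum(predicted) / n
    mean_y = sum(observed) / n
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, observed))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rescale(predicted, slope, intercept):
    """Apply the fitted line to uncalibrated predictions."""
    return [slope * x + intercept for x in predicted]


# Made-up example: predictions on the [0,10] scale and observed times in minutes.
slope, intercept = fit_linear([0.0, 5.0, 10.0], [10.0, 60.0, 110.0])
adjusted = rescale([2.5, 7.5], slope, intercept)
```

A straight line only works when the relation between the two scales is roughly linear; passing a calibration set to DeepLC remains the supported route.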

**__Q: How does the ensemble part of DeepLC work?__**

Models within the same directory are grouped if they overlap in their name. The overlap
has to be in their full name, except for the last part of the name after a "_"-character.

The following models will be grouped:

```
full_hc_dia_fixed_mods_a.hdf5
full_hc_dia_fixed_mods_b.hdf5
```

None of the following models will be grouped:

```
full_hc_dia_fixed_mods2_a.hdf5
full_hc_dia_fixed_mods_b.hdf5
full_hc_dia_fixed_mods_2_b.hdf5
```
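
The grouping rule can be sketched as follows. The key expression `"_".join(name.split("_")[:-1])` is the one used in `deeplc/__main__.py` in this commit; the function `group_models` around it is a hypothetical illustration, not the actual DeepLC code.

```python
from collections import defaultdict


def group_models(file_names):
    """Group model files by their name minus the part after the last '_'."""
    groups = defaultdict(list)
    for name in file_names:
        # Everything before the last underscore identifies the group.
        key = "_".join(name.split("_")[:-1])
        groups[key].append(name)
    return dict(groups)


grouped = group_models([
    "full_hc_dia_fixed_mods_a.hdf5",
    "full_hc_dia_fixed_mods_b.hdf5",
])
separate = group_models([
    "full_hc_dia_fixed_mods2_a.hdf5",
    "full_hc_dia_fixed_mods_b.hdf5",
    "full_hc_dia_fixed_mods_2_b.hdf5",
])
```

Here `grouped` contains a single group with both files, while `separate` contains three groups of one file each, because each name differs before its last underscore.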

**__Q: I would like to take the ensemble average of multiple models, even if they are trained on different datasets. How can I do this?__**

Feel free to experiment! Models within the same directory are grouped if they overlap in
their name. The overlap has to be in their full name, except for the last part of the
name after a "_"-character.

The following models will be grouped:

```
model_dataset1.hdf5
model_dataset2.hdf5
```

So you just need to rename your models.
22 changes: 22 additions & 0 deletions deeplc/__main__.py
@@ -1,7 +1,12 @@
"""
Code used to run the retention time predictor
"""

__author__ = ["Robbin Bouwmeester", "Ralf Gabriels"]
__credits__ = ["Robbin Bouwmeester", "Ralf Gabriels", "Prof. Lennart Martens", "Sven Degroeve"]
__license__ = "Apache License, Version 2.0"
__maintainer__ = ["Robbin Bouwmeester", "Ralf Gabriels"]
__email__ = ["Robbin.Bouwmeester@ugent.be", "Ralf.Gabriels@ugent.be"]

# Standard library
from collections import Counter
@@ -220,12 +225,29 @@ def run(file_pred="",

logging.info("Using DeepLC version %s", __version__)

# Without a calibration file, keep only models from the group of the first model
if len(file_cal) == 0 and file_model is not None:
    fm_dict = {}
    sel_group = ""
    for fm in file_model:
        if len(sel_group) == 0:
            # Group key: the file name minus the part after the last "_"
            sel_group = "_".join(fm.split("_")[:-1])
            fm_dict[sel_group] = fm
            continue
        m_group = "_".join(fm.split("_")[:-1])
        if m_group == sel_group:
            fm_dict[m_group] = fm
    file_model = fm_dict

# Read input files
df_pred = pd.read_csv(file_pred)
# Fewer than two columns: retry with a space as separator
if len(df_pred.columns) < 2:
    df_pred = pd.read_csv(file_pred, sep=" ")
df_pred = df_pred.fillna("")

if len(file_cal) > 1:
    df_cal = pd.read_csv(file_cal)
    if len(df_cal.columns) < 2:
        # Re-read from the file path, not from the DataFrame
        df_cal = pd.read_csv(file_cal, sep=" ")
    df_cal = df_cal.fillna("")

# Make a feature extraction object; you can skip this if you do not want to