Commit

Merge pull request #39 from MannLabs/development
Enhance input validation, CLI functionality, and testing for directLFQ
ammarcsj authored Aug 13, 2024
2 parents c326a3c + 9e2f7f1 commit 62fce98
Showing 22 changed files with 153 additions and 1,101 deletions.
2 changes: 2 additions & 0 deletions .github/workflows/build_windows_package.yml
@@ -18,6 +18,7 @@ jobs:
 with:
 auto-update-conda: true
 python-version: ${{ matrix.python-version }}
+miniconda-version: latest
 - name: Master version bumped
 id: master_version_bumped
 shell: bash -l {0}
@@ -52,6 +53,7 @@ jobs:
 with:
 auto-update-conda: true
 python-version: ${{ matrix.python-version }}
+miniconda-version: latest
 - name: Conda info
 shell: bash -l {0}
 run: conda info
1 change: 1 addition & 0 deletions .github/workflows/nbdev_tests.yml
@@ -19,6 +19,7 @@ jobs:
 with:
 auto-update-conda: true
 python-version: ${{ matrix.python-version }}
+miniconda-version: latest
 - name: Conda info
 shell: bash -l {0}
 run: conda info
6 changes: 6 additions & 0 deletions .github/workflows/publish_and_release.yml
@@ -18,6 +18,7 @@ jobs:
 with:
 auto-update-conda: true
 python-version: ${{ matrix.python-version }}
+miniconda-version: latest
 - name: Master version bumped
 id: master_version_bumped
 shell: bash -l {0}
@@ -51,6 +52,7 @@ jobs:
 with:
 auto-update-conda: true
 python-version: ${{ matrix.python-version }}
+miniconda-version: latest
 - name: Conda info
 shell: bash -l {0}
 run: conda info
@@ -83,6 +85,7 @@ jobs:
 with:
 auto-update-conda: true
 python-version: ${{ matrix.python-version }}
+miniconda-version: latest
 - name: Conda info
 shell: bash -l {0}
 run: conda info
@@ -115,6 +118,7 @@ jobs:
 with:
 auto-update-conda: true
 python-version: ${{ matrix.python-version }}
+miniconda-version: latest
 - name: Conda info
 shell: bash -l {0}
 run: conda info
@@ -150,6 +154,7 @@ jobs:
 with:
 auto-update-conda: true
 python-version: ${{ matrix.python-version }}
+miniconda-version: latest
 - name: Conda info
 shell: bash -l {0}
 run: conda info
@@ -185,6 +190,7 @@ jobs:
 with:
 auto-update-conda: true
 python-version: ${{ matrix.python-version }}
+miniconda-version: latest
 - name: Conda info
 shell: bash -l {0}
 run: conda info
2 changes: 2 additions & 0 deletions .github/workflows/quick_tests.yml
@@ -20,6 +20,7 @@ jobs:
 with:
 auto-update-conda: true
 python-version: ${{ matrix.python-version }}
+miniconda-version: latest
 - name: Conda info
 shell: bash -l {0}
 run: conda info
@@ -45,6 +46,7 @@ jobs:
 with:
 auto-update-conda: true
 python-version: ${{ matrix.python-version }}
+miniconda-version: latest
 - name: Conda info
 shell: bash -l {0}
 run: conda info
1 change: 1 addition & 0 deletions .github/workflows/quick_tests_ubuntu.yml
@@ -20,6 +20,7 @@ jobs:
 with:
 auto-update-conda: true
 python-version: ${{ matrix.python-version }}
+miniconda-version: latest
 - name: Conda info
 shell: bash -l {0}
 run: conda info
1 change: 1 addition & 0 deletions .github/workflows/unused/all_tests.yml
@@ -20,6 +20,7 @@ jobs:
 with:
 auto-update-conda: true
 python-version: ${{ matrix.python-version }}
+miniconda-version: latest
 - name: Conda info
 shell: bash -l {0}
 run: conda info
2 changes: 1 addition & 1 deletion directlfq/__init__.py
@@ -2,7 +2,7 @@


 __project__ = "directlfq"
-__version__ = "0.2.19"
+__version__ = "0.2.20"
 __license__ = "Apache"
 __description__ = "An open-source Python package of the AlphaPept ecosystem"
 __author__ = "Mann Labs"
2 changes: 1 addition & 1 deletion directlfq/cli.py
@@ -197,7 +197,7 @@ def gui():
 @click.option("--filename_suffix", "-fs", type=str, default="", help="A suffix to add to the output file name.")
 @click.option("--num_cores", "-nc", type = int, default = None, help="The number of cores to use (default is to use multiprocessing).")
 @click.option("--deactivate_normalization", "-dn", type = bool, default = False, help="If you want to deactivate the normalization step, you can set this flag to True.")
-@click.option("--filter_dict", "-dn", type = bool, default = False, help="In case you want to define specific filters in addition to the standard filters, you can add a yaml file where the filters are defined (see GitHub docu for example).")
+@click.option("--filter_dict", "-fd", type = str, default = None, help="In case you want to define specific filters in addition to the standard filters, you can add a yaml file where the filters are defined (see GitHub docu for example).")

 def run_directlfq(**kwargs):
     print("starting directLFQ")
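
Reviewer note on the `--filter_dict` change: the sketch below is one way to see why the option needed both a distinct short flag and a string type. It assumes the `click` package is installed. The commands `demo` and `old_demo` are hypothetical stand-ins, not part of directLFQ, and `filters.yaml` is a made-up file name; only the option declarations mirror the diff (with `help` texts omitted).

```python
import click
from click.testing import CliRunner


@click.command()
@click.option("--deactivate_normalization", "-dn", type=bool, default=False)
@click.option("--filter_dict", "-fd", type=str, default=None)
def demo(**kwargs):
    # Hypothetical stand-in for run_directlfq: echo the parsed values only.
    click.echo(f"{kwargs['deactivate_normalization']}|{kwargs['filter_dict']}")


@click.command()
@click.option("--filter_dict", "-fd", type=bool, default=False)
def old_demo(**kwargs):
    # Same option declared with the pre-fix bool type.
    click.echo(str(kwargs["filter_dict"]))


runner = CliRunner()

# With the fixed declaration, a YAML path is accepted as a plain string.
ok = runner.invoke(demo, ["-dn", "True", "-fd", "filters.yaml"])

# With type=bool, click cannot convert a file path and reports a usage error.
bad = runner.invoke(old_demo, ["-fd", "filters.yaml"])

print(ok.exit_code, ok.output.strip())
print(bad.exit_code)
```

In this sketch the path reaches the command unchanged once the option is typed as `str`, whereas the `bool` variant is rejected during argument parsing.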
1 change: 1 addition & 0 deletions directlfq/lfq_manager.py
@@ -48,6 +48,7 @@ def run_lfq(input_file, columns_to_add = [], selected_proteins_file :str = None
     input_df = lfqutils.import_data(input_file=input_file, input_type_to_use=input_type_to_use, filter_dict=filter_dict)

     input_df = lfqutils.sort_input_df_by_protein_id(input_df)
+    input_df = lfqutils.remove_potential_quant_id_duplicates(input_df)
     input_df = lfqutils.index_and_log_transform_input_df(input_df)
     input_df = lfqutils.remove_allnan_rows_input_df(input_df)

7 changes: 7 additions & 0 deletions directlfq/normalization.py
@@ -259,10 +259,15 @@ def __init__(self, complete_dataframe, num_samples_quadratic = 100):
         self.normalization_function = None

     def _run_normalization(self):
+        self._check_that_there_are_no_duplicate_rows()
         if len(self.complete_dataframe.index) <= self._num_samples_quadratic:
             self._normalize_complete_input_quadratic()
         else:
             self._normalize_quadratic_and_linear()

+    def _check_that_there_are_no_duplicate_rows(self):
+        if self.complete_dataframe.index.duplicated().any():
+            raise ValueError("There are duplicate rows in the input dataframe. Ensure that there are no duplicate quant_id/ion values.")
+
     def _normalize_complete_input_quadratic(self):
         self.complete_dataframe = self.normalization_function(self.complete_dataframe)
@@ -295,6 +300,8 @@ def _shift_remaining_dataframe_to_reference_sample(self):
         linear_shifted_dataframe = SampleShifterLinear(linear_subset_dataframe, self._merged_reference_sample).ion_dataframe
         self.complete_dataframe.loc[ self._linear_subset_rows, :] = linear_shifted_dataframe

+
+
     @staticmethod
     @njit
     def _get_num_nas_in_row(row):
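
Reviewer note on the new duplicate-row guard: a minimal sketch of the pandas check that `_check_that_there_are_no_duplicate_rows` relies on, assuming `pandas` is installed. The helper `has_duplicate_index` and the toy frames are hypothetical and only illustrate how `Index.duplicated().any()` behaves; they are not taken from the directLFQ code base.

```python
import pandas as pd


def has_duplicate_index(df: pd.DataFrame) -> bool:
    # Index.duplicated() marks every repeat of an earlier label as True,
    # so .any() is True as soon as one label occurs more than once.
    return bool(df.index.duplicated().any())


# Toy ion-by-sample frames; the ion labels form the index.
clean = pd.DataFrame({"sample_1": [1.0, 2.0]}, index=["ion_a", "ion_b"])
with_duplicates = pd.DataFrame(
    {"sample_1": [1.0, 2.0, 3.0]}, index=["ion_a", "ion_a", "ion_b"]
)

print(has_duplicate_index(clean))
print(has_duplicate_index(with_duplicates))
```

A frame like `with_duplicates` is the kind of input that would make the guard in the diff raise its `ValueError` before normalization starts.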
24 changes: 24 additions & 0 deletions directlfq/utils.py
@@ -310,6 +310,30 @@ def index_and_log_transform_input_df(data_df):
 def remove_allnan_rows_input_df(data_df):
     return data_df.dropna(axis = 0, how = 'all')

+def remove_potential_quant_id_duplicates(data_df : pd.DataFrame):
+    """
+    Remove duplicate entries from a DataFrame based on the QUANT_ID column.
+    This function removes duplicate rows from the input DataFrame, keeping only the first
+    occurrence of each unique QUANT_ID. It also logs a warning message if any duplicates
+    are found and removed.
+    Args:
+        data_df (pd.DataFrame): dataframe in directLFQ format
+    Returns:
+        pd.DataFrame: dataframe in directLFQ format w duplicate QUANT_ID entries removed.
+    """
+    before_drop = len(data_df)
+    data_df = data_df.drop_duplicates(subset=config.QUANT_ID, keep='first')
+    after_drop = len(data_df)
+    if before_drop != after_drop:
+        entries_removed = before_drop - after_drop
+        LOGGER.warning(f"Duplicate quant_ids detected. {entries_removed} rows removed from input df.")
+
+    return data_df
+
+
 def sort_input_df_by_protein_id(data_df):
     return data_df.sort_values(by = config.PROTEIN_ID,ignore_index=True)
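
Reviewer note on `remove_potential_quant_id_duplicates`: the sketch below shows the `drop_duplicates(..., keep='first')` behaviour the new function depends on, assuming `pandas` is installed. The column names and values are made up for illustration; in particular `QUANT_ID = "ion"` is an assumption standing in for `config.QUANT_ID`, whose real value is not shown in this diff.

```python
import pandas as pd

# Assumed column name for illustration; the real one comes from config.QUANT_ID.
QUANT_ID = "ion"

input_df = pd.DataFrame(
    {
        "protein": ["P1", "P1", "P1", "P2"],
        QUANT_ID: ["ion_a", "ion_a", "ion_b", "ion_c"],
        "sample_1": [10.0, 99.0, 12.0, 7.0],
    }
)

# keep="first" retains the first row seen for each quant id and drops later repeats.
deduplicated = input_df.drop_duplicates(subset=QUANT_ID, keep="first")
rows_removed = len(input_df) - len(deduplicated)

print(deduplicated)
print(f"{rows_removed} rows removed")
```

Because the first occurrence wins, which intensity values survive depends on row order; in `run_lfq` the frame has already been sorted by protein id when this step runs.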

2 changes: 1 addition & 1 deletion misc/bumpversion.cfg
@@ -1,5 +1,5 @@
 [bumpversion]
-current_version = 0.2.19
+current_version = 0.2.20
 commit = True
 tag = False
 parse = (?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)(\-(?P<release>[a-z]+)(?P<build>\d+))?
1,182 changes: 94 additions & 1,088 deletions nbdev_nbs/02_normalization.ipynb

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion release/one_click_linux_gui/control
@@ -1,5 +1,5 @@
 Package: directlfq
-Version: 0.2.19
+Version: 0.2.20
 Architecture: all
 Maintainer: Mann Labs <opensource@alphapept.com>
 Description: directlfq
2 changes: 1 addition & 1 deletion release/one_click_linux_gui/create_installer_linux.sh
@@ -17,7 +17,7 @@ python setup.py sdist bdist_wheel
 # Setting up the local package
 cd release/one_click_linux_gui
 # Make sure you include the required extra packages and always use the stable or very-stable options!
-pip install "../../dist/directlfq-0.2.19-py3-none-any.whl[stable, gui]"
+pip install "../../dist/directlfq-0.2.20-py3-none-any.whl[stable, gui]"

 # Creating the stand-alone pyinstaller folder
 pip install pyinstaller==4.10
4 changes: 2 additions & 2 deletions release/one_click_macos_gui/Info.plist
@@ -9,9 +9,9 @@
 <key>CFBundleIconFile</key>
 <string>alpha_logo.icns</string>
 <key>CFBundleIdentifier</key>
-<string>directlfq.0.2.19</string>
+<string>directlfq.0.2.20</string>
 <key>CFBundleShortVersionString</key>
-<string>0.2.19</string>
+<string>0.2.20</string>
 <key>CFBundleInfoDictionaryVersion</key>
 <string>6.0</string>
 <key>CFBundleName</key>
4 changes: 2 additions & 2 deletions release/one_click_macos_gui/create_installer_macos.sh
@@ -20,7 +20,7 @@ python setup.py sdist bdist_wheel

 # Setting up the local package
 cd release/one_click_macos_gui
-pip install "../../dist/directlfq-0.2.19-py3-none-any.whl[stable, gui]"
+pip install "../../dist/directlfq-0.2.20-py3-none-any.whl[stable, gui]"

 # Creating the stand-alone pyinstaller folder
 pip install pyinstaller==4.10
@@ -40,5 +40,5 @@ cp ../../LICENSE Resources/LICENSE
 cp ../logos/alpha_logo.png Resources/alpha_logo.png
 chmod 777 scripts/*

-pkgbuild --root dist/directlfq --identifier de.mpg.biochem.directlfq.app --version 0.2.19 --install-location /Applications/directlfq.app --scripts scripts directlfq.pkg
+pkgbuild --root dist/directlfq --identifier de.mpg.biochem.directlfq.app --version 0.2.20 --install-location /Applications/directlfq.app --scripts scripts directlfq.pkg
 productbuild --distribution distribution.xml --resources Resources --package-path directlfq.pkg dist/directlfq_gui_installer_macos.pkg
2 changes: 1 addition & 1 deletion release/one_click_macos_gui/distribution.xml
@@ -1,6 +1,6 @@
 <?xml version="1.0" encoding="utf-8" standalone="no"?>
 <installer-script minSpecVersion="1.000000">
-<title>directlfq 0.2.19</title>
+<title>directlfq 0.2.20</title>
 <background mime-type="image/png" file="alpha_logo.png" scaling="proportional"/>
 <welcome file="welcome.html" mime-type="text/html" />
 <conclusion file="conclusion.html" mime-type="text/html" />
2 changes: 1 addition & 1 deletion release/one_click_windows_gui/create_installer_windows.sh
@@ -17,7 +17,7 @@ python setup.py sdist bdist_wheel
 # Setting up the local package
 cd release/one_click_windows_gui
 # Make sure you include the required extra packages and always use the stable or very-stable options!
-pip install "../../dist/directlfq-0.2.19-py3-none-any.whl[stable, gui]"
+pip install "../../dist/directlfq-0.2.20-py3-none-any.whl[stable, gui]"

 # Creating the stand-alone pyinstaller folder
 pip install pyinstaller==4.10
2 changes: 1 addition & 1 deletion release/one_click_windows_gui/directlfq_innoinstaller.iss
@@ -2,7 +2,7 @@
 ; SEE THE DOCUMENTATION FOR DETAILS ON CREATING INNO SETUP SCRIPT FILES!

 #define MyAppName "directlfq"
-#define MyAppVersion "0.2.19"
+#define MyAppVersion "0.2.20"
 #define MyAppPublisher "Max Planck Institute of Biochemistry and the University of Copenhagen, Mann Labs"
 #define MyAppURL "https://github.com/MannLabs/directlfq"
 #define MyAppExeName "directlfq_gui.exe"
2 changes: 1 addition & 1 deletion settings.ini
@@ -13,7 +13,7 @@ author = Constantin Ammar
 author_email = constantin.ammar@gmail.com
 copyright = fast.ai
 branch = master
-version = 0.2.19
+version = 0.2.20
 min_python = 3.6
 audience = Developers
 language = English
1 change: 1 addition & 0 deletions tests/run_quicktests.sh
@@ -4,5 +4,6 @@ python download_testfiles.py quicktest
 cd quicktests
 jupyter nbconvert --to script run_pipeline_w_different_input_formats.ipynb
 python run_pipeline_w_different_input_formats.py
+directlfq lfq -i ../../test_data/system_tests/quicktests/diann/shortened_input.tsv
 cd ..
 conda deactivate
