* Water raman scans processing and viz
* Debugging the S3 demo data download
* attempting to migrate from circleCI to github actions
* attempting to migrate from circleCI to github actions
* attempting to migrate from circleCI to github actions
* attempting to migrate from circleCI to github actions
* attempting to migrate from circleCI to github actions
* attempting to migrate from circleCI to github actions
* attempting to migrate from circleCI to github actions
* Playing with github actions. Publish to pypi on release.
* integrating pre-commit and black
* getting the GH action linter working
* GH action for docs
* GH action for docs
* Debugging GH action for docs
* Debugging GH action for docs
* Debugging GH action for docs
* increment minor version for new release
* added some tests for new plotting functions.
* debugging codecov GH action.
* debugging codecov GH action.
* debugging codecov GH action.
* debugging codecov GH action.
* Update README
* JOSS paper prep.
* Added the MIT REMORA instrument and fixed minor bugs.
title: 'PyEEM: A Python library for the preprocessing, correction, deconvolution and analysis of Excitation Emission Matrices (EEMs).'
2
+
title: 'PyEEM: A Python library for the preprocessing, correction, and analysis of Excitation Emission Matrices (EEMs).'
3
3
tags:
4
4
- python
5
5
- fluorescence
@@ -9,10 +9,12 @@ tags:
9
9
authors:
10
10
- name: Drew Meyers
11
11
affiliation: "1, 2"
12
+
- name: Jay W Rutherford
13
+
affiliation: 3
12
14
- name: Qinmin Zheng
13
15
affiliation: 2
14
16
- name: Fabio Duarte
15
-
affiliation: "2, 3"
17
+
affiliation: "2, 4"
16
18
- name: Carlo Ratti
17
19
affiliation: 2
18
20
- name: Harold H Hemond
@@ -24,42 +26,24 @@ affiliations:
24
26
index: 1
25
27
- name: Senseable City Lab, Massachusetts Institute of Technology
26
28
index: 2
27
-
- name: Pontifícia Universidade Católica do Paraná, Brazil
29
+
- name: Department of Chemical Engineering, University of Washington
28
30
index: 3
31
+
- name: Pontifícia Universidade Católica do Paraná, Brazil
32
+
index: 4
29
33
date: 2020-07-08
30
34
bibliography: paper.bib
31
35
---
32
36
33
-
# Statement of Need
34
-
35
-
Fluorescence Excitation and Emission Matrix Spectroscopy (EEMs) is a popular analytical technique in environmental monitoring. In particular, it has been applied extensively to investigate the composition and concentration of dissolved organic matter (DOM) in aquatic systems [@Coble1990;@McKnight2001;@Fellman2010]. Historically, EEMs have been combined with multi-way techniques such as PCA, ICA, and PARAFAC in order to decompose chemical mixtures [@Bro1997;@Stedmon2008;@Murphy2013;@CostaPereira2018]. More recently, machine learning approaches such as convolutional neural networks (CNNs) and autoencoders have been applied to EEMs for source sepearation of chemical mixtures [@Cuss2016;@Peleato2018;@Ju2019;@Rutherford2020]. However, before these source separation techniques can be performed, several preprocessing and correction steps must be applied to the raw EEMs. In order to achieve comparability between studies, standard methods to apply these corrections have been developed [@Ohno2002;@Bahram2006;@Lawaetz2009;@R.Murphy2010;@Murphy2011;@Kothawala2013]. These standard methods have been implemented in Matlab and R packages [@Murphy2013;@Massicotte;Pucher2019]. However until PyEEM, no Python package existed which implemented these standard correction steps. Furthermore, the Matlab and R implementations impose metadata schemas on users which limit their ability to track several important metrics corresponding with each measurement set. By providing a Python implementation, researchers will now be able to more effectively leverage Python's large scienfitic computing ecosystem when working with EEMs.
36
-
37
-
In addition to the implementation of the preprocessing and correction steps, PyEEM also provides researchers with the ability to create augmented mixture and single source training data from a small set of calibration EEM measurements. The augmentation technique relies on the fact that fluorescnce spectra are linearly additive in mixtures, according to Beer's law [source]. This augmentation technique was first described in Rutherford et al., in which it was used to train a CNN to predict the concentration of single sources of pollutants in spectral mixtures [@Rutherford2020]. Additionally, augmented and synthetic data has shown promise in improving the performace of deep learning models in several fields [@Nikolenko2019].
38
-
39
-
PyEEM provides the first open source implementation of such an augmentation technique for EEMs. PyEEM also provides plots toolbox useful in the interpretation of EEMs... [@Hansen2018]
40
-
41
37
# Summary
42
38
43
-
- A summary describing the high-level functionality and purpose of the software for a diverse, non-specialist audience...
44
-
- Description of how the software enables some new research challenges to be addressed or makes addressing research challenges significantly better (e.g., faster, easier, simpler)...
45
-
- Description of how the software is feature-complete (i.e. no half-baked solutions) and designed for maintainable extension (not one-off modifications of existing tools)...
39
+
Fluorescence Excitation and Emission Matrix Spectroscopy (EEMs) is a popular analytical technique in environmental monitoring. In particular, it has been applied extensively to investigate the composition and concentration of dissolved organic matter (DOM) in aquatic systems [@Coble1990;@McKnight2001;@Fellman2010]. Historically, EEMs have been combined with multi-way techniques such as PCA, ICA, and PARAFAC in order to decompose chemical mixtures [@Bro1997;@Stedmon2008;@Murphy2013;@CostaPereira2018]. More recently, deep learning approaches such as convolutional neural networks (CNNs) and autoencoders have been applied to EEMs for source separation of chemical mixtures [@Cuss2016;@Peleato2018;@Ju2019;@Rutherford2020]. However, before these source separation techniques can be performed, several preprocessing and correction steps must be applied to the raw EEMs. To achieve comparability between studies, standard methods for applying these corrections have been developed [@Ohno2002;@Bahram2006;@Lawaetz2009;@R.Murphy2010;@Murphy2011;@Kothawala2013]. PyEEM provides a Python implementation of these standard preprocessing and correction steps for EEM measurements produced by several common spectrofluorometers.
46
40
47
-
PyEEM is a python library for the preprocessing, correction, deconvolution and analysis of Excitation Emission Matrices (EEMs)...
41
+
In addition to the implementation of the standard preprocessing and correction steps, PyEEM also provides researchers with the ability to create augmented single source and mixture training data from a small set of calibration EEM measurements. The augmentation technique relies on the fact that fluorescence spectra are linearly additive in mixtures, according to Beer's law. This augmentation technique was first described in Rutherford et al., in which it was used to train a CNN to predict the concentration of single sources of pollutants in spectral mixtures [@Rutherford2020]. Additionally, augmented and synthetic data has shown promise in improving the performance of deep learning models in several fields [@Nikolenko2019].
Finally, PyEEM provides an extensive visualization toolbox, based on Matplotlib, which is useful in the interpretation of EEM datasets. This visualization toolbox includes various ways of plotting EEMs, the visualization of the Raman scatter peak area over time, and more.
60
44
61
-
# Acknowledgements
45
+
# Statement of Need
62
46
63
-
We acknowledge contributions from...
47
+
Prior to PyEEM, no open source Python package existed for working with EEMs, although such libraries have existed for MATLAB and R for some time [@Murphy2013;@Massicotte;@Pucher2019]. By providing a Python implementation, PyEEM allows researchers to more effectively leverage Python's large scientific computing ecosystem when working with EEMs. Furthermore, the existing MATLAB and R libraries do not provide deep learning techniques for decomposing chemical mixtures from EEMs; instead, they provide PARAFAC methods for this task. Although PARAFAC has been widely used for some time, it has its limitations, and recent work has shown promise in using deep learning approaches. For this reason, PyEEM provides a toolbox for generating augmented training data as well as an implementation of the CNN architecture reported in Rutherford et al., which has been shown to successfully decompose spectral mixtures [@Rutherford2020].
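
The augmentation idea described in the Summary (fluorescence spectra adding linearly in mixtures) can be illustrated with a minimal NumPy sketch. This is not PyEEM's API: the function name `augment_mixture`, its parameters, and the toy arrays are hypothetical, and the sketch assumes dilute samples in which fluorescence intensity scales linearly with concentration.

```python
import numpy as np


def augment_mixture(cal_eems, cal_concs, target_concs):
    """Build one synthetic mixture EEM from single-source calibration EEMs.

    Hypothetical helper for illustration only (not part of PyEEM).

    cal_eems     : sequence of 2-D arrays, one calibration EEM per source,
                   all sharing the same emission x excitation grid
    cal_concs    : concentration at which each calibration EEM was measured
    target_concs : desired concentration of each source in the mixture
    """
    cal_eems = np.asarray(cal_eems, dtype=float)  # (n_sources, n_em, n_ex)
    cal_concs = np.asarray(cal_concs, dtype=float)
    target_concs = np.asarray(target_concs, dtype=float)

    # Assumption: intensity is proportional to concentration, so each
    # calibration EEM is rescaled by target / calibration concentration.
    scale = target_concs / cal_concs

    # Assumption: spectra are additive, so the mixture is the weighted sum
    # of the rescaled single-source EEMs over the source axis.
    return np.tensordot(scale, cal_eems, axes=1)  # (n_em, n_ex)


# Toy example with two flat "spectra" on a 3 x 4 grid.
source_a = np.full((3, 4), 2.0)   # measured at concentration 1.0
source_b = np.full((3, 4), 10.0)  # measured at concentration 5.0

mixture = augment_mixture(
    [source_a, source_b], cal_concs=[1.0, 5.0], target_concs=[0.5, 2.5]
)
```

Sampling many `target_concs` vectors in this way yields a labeled training set (mixture EEM paired with source concentrations) from only a handful of calibration measurements.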