Updates and fixes #709

jorainer · 2023-12-20T08:59:39Z

Improve the performance of the chromatogram call for XcmsExperiment objects.
Remove internal (not exported) normalization functions. These have been transferred to the MetaboCoreUtils package (see rformassspectrometry/MetaboCoreUtils#78, "Add linear model-based normalization functions").
Support subsetting of XcmsExperiment with negative indices (issue #707, "negative subsetting removes chromPeaks").

- Enable support for subsetting `XcmsExperiment` objects with negative indices (issue #707).

- Remove the not-exported normalization functions. These have been moved to the MetaboCoreUtils package.

- Improve the performance to extract EICs (along with chromatographic peaks) from an `XcmsExperiment` object.
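
For readers unfamiliar with the negative-index problem, here is a minimal sketch of one way such a subset could be resolved. It is not the xcms implementation: `resolve_index` and the toy `sample` vector are hypothetical and only illustrate turning a negative index into the samples to keep and renumbering the peaks' sample assignment.

```r
# Hypothetical helper: turn any valid R index (positive or negative)
# into the positive indices of the samples that are kept.
resolve_index <- function(i, n) seq_len(n)[i]

n_samples <- 3L
keep <- resolve_index(-2, n_samples)     # drop the second sample

# Toy per-peak sample assignment (assumption, not real data).
sample <- c(1L, 2L, 3L, 3L)

# Keep only peaks of retained samples and renumber their sample index.
peak_rows <- which(sample %in% keep)
new_sample <- match(sample[peak_rows], keep)
```

With this toy input the peaks of samples 1 and 3 are retained and sample 3 becomes sample 2 in the subset.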

jorainer · 2023-12-20T09:02:14Z

R/XcmsExperiment-functions.R

+            object, rt = fd[i, rtc], mz = fd[i, mzc],
+            msLevel = chrs[i, 1]@msLevel, type = chromPeaks)
+        f_s <- factor(.chromPeaks(object)[idx, "sample"], levels = js)
+        pkl <- split.data.frame(.chromPeaks(object)[idx, , drop = FALSE], f_s)


Splitting the chrom peaks matrix and the chrom peak data data.frame considerably improved performance. Also, we try to use data.frame instead of DataFrame for chromPeakData as much as possible: subsetting the former is much faster.
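
A rough illustration of the splitting step, using made-up data rather than the xcms internals (`pks`, `pkd`, `js` and `idx` below are toy stand-ins): a factor with fixed levels yields one list element per requested sample, even for samples without peaks, and the same factor splits both the matrix and the data.frame.

```r
# Toy chrom peaks matrix: one row per peak, "sample" holds the sample index.
pks <- cbind(mz = c(100.1, 100.2, 250.5, 250.6),
             rt = c(10, 12, 30, 31),
             sample = c(1, 3, 3, 1))
# Toy peak annotation, row-aligned with pks.
pkd <- data.frame(ms_level = c(1L, 1L, 1L, 1L),
                  is_filled = c(FALSE, FALSE, TRUE, FALSE))

js <- 1:3                # samples for which we want one list element each
idx <- c(1L, 2L, 3L)     # rows that matched some filter (assumed here)

# Fixed levels guarantee an element for sample 2 although it has no peaks.
f_s <- factor(pks[idx, "sample"], levels = js)
pkl <- split.data.frame(pks[idx, , drop = FALSE], f_s)
pkdl <- split.data.frame(pkd[idx, , drop = FALSE], f_s)

# Number of peaks per sample after splitting.
n_per_sample <- vapply(pkl, nrow, integer(1))
```

Splitting once up front avoids repeatedly subsetting the full matrix and data.frame for every sample inside a loop.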

Can I ask why the DataFrame was used initially? Was it easier at the beginning? I have not really ended up using DataFrame so far, so I'm a bit curious about when someone would prefer it over data.frame.

Good question. There were 2 reasons: firstly, the DataFrame was used in e.g. MSnbase and other Bioconductor packages (e.g. SummarizedExperiment) and I wanted to be consistent with that; secondly, the DataFrame also allows storing S4 objects in columns (which the data.frame officially does not). Not that we planned to store S4 objects in DataFrames, so it was mostly reason 1 why we used DataFrame.

Only (quite some time) later did I figure out that subsetting or extracting data from a DataFrame is considerably slower than from a data.frame. Thus, recently, I prefer to use a data.frame internally whenever possible.

jorainer · 2023-12-20T09:03:28Z

R/XcmsExperiment-functions.R

    for (i in seq_len(nrow(chrs))) {
-        pks <- chromPeaks(object, rt = fd[i, rtc], mz = fd[i, mzc],
-                          msLevel = chrs[i, 1]@msLevel, type = chromPeaks)
+        idx <- .index_chrom_peaks(


Instead of getting the filtered chromPeaks matrix, we simply get the indices of the rows matching the filter. That also allows subsetting the chromPeakData more efficiently.
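
The idea can be sketched as follows. This is an illustration under assumptions, not the actual `.index_chrom_peaks` code: `index_peaks` is a hypothetical helper, the data is made up, and "any overlap" with the requested range is an assumed matching rule.

```r
# Hypothetical helper: indices of peaks overlapping an rt and an m/z range.
# A length-0 range means "no filter on that dimension".
index_peaks <- function(pks, rt = numeric(), mz = numeric()) {
    keep <- rep(TRUE, nrow(pks))
    if (length(rt) == 2)
        keep <- keep & pks[, "rtmax"] >= rt[1] & pks[, "rtmin"] <= rt[2]
    if (length(mz) == 2)
        keep <- keep & pks[, "mzmax"] >= mz[1] & pks[, "mzmin"] <= mz[2]
    which(keep)
}

# Toy peaks matrix and row-aligned annotation data.frame.
pks <- cbind(mzmin = c(100.0, 200.0, 100.05),
             mzmax = c(100.1, 200.1, 100.15),
             rtmin = c(10, 10, 50),
             rtmax = c(20, 20, 60))
pkd <- data.frame(ms_level = c(1L, 1L, 1L))

idx <- index_peaks(pks, rt = c(5, 25), mz = c(100, 100.2))
# The same index vector subsets both row-aligned objects.
pks_sub <- pks[idx, , drop = FALSE]
pkd_sub <- pkd[idx, , drop = FALSE]
```

Computing the index once means the filter is evaluated a single time, and the annotation data.frame never has to be matched back to the filtered matrix.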

jorainer · 2023-12-20T09:03:55Z

R/XcmsExperiment-functions.R

+#'     requested filtering.
+#'
+#' @noRd
+.index_chrom_peaks <- function(object, rt = numeric(),


This function is then also re-used in chromPeaks,XcmsExperiment and chromPeaks,XCMSnExp.

jorainer · 2023-12-20T09:04:31Z

R/XcmsExperiment-functions.R

+#' a `XcmsExperiment` object.
+#'
+#' @noRd
+.chromPeakData <- function(object, msLevel = integer()) {


Small helper function to avoid unnecessary conversion of data.frame to DataFrame for internal calls.

jorainer · 2023-12-20T09:06:38Z

R/XcmsExperiment-functions.R

+#' filtering from either a `XcmsExperiment` or `XCMSnExp` object
+#'
+#' @noRd
+.chromPeaks <- function(object) {


That function was required because we would otherwise end up in an (infinitely) recursive call with the chromPeaks method: chromPeaks uses .index_chrom_peaks, which in turn would call chromPeaks (which again calls .index_chrom_peaks, ...). To extract the full chrom peaks matrix we can use .chromPeaks; to get the optionally filtered chrom peaks we use chromPeaks (that's what the user will use).
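
The layering described here can be sketched with toy functions. All names below (`raw_peaks`, `index_peaks`, `filtered_peaks`) and the list-based object are hypothetical; the point is only that the index helper depends on the raw accessor and never on the filtering accessor, so no cycle can form.

```r
# Toy result object: a list holding the full peaks matrix (assumption).
obj <- list(peaks = cbind(rt = c(10, 40, 70), sample = c(1, 1, 2)))

# Raw accessor: returns the full matrix, no filtering, calls nothing else.
raw_peaks <- function(object) object$peaks

# Index helper: builds on the raw accessor only.
index_peaks <- function(object, rt = numeric()) {
    pks <- raw_peaks(object)
    if (length(rt) == 2)
        which(pks[, "rt"] >= rt[1] & pks[, "rt"] <= rt[2])
    else
        seq_len(nrow(pks))
}

# User-facing accessor: filtering goes through the index helper.
filtered_peaks <- function(object, rt = numeric())
    raw_peaks(object)[index_peaks(object, rt), , drop = FALSE]

res <- filtered_peaks(obj, rt = c(30, 80))
```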

jorainer · 2023-12-20T09:10:28Z

R/XcmsExperiment-functions.R

-#' but there could be room for improvement.
+#' from an XcmsExperiment.
+#'
+#' @param object `XcmsExperiment` or `XCMSnExp` object.
 #'
 #' @noRd
 .xmse_extract_chromatograms_old <- function(object, rt, mz, aggregationFun,


This function was modified to improve the performance of the chromatogram,XcmsExperiment call.

jorainer · 2023-12-20T09:12:54Z

Note: unit tests fail on Ubuntu (and the Bioconductor docker image) because of internal changes in R devel. Because of these changes, Rcpp, rlang and related packages fail to install (and hence also mzR, xcms etc.).

jorainer · 2024-01-11T13:00:55Z

@sneumann @philouail, just a friendly reminder to have a look at this PR (if you find the time) :)

sneumann · 2024-01-11T16:34:22Z

Hi, any need to bump dependency versions if code was moved to MetaboCoreUtils? Otherwise looks good to me. Yours, Steffen

jorainer · 2024-01-12T09:49:00Z

Nope, no need for any version bump: the internal code that was refactored and moved to MetaboCoreUtils was never used within xcms (in any exported function).

sneumann

So if no changes / version bumps in DESCRIPTION are needed, this looks good to me. Yours, Steffen

philouail · 2024-01-12T15:13:02Z

R/XcmsExperiment.R

-        pks_empty <- chromPeaks(object)[integer(), ]
-        pkd_empty <- chromPeakData(object)[integer(), ]
+        pks_empty <- .chromPeaks(object)[integer(), ]
+        pkd_empty <- as(.chromPeakData(object)[integer(), ], "DataFrame")


What is happening here?

I created new internal functions, .chromPeaks and .chromPeakData, to extract the content of the chromPeaks (or chromPeakData) slot from an xcms result object. We now have two xcms result objects, XCMSnExp and XcmsExperiment. The former stores the data in a rather clumsy way, so extracting it is a little trickier. The latter is the new, preferred result object. Both result objects also have chromPeaks and chromPeakData methods, but these perform quite a few additional checks and allow subsetting/filtering the content. The new internal functions (not meant to be called by the user) are simple helpers that extract the full chromPeaks and chromPeakData content without any filtering.

I could also have created a method for that, but I decided against that and just added a function with an if/else inside.

Then, for chromPeakData I also have to ensure that the result is returned as a DataFrame, so I additionally call as(..., "DataFrame"). For XCMSnExp, .chromPeakData already returns a DataFrame, so nothing happens, while for XcmsExperiment the data is stored as a data.frame, so it needs to be converted first.

Hope this explained it?
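
The "plain function with an if/else instead of a method" choice might look roughly like this. It is a sketch with toy classes: `LegacyResult`, `CurrentResult`, their slots and `peak_data` are made up for illustration and do not mirror the real xcms class definitions.

```r
library(methods)

# Toy stand-ins for the two result classes (names and slots are made up).
setClass("LegacyResult", representation(store = "environment"))
setClass("CurrentResult", representation(peakData = "data.frame"))

# One plain function with an if/else instead of a generic with two methods.
peak_data <- function(object) {
    if (is(object, "CurrentResult"))
        object@peakData
    else
        get("peakData", envir = object@store)
}

# Build one toy object of each kind.
e <- new.env()
assign("peakData", data.frame(ms_level = c(1L, 2L)), envir = e)
old <- new("LegacyResult", store = e)
cur <- new("CurrentResult", peakData = data.frame(ms_level = 1L))

n_old <- nrow(peak_data(old))
n_cur <- nrow(peak_data(cur))
```

A single internal function keeps the class-specific storage details in one place, while callers that need a particular return type convert the result at the boundary.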

philouail

All good for me, I just have 2 questions in the comments! Also, thanks for your comments; the explanations helped me understand the context of everything.

jorainer added 3 commits December 14, 2023 14:32

fix: support subsetting XcmsExperiment with negative indices

2470b62

- Enable support for subsetting `XcmsExperiment` objects with negative indices (issue #707).

refactor: remove (non exported) normalization functions

6d77f7e

- Remove the not-exported normalization functions. These have been moved to the MetaboCoreUtils package.

refactor: improve performance of chromatogram,XcmsExperiment

548c248

- Improve the performance to extract EICs (along with chromatographic peaks) from an `XcmsExperiment` object.


jorainer marked this pull request as ready for review December 20, 2023 09:46

jorainer requested review from sneumann and philouail December 20, 2023 09:46

sneumann approved these changes Jan 12, 2024


philouail reviewed Jan 12, 2024


philouail approved these changes Jan 12, 2024


sneumann merged commit 548c248 into devel Jan 15, 2024
2 of 3 checks passed
