feat: add queryVariables and targetVariables functions
jorainer committed Dec 15, 2023
1 parent 390a18f commit b3c91f2
Showing 10 changed files with 220 additions and 36 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,6 +1,6 @@
Package: MetaboAnnotation
Title: Utilities for Annotation of Metabolomics Data
-Version: 1.7.2
+Version: 1.7.3
Description:
High level functions to assist in annotation of (metabolomics) data sets.
These include functions to perform simple tentative annotations based on
3 changes: 3 additions & 0 deletions NAMESPACE
@@ -14,6 +14,7 @@ export(MzParam)
export(MzRtParam)
export(ScoreThresholdParam)
export(SelectMatchesParam)
export(SingleMatchParam)
export(TopRankedMatchesParam)
export(ValueParam)
export(createStandardMixes)
@@ -49,9 +50,11 @@ exportMethods(matchedData)
exportMethods(metadata)
exportMethods(plotSpectraMirror)
exportMethods(query)
exportMethods(queryVariables)
exportMethods(setBackend)
exportMethods(show)
exportMethods(spectraVariables)
exportMethods(targetVariables)
importClassesFrom(CompoundDb,CompDb)
importClassesFrom(ProtGenerics,Param)
importClassesFrom(QFeatures,QFeatures)
5 changes: 5 additions & 0 deletions NEWS.md
@@ -1,5 +1,10 @@
# MetaboAnnotation 1.7

## Changes in 1.7.3

- Add new methods `queryVariables` and `targetVariables` to extract the names
of variables (columns) of *query* and *target*.

## Changes in 1.7.2

- Update the `Spectra` objects within the package to the new versions.
12 changes: 12 additions & 0 deletions R/AllGenerics.R
@@ -50,3 +50,15 @@ setGeneric("matchedData", function(object, ...)
#' @export
setGeneric("matchSpectra", function(query, target, param, ...)
    standardGeneric("matchSpectra"))

#' @rdname Matched
#'
#' @exportMethod queryVariables
setGeneric("queryVariables", function(object, ...)
    standardGeneric("queryVariables"))

#' @rdname Matched
#'
#' @exportMethod targetVariables
setGeneric("targetVariables", function(object, ...)
    standardGeneric("targetVariables"))
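
The generics above follow the usual S4 pattern: `setGeneric` declares the function and `setMethod` later supplies a class-specific implementation. The following is a minimal, self-contained sketch of that pattern. The class `ToyMatched` and the generic `toyQueryVariables` are hypothetical names used only for illustration and are not part of MetaboAnnotation.

```r
library(methods)

# Hypothetical toy class holding a query table; NOT the package's `Matched`.
setClass("ToyMatched", slots = c(query = "data.frame"))

# Declare the generic with the same (object, ...) signature used above.
setGeneric("toyQueryVariables", function(object, ...)
    standardGeneric("toyQueryVariables"))

# Supply the implementation for the toy class.
setMethod("toyQueryVariables", "ToyMatched", function(object, ...) {
    colnames(object@query)
})

toy <- new("ToyMatched",
           query = data.frame(mz = c(100.1, 200.2), rt = c(12, 48)))
toyQueryVariables(toy)
```

Declaring the generic in a separate file from the methods, as this commit does, keeps one definition that several classes (here `Matched` and `MatchedSpectra`) can implement.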
82 changes: 71 additions & 11 deletions R/Matched.R
@@ -60,6 +60,33 @@
#' - `filterMatches`: filter matches in a `Matched` object using different
#' approaches depending on the class of `param`:
#'
#' - `SingleMatchParam`: reduces matches to keep (at most) a single match
#' per query. The deduplication strategy is defined with parameter
#' `duplicates`:
#'   - `duplicates = "remove"`: all matches for query elements matching more
#'     than one target element are removed.
#'   - `duplicates = "closest"`: keep only the *closest* match for each
#'     query element. The closest match is defined by the value(s) of
#'     *score* (and of *score_rt*, if present). The match with the smallest
#'     value for this (these) column(s) is retained. This is equivalent to
#'     `TopRankedMatchesParam(n = 1L, decreasing = FALSE)`.
#'   - `duplicates = "top_ranked"`: select the best ranking match for each
#'     query element. Parameter `column` specifies the column by which
#'     matches are ranked (use LLLLLL to list possible columns). Parameter
#'     `decreasing` defines whether the match with the highest
#'     (`decreasing = TRUE`) or lowest (`decreasing = FALSE`) value is
#'     selected.
#' - `ScoreThresholdParam`: keeps only the matches whose score is strictly
#' above or strictly below a certain threshold (respectively when parameter
#' `above = TRUE` and `above = FALSE`). The name of the column containing
#' the scores to be used for the filtering can be specified with parameter
#' `column`. The default for `column` is `"score"`. This variable is present
#' in each `Matched` object. The name of another score variable (if present)
#' can be provided instead (the names of all score variables can be obtained
#' with the `scoreVariables()` function). For example `column = "score_rt"`
#' can be used to filter matches based on retention time scores for `Matched`
#' objects returned by [matchValues()] when `param` objects involving a
#' retention time comparison are used.
#' - `SelectMatchesParam`: keeps or removes (respectively when parameter
#' `keep = TRUE` and `keep = FALSE`) matches corresponding to certain
#' indices or values of `query` and `target`. If `queryValue` and
@@ -86,17 +113,6 @@
#' is performed on the absolute value of `"score_rt"`). Thus, matches with
#' small (or, depending on parameter `decreasing`, large) values for
#' `"score"` **and** `"score_rt"` are returned.
-#' - `ScoreThresholdParam`: keeps only the matches whose score is strictly
-#' above or strictly below a certain threshold (respectively when parameter
-#' `above = TRUE` and `above = FALSE`). The name of the column containing
-#' the scores to be used for the filtering can be specified with parameter
-#' `column`. The default for `column` is `"score"`. Such variable is present
-#' in each `Matched` object. The name of other score variables (if present)
-#' can be provided (the names of all score variables can be obtained with
-#' `scoreVariables()` function). For example `column = "score_rt"` can be
-#' used to filter matches based on retention time scores for `Matched`
-#' objects returned by [matchValues()] when `param` objects involving a
-#' retention time comparison are used.
#'
#' - `lapply`: applies a user defined function `FUN` to each subset of
#' matches in a `Matched` object for each `query` element (i.e. to each `x[i]`
@@ -165,10 +181,15 @@
#' are aligned, i.e. each element in them represent a matched query-target
#' pair.
#'
#' - `queryVariables` returns the names of the variables (columns) in *query*.
#'
#' - `scoreVariables` returns the names of the score variables stored in the
#' `Matched` object (precisely the names of the variables in `matches(object)`
#' containing the string "score" in their name ignoring the case).
#'
#' - `targetVariables` returns the names of the variables (columns) in *target*
#' (prefixed with `"target_"`).
#'
#' - `whichTarget` returns an `integer` with the indices of the elements in
#' *target* that match at least one element in *query*.
#'
@@ -757,6 +778,22 @@ scoreVariables <- function(object) {
matchescols[grep("score", matchescols, ignore.case = TRUE)]
}

#' @rdname Matched
setMethod("queryVariables", "Matched", function(object) {
    query <- .objectToMatch(object@query, object@queryAssay)
    cnq <- character()
    if (length(dim(query)) == 2)
        cnq <- colnames(query)
    if (is.null(dim(query)))
        cnq <- "query"
    cnq
})

#' @rdname Matched
setMethod("targetVariables", "Matched", function(object) {
    .cnt(.objectToMatch(object@target, object@targetAssay))
})
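
The `queryVariables` method above distinguishes two-dimensional inputs (column names are returned) from plain vectors (the single name `"query"` is returned). The standalone base R sketch below mirrors that logic. The helper names `sketch_query_variables` and `sketch_target_variables` are hypothetical, and the `"target"` fallback for vectors is an assumption, because the internal helpers `.objectToMatch` and `.cnt` are not shown in this commit. Only the `"target_"` prefix is taken from the documentation.

```r
# Mirrors the dimension checks of the `queryVariables` method for `Matched`.
sketch_query_variables <- function(query) {
    vars <- character()
    if (length(dim(query)) == 2)
        vars <- colnames(query)
    if (is.null(dim(query)))
        vars <- "query"
    vars
}

# Assumed behaviour of the target counterpart: column names get the
# "target_" prefix; the vector fallback "target" is a guess, not confirmed.
sketch_target_variables <- function(target) {
    if (length(dim(target)) == 2)
        return(paste0("target_", colnames(target)))
    "target"
}

query_df <- data.frame(mz = c(100.1, 200.2), rt = c(12, 48))
target_df <- data.frame(name = c("A", "B"), mz = c(100.1, 300.3))

sketch_query_variables(query_df)         # column names of the table
sketch_query_variables(c(100.1, 200.2))  # plain vector: single name
sketch_target_variables(target_df)       # prefixed column names
```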

#' @importMethodsFrom S4Vectors cbind
#'
#' @importFrom S4Vectors DataFrame
@@ -1226,6 +1263,29 @@ setMethod("filterMatches", c("Matched", "ScoreThresholdParam"),
object
})
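
The strict threshold filtering documented for `ScoreThresholdParam` can be illustrated on a plain data frame of matches. This is a base R sketch under stated assumptions, not the package's implementation: the function `sketch_threshold_filter`, its defaults and the example data are hypothetical.

```r
# Hypothetical matches table with the columns a `Matched` object expects.
matches <- data.frame(
    query_idx  = c(1L, 1L, 2L, 3L, 3L),
    target_idx = c(4L, 7L, 2L, 5L, 9L),
    score      = c(0.3, 0.1, 0.2, 0.8, 0.5)
)

# Keep rows whose score is strictly above (above = TRUE) or strictly
# below (above = FALSE) the threshold; rows equal to it are dropped.
sketch_threshold_filter <- function(m, threshold, above = FALSE,
                                    column = "score") {
    keep <- if (above) m[[column]] > threshold else m[[column]] < threshold
    m[keep, , drop = FALSE]
}

below <- sketch_threshold_filter(matches, threshold = 0.3)
above <- sketch_threshold_filter(matches, threshold = 0.3, above = TRUE)
```

Because both comparisons are strict, the match with a score of exactly 0.3 appears in neither result.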

#' @noRd
setClass("SingleMatchParam",
         slots = c(
             duplicates = "character",
             column = "character",
             decreasing = "logical"),
         contains = "Param",
         prototype = prototype(
             duplicates = "remove",
             column = "score",
             decreasing = TRUE)
         )

#' @rdname Matched
#'
#' @export
SingleMatchParam <- function(duplicates = c("remove", "closest", "top_ranked"),
                             column = "score", decreasing = TRUE) {
    duplicates <- force(match.arg(duplicates))
    new("SingleMatchParam", duplicates = duplicates, column = column[1L],
        decreasing = decreasing[1L])
}
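
The three `duplicates` strategies described in the documentation can be sketched on a plain data frame of matches. This is a simplified base R illustration, not the package's `filterMatches` code: the helper names and example data are hypothetical, scores are assumed to be non-negative, and the additional `score_rt` handling of `"closest"` is left out.

```r
# Hypothetical matches: query 1 and query 3 each match two targets.
matches <- data.frame(
    query_idx  = c(1L, 1L, 2L, 3L, 3L),
    target_idx = c(4L, 7L, 2L, 5L, 9L),
    score      = c(0.3, 0.1, 0.2, 0.8, 0.5)
)

# "remove": drop all matches of query elements with more than one match.
sketch_remove <- function(m) {
    multi <- unique(m$query_idx[duplicated(m$query_idx)])
    m[!m$query_idx %in% multi, , drop = FALSE]
}

# "closest" / "top_ranked": keep one match per query, ranked by `column`.
sketch_keep_best <- function(m, column = "score", decreasing = TRUE) {
    key <- if (decreasing) -m[[column]] else m[[column]]
    m <- m[order(m$query_idx, key), , drop = FALSE]
    m[!duplicated(m$query_idx), , drop = FALSE]
}

removed    <- sketch_remove(matches)
closest    <- sketch_keep_best(matches, decreasing = FALSE)  # smallest score
top_ranked <- sketch_keep_best(matches, decreasing = TRUE)   # largest score
```

With `"remove"` only query 2 survives, while the other two strategies keep one row for each of the three query elements and differ only in which target they pick.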

#' @importFrom MsCoreUtils rbindFill
.addMatches <- function(query, target, matches, queryValue = integer(),
targetValue = integer(), queryColname = character(),
37 changes: 32 additions & 5 deletions R/MatchedSpectra.R
@@ -13,18 +13,30 @@
#' returned for each *query* spectrum with possibly duplicated entries (values)
#' if the query spectrum matches more than one target spectrum.
#'
-#' @section Creation and subsetting:
+#' @section Creation, subset and filtering:
#'
-#' `MatchedSpectra` objects can be created with the `MatchedSpectra` function
-#' providing the `query` and `target` `Spectra` as well as a `data.frame` with
-#' the
+#' `MatchedSpectra` objects are the result objects returned by [matchSpectra()].
+#' While generally not needed, `MatchedSpectra` objects can also be created
+#' with the `MatchedSpectra` function providing the `query` and `target`
+#' `Spectra` objects as well as a `data.frame` with the *matches* between
#' query and target elements. This data frame is expected to have columns
#' `"query_idx"`, `"target_idx"` with the `integer` indices of query and
#' target objects that are *matched* and a column `"score"` with a `numeric`
#' score for the match.
#'
#' `MatchedSpectra` objects can be subset using:
#'
#' - `[` subset the `MatchedSpectra` selecting `query` spectra to keep with
#' parameter `i`. The `target` spectra will by default be returned as-is.
#'
#' - `pruneTarget` *cleans* the `MatchedSpectra` object by removing non-matched
#' target spectra.
#'
#' In addition, `MatchedSpectra` can be filtered with any of the filtering
#' approaches defined for [Matched()] objects: [SelectMatchesParam()],
#' [TopRankedMatchesParam()] or [ScoreThresholdParam()].
#'
#'
#' @section Extracting data:
#'
#' - `$` extracts a single spectra variable from the `MatchedSpectra` `x`. Use
@@ -42,6 +54,10 @@
#'
#' - `matchedData` same as `spectraData` below.
#'
#' - `query` returns the *query* `Spectra`.
#'
#' - `queryVariables` returns the `spectraVariables` of *query*.
#'
#' - `spectraData` returns spectra variables from the query and/or target
#' `Spectra` as a `DataFrame`. Parameter `columns` allows to define which
#' variables should be returned (defaults to
@@ -62,7 +78,8 @@
#'
#' - `target` returns the *target* `Spectra`.
#'
-#' - `query` returns the *query* `Spectra`.
+#' - `targetVariables` returns the `spectraVariables` of *target* (prefixed
+#' with `"target_"`).
#'
#' - `whichTarget` returns an `integer` with the indices of the spectra in
#' *target* that match at least one spectrum in *query*.
@@ -310,6 +327,16 @@ setMethod("spectraVariables", "MatchedSpectra", function(object) {
c(svq, paste0("target_", svt), cns[!cns %in% c("query_idx", "target_idx")])
})

#' @rdname MatchedSpectra
setMethod("queryVariables", "MatchedSpectra", function(object) {
    spectraVariables(query(object))
})

#' @rdname MatchedSpectra
setMethod("targetVariables", "MatchedSpectra", function(object) {
    paste0("target_", spectraVariables(target(object)))
})
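
The relation between the two new methods and the existing `spectraVariables` method for `MatchedSpectra` can be shown with plain character vectors. In this sketch the variable names in `svq`, `svt` and `cns` are made-up example values standing in for the spectra variables of query and target and the columns of the matches table. Only the way they are combined is taken from the code in this diff.

```r
# Hypothetical variable names; real objects would supply these.
svq <- c("msLevel", "rtime", "precursorMz")  # spectra variables of query
svt <- c("msLevel", "rtime", "name")         # spectra variables of target
cns <- c("query_idx", "target_idx", "score") # columns of the matches table

# What queryVariables() and targetVariables() return for these inputs.
query_vars  <- svq
target_vars <- paste0("target_", svt)

# spectraVariables() combines both with the non-index match columns.
all_vars <- c(query_vars, target_vars,
              cns[!cns %in% c("query_idx", "target_idx")])
all_vars
```

The `"target_"` prefix keeps variables that exist in both query and target, such as `rtime` in this example, distinguishable in the combined result.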

#' @exportMethod colnames
#'
#' @rdname MatchedSpectra
64 changes: 51 additions & 13 deletions man/Matched.Rd

Some generated files are not rendered by default.