Merge pull request #350 from Olink-Proteomics/optimization_develop_read_npx_legacy

Arrow is still failing on macOS, but everything else passes.
klevdiamanti authored May 15, 2024
2 parents 9719e67 + 6d8537f commit b88f098
Showing 243 changed files with 6,727 additions and 253 deletions.
3 changes: 0 additions & 3 deletions OlinkAnalyze/.Rbuildignore
@@ -5,10 +5,7 @@
^Meta$
^cran-comments\.md$
^inst/extdata/npx_data2_meta[.]csv
^inst/extdata/npx_data2[.]xlsx
^inst/extdata/.*original[.]csv
^inst/extdata/.*original[.]xlsx
^data/.*_original[.]rda
^LICENSE\.md$
^revdep$
^inst/WORDLIST
4 changes: 2 additions & 2 deletions OlinkAnalyze/DESCRIPTION
@@ -1,7 +1,7 @@
Type: Package
Package: OlinkAnalyze
Title: Facilitate Analysis of Proteomic Data from Olink
Version: 3.4.1
Version: 3.7.0
Authors@R: c(
person("Kathleen", "Nevola", , "biostattools@olink.com", role = c("aut", "cre"),
comment = c(ORCID = "0000-0002-5183-6444", Github = "kathy-nevola")),
@@ -55,7 +55,7 @@ Description: A collection of functions to facilitate analysis of proteomic
License: AGPL (>= 3)
Config/testthat/edition: 3
Config/testthat/parallel: true
Config/testthat/start-first: read_npx_wide
Config/testthat/start-first: read_npx_l*, read_npx_w*
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.3.1
83 changes: 54 additions & 29 deletions OlinkAnalyze/NEWS.md
@@ -1,3 +1,57 @@
# Olink Analyze 3.7.0
## Minor Changes
* Support for Explore 3072 data in parquet form added (#327, @kathy-nevola, @klevdiamanti)
* Support for pathway enrichment when LOD is not present added (#329, @kathy-nevola)
* Clarification to remove controls when selecting bridging samples added in bridging tutorial (#330, @kathy-nevola)
* Addition of Kristyn Chin as contributor (#331, @kathy-nevola)
* Addition of URLs and Contact information in Description (#331, @kathy-nevola)

# Olink Analyze 3.6.2
## Bug Fixes
* Packages in the Suggests field are now called conditionally in vignettes, tests, and examples. (#319, @kathy-nevola)
* read_NPX will now work with Target 96 data that does not contain "Target 96" in the panel name (#320, @kathy-nevola)
* olink_lmer will no longer require the presence of an Index column in the data (#321, @kathy-nevola)
* corrected examples for subset normalization (#312 @kathy-nevola)
* added additional information on example data to outlier vignette (#313 @kathy-nevola)
* clarified documentation for longitudinal randomization (#314 @kathy-nevola)
* corrected warning message for olink_ordinalRegression (#296 @boxizhang)

# Olink Analyze 3.6.1
## Minor Changes
* Install package Matrix from source in CI. (#299 @klevdiamanti, @AskPascal)

# Olink Analyze 3.6.0
## Minor Changes
* Plate LOD will be chosen over Max LOD when both are present (#288, @kathy-nevola)
* Data from NPX Signature 1.8+ is now supported by read_NPX when in long format csv (#293, @Orbmac, @kathy-nevola)
* olink_pca_plot will now be faster (#289, @MasoumehSheikhi)

## Bug Fixes
* olink_normalization and related functions will now compare normalization strategy by assay (#291, @klevdiamanti)
* Excluded assays will not cause warnings in olink_normalization (#291, @klevdiamanti)
* olink_pca_plot will now show the same label and text for point when SampleID is numeric (#289, @MasoumehSheikhi)

# Olink Analyze 3.5.1
## Bug Fixes
* read_NPX will now support additional formats of parquet files

# Olink Analyze 3.5.0
## Minor Changes
* read_NPX will now detect and import Flex excel files in wide format (@kathy-nevola, #234)
* Add example of bridge samples selector and minor updates for clarity to Intro to bridging vignette (@Orbmac, @kathy-nevola, #260)
* Increase plate randomizer unit test coverage (@amrita-kar, #264)
* Add support for parquet files in read_NPX (@kathy-nevola, #265, @klevdiamanti, #270)
* Add support for alternative forms of LOD including Max and Plate LOD (@kathy-nevola, #267)
* Add support for SampleQC column (alternative to QC_Warning) (@kathy-nevola, #267, @MasoumehSheikhi, #268)
* Add support for plate randomization with variable control numbers (@kathy-nevola, @AskPascal, #269)
* Controls can now be randomized across the plate with olink_plate_randomizer (@kathy-nevola, @AskPascal, #269)
* Minor updates for clarity to Plate Randomization vignette (@kathy-nevola, @AskPascal, #269)
* Change formatting for long format Target CSVs to match excel (@kathy-nevola, #273)

## Bug Fixes
* Project name is now consistent and generic throughout bridging vignette (@kathy-nevola, #257)
* Estimate is now included in output when Paired Mann-Whitney U test is performed (@boxizhang, #259)

# Olink Analyze 3.4.1
## Bug Fixes
* Skip PCA snapshot tests in R version > 4.2.3 (@AskPascal, #254)
@@ -18,15 +72,13 @@
## Bug Fixes
* Update to olink_wilcox documentation and UniProt description in documentation has been corrected (@boxizhang, #235)


# Olink Analyze 3.3.1
## Bug Fixes
* olink_pathway_enrichment now prints a message when there are non matching names when using method = "ORA" (@MasoumehSheikhi, #222)
* olink_pca_plot will now generate PCA when data is missing from the first OlinkID (@kathy-nevola, #221)
* read_NPX now supports csv files with Sample_Type column but not ExploreVersion column (@klevdiamanti, #220)
* extra columns in input file will no longer result in a warning message (@kathy-nevola, #223)


# Olink Analyze 3.3.0
## Minor Changes
* Support for additional versions of Olink data - Read_NPX now supports a wider range of Olink data types (@AskPascal, @kathy-nevola, #207, #208, #211, #216)
@@ -46,9 +98,7 @@
## Bug Fixes
* Change in unit test to write to temporary directory (@AskPascal, #181)


# Olink Analyze 3.2.0

## Minor Changes
* Addition of functions to perform Uniform Manifold Approximation and Projection (UMAP) dimensional reduction and plots (@simfor, #139)
* Add additional install methods to Readme (@AskPascal, #153)
@@ -59,7 +109,6 @@
* Read_NPX will now warn the user when NAs are detected in the NPX column (@AskPascal, #170)
* Friedman test interface and documentation was updated to be more intuitive (@boxizhang, #171)


## Bug Fixes
* Pathway enrichment p-values are now in the correct order when plotting (@klevdiamanti, #164)
* PCAs now behave the same with any locale (@AskPascal, #173)
@@ -69,9 +118,7 @@
* vdiffr based unit tests were reactivated (@AskPascal, #172)

# Olink Analyze 3.1.0

## Minor Changes

* Non-parametric functions are now available (@boxizhang, #114, #142)
* Updated installation instructions to reflect CRAN acceptance (@kathy-nevola, #107)
* Zipped files from MyData can now be used as input for read_NPX (@klevdiamanti, #115)
@@ -86,66 +133,51 @@
* Added Kristian Hodén as contributor

## Bug Fixes

* olink_pca_plot by Panel will now show correct colors when a variable is missing (@MasoumehSheikhi, Issue #117, Commit 0f2f157)
* olink_ttest will now return a warning message if an assay has less than 2 datapoints in a group. (@marisand, #110)
* LMER class is now checked using inherits (@MasoumehSheikhi, #134)
* License was corrected to AGPL-3 (@Orbmac, #138)
* Correct output type of olink_dist_plot in vignette to ggplot object (@kathy-nevola, Issue #112, #141)
* Previously called "intensity normalization" has been clarified as a special type of subset normalization and an example has been added to the documentation and vignette (@Orbmac, #144)


# Olink Analyze 3.0.0

## Major Changes

* displayPlateDistributions and displayPlateLayout names were updated to olink_displayPlateDistributions and olink_displayPlateLayout (@simfor, #98)

## Minor Changes

* Update CI to Ubuntu 20.04 (@AskPascal, #97)
* olink_bridgeselector will now give an error if less than n (number of) bridge samples can be selected based on set sampleMissingFreq (@marisand, #100)
* update return value documentation for all functions to specify columns and class of output (@kathy-nevola, #102)
* replaced or removed cat/print with message/stop so messages printed to the console can be suppressed (@jrguess, #103)

## Bug Fixes

* fixed spelling mistakes in documentation (@kathy-nevola, #92)
* updated DESCRIPTION to fit CRAN specification (@kathy-nevola, #94)
* change T/F to TRUE/FALSE for stability (@AskPascal, #96)
* updated documentation to olink_plate_randomizer, olink_displayPlateDistributions and olink_displayPlateLayout to link related functions and clarify olink_plate_randomizer documentation (@simfor, #98)
* fixed keywords in documentation (@kathy-nevola, #99)


# Olink Analyze 2.0.1

## Bug Fixes

* Remove hexagon from Readme (@kathy-nevola, #86)
* Replace OlinkAnalyze with Olink® Analyze (@kathy-nevola, #86)
* Add Ola Caster to author list
* Update documentation to change olinkR to Olink Analyze (@jrguess, #89)

# Olink Analyze 2.0.0

## Major Changes

* Package is in the process of being submitted to CRAN
* Added ability to prevent OlinkAnalyze from loading fonts and setting to themes (@OlaCaster, #73)
* Decreased the size of npx_data1 and npx_data2 to 2 panels instead of 12 (@kathy-nevola, #76)
* Updates to the olink_qc_plot and olink_pca_plot functions (@simfor, #78)

## Bug Fixes

* Moved NEWS.md to correct level (@kathy-nevola, #74)
* Added cran-comments.md file to document Notes for CRAN submission (@kathy-nevola, #74)
* Update Vignette to reflect new functionality (@kathy-nevola, #78)


# Olink Analyze 1.3.0

## Major Changes

* DESCRIPTION file updated to include all authors and maintainers (@kathy-nevola, #66)
* Unit testing was added (@marisand, @simfor, #65, #55, #47)
* Continuous Integration (CI) was added (@AskPascal, #57, #53, #40, #31)
@@ -155,7 +187,6 @@
* Changed `olink_pal()` to have gray instead of light blue (@marisand, #22)

## Bug Fixes

* `set_plot_theme()` will now load Swedish Gothic Thin if available on all OS (@marisand, @AskPascal, #70, #39)
* Help documentation was updated to correct typos and clarify notation for ANOVA and LME models (@kathy-nevola, @marisand,@AskPascal #67, #33, #37)
* Fix `olink_dist_plot()` from showing multiple bars when sample has QC warning in some assays (@marisand, @AskPascal, #64)
@@ -167,9 +198,3 @@
* Fix summary casting "tukey" to "sidak" adjustment warning in `olink_anova_posthoc()` and `olink_lmer_posthoc()` functions (@marisand, #38)
* Update functions to import selectively (@kathy-nevola, @Orbmac, @AskPascal, @marisand, #21, #20, #19, #18, #15, #29)
* Fix guides size argument in `olink_pca_plot()` (#17) (@AskPascal, #24)






47 changes: 47 additions & 0 deletions OlinkAnalyze/R/check_file_exists.R
@@ -48,3 +48,50 @@ check_file_exists <- function(file,
}

}

#' Help function checking if file extension is acceptable.
#'
#' @author Klev Diamanti
#'
#' @param file Path to the file.
#'
#' @return The type of the file extension based on the global variable
#' `accepted_npx_file_ext`
#'
check_file_extension <- function(file) {
  # check input ----

  check_is_scalar_character(string = file,
                            error = TRUE)

  # get file extension ----

  # get the extension of the input file
  f_ext <- tools::file_ext(x = file)

  # check what type of label the extension of the input matches to
  f_label <- accepted_npx_file_ext[accepted_npx_file_ext == f_ext] |>
    names()

  # check if file extension is applicable ----

  # if the extension of the input file was within the accepted ones it should
  # be a scalar character
  if (!check_is_scalar_character(string = f_label, error = FALSE)) {

    cli::cli_abort(
      message = c(
        "x" = "Unable to recognize the extension of the file {.file {file}}!",
        "i" = "Expected one of {.val {accepted_npx_file_ext}}!"
      ),
      call = NULL,
      wrap = FALSE
    )

  }

  # return ----

  return(f_label)

}
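
The lookup performed by `check_file_extension` can be illustrated with a minimal, self-contained sketch. It assumes that `accepted_npx_file_ext` is a named character vector whose values are file extensions and whose names are format labels; the vector `accepted_ext` and the function `lookup_ext_label` below are hypothetical stand-ins, and the package's real names and values may differ. Base R `stop()` replaces `cli::cli_abort()` to keep the sketch dependency-free.

```r
# Hypothetical stand-in for the package-internal `accepted_npx_file_ext`.
# Values are file extensions, names are format labels (assumed, not confirmed).
accepted_ext <- c(
  excel_1 = "xlsx",
  delim_1 = "csv",
  delim_2 = "txt",
  parquet_1 = "parquet",
  compressed_1 = "zip"
)

# Return the label whose value equals the extension of `file`,
# or stop with an error when there is no unique match.
lookup_ext_label <- function(file, accepted = accepted_ext) {
  f_ext <- tools::file_ext(file)
  f_label <- names(accepted[accepted == f_ext])
  # exactly one non-missing label is expected for a recognized extension
  if (length(f_label) != 1L || is.na(f_label)) {
    stop("Unable to recognize the extension of the file '", file, "'. ",
         "Expected one of: ", paste(accepted, collapse = ", "),
         call. = FALSE)
  }
  f_label
}

lookup_ext_label("data/npx.parquet")
lookup_ext_label("npx.csv")
```

Because the comparison is an exact `==` match, an upper-case extension such as `NPX.CSV` would not be recognized in this sketch.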
76 changes: 38 additions & 38 deletions OlinkAnalyze/R/read_npx.R
@@ -36,6 +36,9 @@
#' @param quiet Boolean to print a confirmation message when reading the input
#' file. Applies to excel or delimited input only. `TRUE` (default) to not print
#' and `FALSE` to print.
#' @param legacy Boolean to run the legacy version of the read_npx function.
#' \strong{Important: applies only to wide format files from Target 96 or Target
#' 48 with NPX Software version earlier than 1.8!} Default is `FALSE`.
#'
#' @return Tibble or ArrowObject with Olink data in long format.
#'
@@ -77,7 +80,8 @@ read_npx <- function(filename,
olink_platform = NULL,
data_type = NULL,
.ignore_files = c("README.txt"),
quiet = TRUE) {
quiet = TRUE,
legacy = FALSE) {

# check input ----

@@ -88,64 +92,60 @@ read_npx <- function(filename,
# check that the requested output df is ok
check_out_df_arg(out_df = out_df)

check_is_scalar_boolean(bool = legacy,
error = TRUE)

# sep and .ignore_files are checked in the functions they target

# check file extension ----

# get the extension of the input file
f_ext <- tools::file_ext(x = filename)

# check what type of label the extension of the input matches to
f_label <- accepted_npx_file_ext[accepted_npx_file_ext == f_ext] |>
names()
f_label <- check_file_extension(file = filename)

# read data ----

# if the extension of the input file was within the accepted ones it should
# be a scalar character
if (check_is_scalar_character(string = f_label, error = FALSE)) {
if (grepl(pattern = "excel|delim", x = f_label)) {
# Input is an excel or a delimited file

if (grepl(pattern = "excel|delim", x = f_label)) {
# Run legacy read_npx function
if (legacy == TRUE) {

# Input is an excel or a delimited file
df_olink <- read_npx_format(file = filename,
df_olink <- read_npx_legacy(file = filename,
out_df = out_df,
sep = sep,
long_format = long_format,
olink_platform = olink_platform,
data_type = data_type,
quiet = quiet)

} else if (grepl(pattern = "parquet", x = f_label)) {
} else {

# Input is a parquet file
df_olink <- read_npx_parquet(file = filename)
df_olink <- read_npx_format(file = filename,
out_df = out_df,
sep = sep,
long_format = long_format,
olink_platform = olink_platform,
data_type = data_type,
quiet = quiet,
legacy = FALSE)

} else if (grepl(pattern = "compressed", x = f_label)) {
}

# Input is a zip-compressed file
df_olink <- read_npx_zip(
file = filename,
out_df = out_df,
sep = sep,
long_format = long_format,
olink_platform = olink_platform,
data_type = data_type,
.ignore_files = .ignore_files,
quiet = quiet
)
} else if (grepl(pattern = "parquet", x = f_label)) {

}
# Input is a parquet file
df_olink <- read_npx_parquet(file = filename)

} else {
} else if (grepl(pattern = "compressed", x = f_label)) {

cli::cli_abort(
message = c(
"x" = "Unable to recognize format from file extension!",
"i" = "Acceptable file extensions: {accepted_npx_file_ext}"
),
call = NULL,
wrap = FALSE
# Input is a zip-compressed file
df_olink <- read_npx_zip(
file = filename,
out_df = out_df,
sep = sep,
long_format = long_format,
olink_platform = olink_platform,
data_type = data_type,
.ignore_files = .ignore_files,
quiet = quiet
)

}
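
The resulting control flow of `read_npx` can be summarized with a small sketch. It only returns the name of the reader that would be called, so it runs without the package; `choose_reader` and the example labels are hypothetical, while the `grepl()` patterns and reader names are taken from the diff above. The final `else` branch is added for the sketch only, since in the commit an unrecognized extension is already rejected by `check_file_extension`.

```r
# Hypothetical helper: given a format label and the `legacy` flag, return the
# name of the reader function that the branching in read_npx() would select.
choose_reader <- function(f_label, legacy = FALSE) {
  # mirror the scalar boolean check applied to `legacy`
  stopifnot(is.logical(legacy), length(legacy) == 1L, !is.na(legacy))

  if (grepl(pattern = "excel|delim", x = f_label)) {
    # excel or delimited input: `legacy` selects the reader
    if (legacy) "read_npx_legacy" else "read_npx_format"
  } else if (grepl(pattern = "parquet", x = f_label)) {
    # parquet input: `legacy` is not consulted
    "read_npx_parquet"
  } else if (grepl(pattern = "compressed", x = f_label)) {
    # zip-compressed input: `legacy` is not forwarded
    "read_npx_zip"
  } else {
    # sketch-only fallback for labels that match no pattern
    stop("Unrecognized format label: ", f_label, call. = FALSE)
  }
}

choose_reader("excel_1", legacy = TRUE)
choose_reader("delim_1")
choose_reader("parquet_1", legacy = TRUE)
```

As the diff shows, `legacy` affects only excel and delimited inputs; it is not passed to `read_npx_parquet` or `read_npx_zip`.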