Major changes:
- Updated dependencies to Bioconductor 3.18. These packages are now available on Bioconda.
Minor changes:
- import: Reworked GAF file method to simply import as a 17 column data.frame. Refer to the Gene Ontology website for details on the GAF file format, which is basically a TSV file with commented lines beginning with "!".
- Removed BaseSet as a suggested import. The getGAF function doesn't work on most of the GAF files currently available from the Gene Ontology website. It only works reliably with the Homo sapiens protein annotations, which isn't general enough for our needs.
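As a rough illustration of the file layout described above, a GAF file can be read with base R alone. This is a minimal sketch, not the pipette implementation: the sample record is made up, and the column labels are descriptive placeholders based on the GAF 2.x field list, so check them against the Gene Ontology documentation.

```r
## Made-up example content: one "!" comment line and one 17-field record.
gafLines <- c(
    "!gaf-version: 2.2",
    paste(
        "UniProtKB", "P12345", "GENE1", "enables", "GO:0003674",
        "PMID:1", "IDA", "", "F", "Example protein", "", "protein",
        "taxon:9606", "20230101", "UniProt", "", "UniProtKB:P12345-2",
        sep = "\t"
    )
)
## Read as tab-delimited text, skipping lines that begin with "!".
gaf <- read.delim(
    text = gafLines,
    header = FALSE,
    sep = "\t",
    quote = "",
    comment.char = "!",
    colClasses = "character"
)
## Placeholder labels (assumption), one per GAF column.
colnames(gaf) <- c(
    "DB", "DB_Object_ID", "DB_Object_Symbol", "Qualifier", "GO_ID",
    "DB_Reference", "Evidence_Code", "With_From", "Aspect",
    "DB_Object_Name", "DB_Object_Synonym", "DB_Object_Type", "Taxon",
    "Date", "Assigned_By", "Annotation_Extension", "Gene_Product_Form_ID"
)
```

Setting colClasses to "character" keeps empty optional fields as empty strings instead of letting R guess a type for them.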
Minor changes:
- import: Added support for Gene Ontology (GO) annotation files (GAF). Useful for importing data from the Gene Ontology Annotation (GOA) database.
- Miscellaneous documentation improvements for import.
Major changes:
- export now supports list and S4Vectors List, for recursive export of supported objects. We may use this method for other classes, such as SummarizedExperiment in the AcidExperiment package, in a future update.
- export: GRangesList export method now inherits from List, and writes each object in the list as a separate file to disk, rather than grouping in a data.frame coercion step first.
Minor changes:
- export: GRangesList method is now defined and split out from data.frame method, to handle edge case of undesirable column name coercion.
- Resolved new lints detected by lintr 3.1.1.
Major changes:
- import now supports import of JSON and YAML lines via textConnection. This is very useful for importing output returned from CLI tools, such as aws.
Minor changes:
- import: Now using read_yaml internally instead of yaml.load_file from the yaml package.
Minor changes:
- import: Improved edge case handling of datasets with invalid names.
- as.data.frame: Improved internal coercion code for Matrix and IRanges.
- as.DataFrame: Ensure we set optional = TRUE internally during as.data.frame call.
New functions:
- fillLines: Utility function for fixing malformed CSV and TSV files.
- unfactorize: New generic function that intelligently converts a factor back to its original atomic data type.
Major changes:
- The import and export generics are now defined in AcidGenerics rather than extending from BiocIO. This helps simplify the methods, removing options that we never use.
- Now enforcing strict camel case for all function names.
- The pipette file classes are now named in strict upper camel case.
- Renamed cacheURL to cacheUrl.
- Renamed getJSON to getJson.
- Renamed getURLDirList to getUrlDirList.
- Renamed pipetteTestsURL to pipetteTestsUrl.
- Renamed removeNA to removeNa.
- Renamed sanitizeNA to sanitizeNa.
Major changes:
- getURLDirList: Entirely reworked internal code to no longer depend on RCurl package. Added support for HTTP(S) servers, which has been tested to work for Ensembl and NCBI.
- import: Now supports textConnection class for primary con argument instead of always using character representing a file path. This is incredibly useful for reformatting a malformed delimited file on a remote server prior to import, which can help eliminate our reliance on readr or data.table as alternative engines to handle malformed files.
- transmit: Reworked internal code to no longer depend on RCurl.
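The textConnection workflow can be illustrated with base R only. This is a minimal sketch under stated assumptions: the input lines are made up, and read.csv stands in for the pipette import call, whose exact arguments are not shown here.

```r
## Made-up malformed input: a blank line and stray whitespace.
rawLines <- c("gene,desc", "A,alpha", "", "  B,beta  ")
## Repair the lines in memory before parsing.
fixed <- trimws(rawLines)
fixed <- fixed[nzchar(fixed)]
## Wrap the repaired lines in a connection and parse as CSV.
con <- textConnection(fixed)
df <- read.csv(con, stringsAsFactors = FALSE)
## A text connection is created open, so close it explicitly.
close(con)
```

The repaired lines never need to be written back to disk, which is the main appeal of passing a connection instead of a file path.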
Minor changes:
- export: Now supports GRangesList, which first coerces to data.frame. This class includes "group" and "groupName" as the first columns upon export.
- getURLDirList: Tightened up assert checks to intentionally error if input does not contain an FTP server URL.
Major changes:
- After inspecting the package with packageDependencies from AcidDevTools, decided to revert the changes introduced in 0.12.0 back to making more heavy dependencies optional. We are now no longer requiring BiocFileCache, data.table, digest, httr2, readr, and rtracklayer to be installed. This helps speed up attachment of the package, where it is most commonly used to simply import and export CSV files as the primary utility.
- These package changes will not affect the bioconda recipe, as all optional dependencies are bundled with the recipe.
Minor changes:
- getURLDirList: Moved the isAnExistingURL assert check down, to not break Ensembl FTP server checks in AcidGenomes makeGRangesFromEnsembl.
Minor changes:
- as.DataFrame: Fixed support for nesting of DFrame objects.
- droplevels2: Ensure that original class and object metadata are preserved.
Major changes:
- Migrated a number of dependency packages from Suggests to Imports (see DESCRIPTION and IMPORTS files) to avoid issues with missing dependencies in some commonly used Acid Genomics software. Here we are now requiring BiocFileCache, data.table, digest, httr2, jsonlite, readr, rtracklayer, and yaml as standard packages in pipette, so that they get automatically installed. This change only applies when managing dependencies using R directly and doesn't affect bioconda recipes.
- Hardened some internal assert checks using isAnExistingURL instead of simply isAURL. This new function available in the goalie package actively checks to see if the URL exists and is active. Note that for FTP directories, such as with the transmit function, the isAnExistingURL check only works if there is a trailing slash.
Minor changes:
- Disabled examples using the Ensembl REST API server ("rest.ensembl.org"), as this has recently been flaky and can cause build checks to time out.
- Miscellaneous unit test fixes, now using isAnExistingURL to check.
Minor changes:
- import: Disabled lazy loading mode for readr. Enabling this makes it basically impossible to suppress warnings when parsing malformed files, such as in the MGI function of AcidGenomes.
- Enabled parallel testthat unit tests.
Minor changes:
- getJSON: Improved code coverage for parsing of Ensembl REST API.
Major changes:
- Now requiring R 4.3 / Bioconductor 3.17.
- File classes for import and export are now prefixed with Pipette, to avoid unwanted collisions with other classes defined in Bioconductor packages.
- import: Added support for MAF files.
- import: Added support for BAM, CRAM, and SAM files. Note that CRAM files must be able to resolve the corresponding reference genome.
- import: Added support for BCF and VCF files. Note that these files currently require corresponding CSI index files, which can be generated using htslib.
- Temporary files generated during import calls (i.e. automatic decompression of compressed files, such as gz or xz) are now automatically cleaned up. This change should only affect temporary files. If you notice any issues with this, please file a bug report!
- getJSON: Reworked to use httr2 instead of deprecated httr package.
Major changes:
- import: Improved support for FASTA files. The function will now intentionally error on any warnings, which can occur if the moleculeType argument is not set correctly. For example, we're defaulting to DNA input here, but that is not always the case. miRBase FASTA files are in RNA format, so use the moleculeType argument here to set "RNA" instead of "DNA". We've also added amino acid support via the "AA" argument, which passes to the Biostrings package in a similar manner to DNA and RNA. Note that for miRBase FASTA files, we now attempt to get metadata from the FASTA identifiers, which are defined as a DataFrame slotted into metadata as attributes.
Minor changes:
- import: Updated data.table engine settings to use header = "auto" when colnames argument is declared as TRUE. This will attempt to handle malformed columns, similar to readr/vroom approach.
- Improved internal usage consistency of goalie boolean grepl wrapper functions, such as isMatchingFixed and isMatchingRegex, which improves code legibility.
- Removed legacy Docker usage instructions in README.
Minor changes:
- Renamed S4 class definitions: DataFrame to DFrame; GenomicRanges to GRanges; and IntegerRanges to IRanges.
Minor changes:
- import and export now support overriding default quote handling with the quote argument. Changing this is not recommended by default, but is useful for some edge cases with annoying gene metadata files from Ensembl. Who decided that B" is an acceptable gene name?
Minor changes:
- import: Switched default quoting for delim file using base engine. Also removed empty string assert checks, which can be problematic for some data types returned by readr and data.table engines.
Minor changes:
- Added line break to naStrings handling, which improves automatic sanitization of some malformed gene CSV files on the Ensembl FTP server.
Minor changes:
- Reexporting initDir and pasteURL from AcidBase.
- Improved pattern documentation for getURLDirList.
Minor changes:
- as.DataFrame: Reworked the list and SimpleList methods:
  - We removed the option to set row.names for list and SimpleList methods. Instead, row names will get defined automatically when the first element in the list is named.
  - The list method now checks for length mismatched input at the start of the function. This should also be handled in the downstream DataFrame generator step, but we can set a custom error message instead.
  - We're now calling new internally with listData and nrows defined, which is faster and less problematic than attempting to construct a matrix (via cbind) and then coerce to DataFrame instead.
  - Our list method now attempts to reorder named elements when possible.
- decode: The internal code for DataFrame method has been simplified, due to rework of our internal as.DataFrame method code. Note that list elements are now unnamed during the decoding step, but this has no effect on row name handling.
- droplevels2: Hardened DataFrame method to properly coerce DataFrame to SimpleList before proceeding. Using as method with "List" class does not currently unclass DataFrame as expected.
Minor changes:
- Migrated requireNamespaces import in NAMESPACE from AcidBase to goalie.
- Updated package dependencies.
Minor changes:
- factorize: Bug fix to ensure that logical columns are not coerced to factor, as these are expected to have repeated values.
- Now requiring Bioconductor 3.16 release.
Major changes:
- import: Added support for import of gene cluster text (GCT) file format. Currently can set return parameter to import as a matrix (default) or data.frame with Name and Description columns retained.
Major changes:
- This release attempts to harden the primary arguments supported for export and import. Note that file is intentionally not supported in favor of con, which is the preferred convention used in BiocIO.
- Reduced the number of exported methods, tightening on missing instead of using missingOrNULL, which can cause unwanted inheritance issues that collide with BiocIO package.
- export now supports any type of atomic vector.
Minor changes:
- factorize: Improved handling of DataFrame input that contains complex S4 columns, such as CharacterList.
Major changes:
- factorize: Reworked internal method to only coerce columns with repeated values to factor. Now uses anyDuplicated internally to check for which columns to factorize.
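The idea of factorizing only columns with repeated values can be sketched in base R. This is not the pipette method: factorizeSketch is a hypothetical name, and restricting the check to character columns of a plain data.frame is an assumption made to keep the example short.

```r
## Hypothetical helper (assumption): coerce only character columns that
## contain at least one repeated value to factor; leave the rest alone.
factorizeSketch <- function(df) {
    stopifnot(is.data.frame(df))
    df[] <- lapply(df, function(col) {
        ## anyDuplicated returns 0 when all values are unique.
        if (is.character(col) && anyDuplicated(col) > 0L) {
            as.factor(col)
        } else {
            col
        }
    })
    df
}

df <- data.frame(
    id = c("a", "b", "c"),
    group = c("x", "x", "y"),
    stringsAsFactors = FALSE
)
out <- factorizeSketch(df)
```

Here the all-unique id column stays character, while group becomes a factor because "x" repeats.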
Minor changes:
- import: Hardened the internal readr engine to use base make.names for column name repair for delimited file import (e.g. CSV, TSV). This also eliminates unwanted CLI messages about name repair when using readr.
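For reference, base make.names repairs column names as follows. The input names are made up, and using unique = TRUE is an assumption about how duplicates might be handled, not a statement about the internal call.

```r
## Made-up column names: a space, a leading digit, and a duplicate.
rawNames <- c("gene id", "2nd", "gene id")
## Convert to syntactically valid names and make duplicates unique.
fixedNames <- make.names(rawNames, unique = TRUE)
print(fixedNames)
```

Spaces become periods, names that start with a digit gain an "X" prefix, and duplicates receive a numeric suffix.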
Minor changes:
- export: Bug fix to provide compatibility in Matrix method for export of sparse matrices without dimnames (e.g. row and/or column names). Hit this edge case in the update of example data in our Chromium package.
Minor changes:
- Updated and hardened unit tests to avoid file system lock issues on Windows related to readr engine.
Minor changes:
- atomize: Hardened edge cases of empty DataFrame and GRanges input. Also improved code coverage to test handling of these edge case events, which can occur when exporting metadata from SummarizedExperiment in upstream AcidExperiment and bcbioRNASeq packages for objects with minimal metadata.
Minor changes:
- Updated lintr checks and testthat unit tests.
Minor changes:
- import: Added support for Open Biomedical Ontologies (OBO) format. This uses ontologyIndex internally. See also BiocSet for alternative import method that uses es_set and other accessor functions.
Major changes:
- Switched primary import/export engine from readr back to base, to avoid strong dependency on readr package. Note that readr and/or data.table packages can optionally be used for import/export by specifying "engine".
- Removed coercion method support for as_tibble and as.data.table, to remove strong dependencies on data.table and tibble packages.
- Removed as coercion support in favor of simply using as.DataFrame S4 generic approach, to avoid potential conflicts with Bioconductor.
- as.DataFrame: Reduced the number of supported classes, removing support of non-Bioconductor classes, specifically data.table and tbl_df.
- Removed re-exported functions: column_to_rownames, getURL, rbindlist, rownames_to_columns, and tibble.
Minor changes:
- Package code is now formatted using the styler package.
- Now exporting method support in droplevels2 instead of droplevels, to avoid method conflict with new Bioconductor 3.15 update.
- All requireNamespaces calls are now wrapped by assert.
- Removed previously deprecated functions that are no longer in use. These were previously defined in deprecated.R: sanitizeColData, sanitizeRowData, sanitizeRowRanges, and writeCounts.
Major changes:
- No longer reexporting any functions or S4 classes.
- export, import: Switched to recommended new BiocIO generic approaches, which are now used in multiple Bioconductor packages, notably rtracklayer.
- export, import: Switched back to readr as default engine from data.table. The data.table fread and fwrite functions have been shown to generate stack imbalances in some edge cases that readr handles better.
- export: Current default recommended method now dispatches on "object", "con", and "format" arguments. Previous methods that dispatch using "format", "dir", "ext", and/or "file" (adapted from the conventions used in rio package) arguments are soft deprecated, but should still currently work. If you encounter any breaking changes here, please file an issue!
- import: Current default recommended method now dispatches on "con", "format", and "text" arguments. We have defined our methods to simply handle the "con" argument as a file path. Note that the "text" argument can be useful for passing in raw lines of a particular file format, but we are not currently supporting that edge case in any of our import methods yet. Such functionality may be added in a future update.
- export data.frame / DataFrame methods now attempt to coerce nested list columns to vector via toString internally when possible.
Minor changes:
- import: Added rownameCol argument for import of delimited files, such as CSV, TSV, and Excel. This is a non-breaking change that enables the user to manually define the rowname column upon import, which can be useful when working with files from public databases such as GEO.
- Updated suggested readr and vroom versions, now that they're available on Bioconda.
Major changes:
- Switched bapply import from AcidBase to goalie.
- Updated internal basejump dependencies.
- Improved installation instructions.
Major changes:
- Updated minimum Bioconductor release to 3.13.
- import: Added support for FASTA and FASTQ files, which are loaded via Biostrings package internally. Refer to readDNAStringSet for details.
- import: Improved makeNames consistency for methods that import two-dimensional arrays (e.g. data.frame).
Minor changes:
- Improved package error messages and other alerts with AcidCLI update. Instead of calling stop internally, now using abort, which supports stylized messages (via cli package).
- transmit: Hardened working example against NCBI FTP server failure.
- Package is now back to 100% code coverage, with improved coverage of primary import and export functions.
Minor changes:
- sanitizeNA: Reworked internal call to use factor instead of as.factor followed by a separate levels call. The previous approach could result in an unwanted value swap for factors with a single value. Added code coverage to check for this.
- Updated package dependency versions.
Minor changes:
- localOrRemoteFile: Improved handling of URLs without a file extension.
- import / export: Simplified the appearance of file variables, making them easier to copy directly from the console for debugging.
Minor changes:
- import: Ensure that data.table engine always interprets empty strings ("") as NA. The fread function is opinionated about this and doesn't currently respect "" input as an NA string.
- decode / encode: Ensure we're not dropping metadata here.
Major changes:
- import now uses S4 methods based on the file argument, which can be manually overridden with the format argument (e.g. "csv" for a CSV file).
- import and export now support engine argument for methods dispatching on character and data.frame. Currently "base" (base R), "data.table", "readr", and "vroom" are supported.
- Source code lines now default to using base R for import/export. Previously, this defaulted to data.table, which is now only used as the default for import of delimited files.
Minor changes:
- import: Added support for removeBlank and stripWhitespace for import of source code lines (LinesFile).
- import: Improved import handling of source code lines using data.table.
Minor changes:
- decode / encode: Reworked internal code slightly to provide compatibility with S4Vectors update in Bioconductor 3.13.
Major changes:
- as.DataFrame: Updated list to DataFrame coercion support that is compatible with Bioconductor 3.13 release update.
- cacheURL: Now using tools::R_user_dir instead of rappdirs::user_cache_dir internally. This matches the conventions used in Bioconductor 3.13 (e.g. AnnotationHub and BiocFileCache).
Minor changes:
- Documentation updates, to pass build checks without warnings on R 4.1 and Bioconductor 3.13.
Minor changes:
- cacheURL now internally calls BiocFileCache and rappdirs as suggested packages, rather than direct imports. This helps keep the package a bit lighter and improve loading times, as BiocFileCache currently calls a number of heavy dependencies, including dplyr.
- The checksum functions md5 and sha256 now call digest internally as a suggested package, rather than a direct import.
Major changes:
- Switched the default engine back to data.table package for import and export functions.
Minor changes:
- Improved consistency of arguments for internal engines (base, data.table, readr, and vroom) for import and export of delimited files.
- Base engine now uses read.table and write.table instead of read.csv and write.csv for CSV files.
- The optional readr engine now uses read_delim and write_delim, similar to the update for base engine.
- Improved code coverage of base engine for import/export.
Minor changes:
- import: Hardened importer against unexpected mismatch when user attempts to manually define column names via colnames argument. Some importers such as vroom are currently too liberal about mismatches.
Minor changes:
- Including seqnames as a reexport, which is defined in GenomeInfoDb via GenomicRanges.
Minor changes:
- as.DataFrame: Restrict list-based coercion to list and SimpleList (instead of List virtual class).
Minor changes:
- Include CompressedGRangesList from GenomicRanges as a reexport.
Minor changes:
- Migrated IRanges reexports to AcidGenerics.
Minor changes:
- Updated naStrings to include "-", "_", and " ".
Minor changes:
- Minor rework and simplification of NAMESPACE, inheriting from AcidGenerics and AcidBase when possible.
Minor changes:
- Reexporting additional useful functions and classes from IRanges, including AtomicList virtual class.
- Migrated S4Vectors reexports to AcidGenerics, including Annotated, Factor, LLint, and RectangularData classes.
Minor changes:
- Including more reexports of useful S4 classes and functions defined in IRanges that we will reexport in basejump: CharacterList, FactorList, IntegerList, LogicalList, NumericList, RleList.
Minor changes:
- Reexporting some additional functions and classes from GenomicRanges, IRanges, Matrix, and S4Vectors that we can inherit in basejump.
New functions:
- Added md5 and sha256 functions, which use the digest package internally. Previously these were defined in the AcidGenomes package, but have migrated here for file management consistency.
Major changes:
- cacheURL: Fixed bug that resulted in different remote URLs with the same base name being cached as the same object internally by BiocFileCache. Now URLs should always be cached uniquely.
Minor changes:
- Migrated test data from "tests.acidgenomics.com" to "r.acidgenomics.com/testdata".
- Added droplevels method for DataFrame.
Minor changes:
- cacheURL: Improved message. Now only showing when file gets cached into package cache via BiocFileCache. Now defaults to caching into BiocFileCache directory instead of pipette.
- Now exporting useful rbindlist function from data.table.
- Renamed matchRowNameColumn to matchRownameColumn (note case).
Minor changes:
- naStrings: Now including lowercase NA variants, which are seen in some files on RefSeq FTP server.
- sanitizeNA: Updated to also match lowercase NA patterns.
Major changes:
- import: Improved internal engine support for plain text delimited (e.g. CSV, TSV) files and source code lines. The vroom engine remains enabled by default but data.table, readr, and base R are consistently supported for import of either delimited files or source code lines.
- export: Added support for base R export of CSV and TSV files. Internal engine consistency has been improved for character and matrix / data.frame methods. Note that character method currently falls back to using readr's write_lines function instead of attempting to use the vroom package by default.
Minor changes:
- getURLDirList now returns sorted output.
- Reexporting url.exists function from RCurl, for convenience.
Minor changes:
- Switched from using cli package internally to AcidCLI.
Minor changes:
- Made some previous imports conditional suggested packages: jsonlite, readr, rtracklayer, yaml.
- export: Improved internal file name handling for CLI messages.
- Reworked rtracklayer as a suggested package instead of an explicit import.
- import: Improved internal bcbio counts importer code to use default TSV method, rather than relying on data.table fread function manually.
- Removed dependency on readr. Import of lines now uses vroom::vroom_lines internally, and export character method will conditionally switch to using base writeLines if the readr package is not installed.
- Removed Matrix readMM and writeMM as imports.
- Removed data.table fread and fwrite as imports.
- Removed BiocGenerics dependency, in favor of reexports defined in AcidGenerics. This helps keep the number of dependencies declared in the package more compact and manageable.
- export: Bug fix for handling of GZ file name extension for character method.
- Bug fix for vroom_lines import error of bcbio log: Unnamed col_types must have the same length as col_names.
- Now including additional reexports from data.table and tibble packages.
- Bug fixes for import / export of tx2gene file handling for pending AcidGenomes package update.
- import: Hardened against makeNames usage on objects that don't support names assignment.
Minor changes:
- export: Added option to intentionally not export column and/or rownames for matrix, data.frame, and DataFrame classes.
New functions:
- Added download, which acts as a hardened wrapper for utils::download.file. Annoyingly, download.file returns status codes but does not intentionally error on any unsuccessful downloads. Our wrapper ensures that R always errors on any file download issue. It also sets a longer timeout internally, to avoid any potential issues with the timeout option being defined in Rprofile.
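A wrapper along these lines could look like the sketch below. It is an illustration only, not the pipette source: downloadSketch is a hypothetical name, the 3600 second timeout is an arbitrary assumption, and promoting warnings to errors is one possible way to be strict.

```r
## Hypothetical wrapper (assumption): always error on a failed download.
downloadSketch <- function(url, destfile, timeout = 3600L) {
    ## Temporarily raise the timeout; restore the user's option on exit.
    oldOpts <- options(timeout = timeout)
    on.exit(options(oldOpts), add = TRUE)
    status <- tryCatch(
        utils::download.file(
            url = url,
            destfile = destfile,
            quiet = TRUE,
            mode = "wb"
        ),
        ## Treat any warning from download.file as a failure.
        warning = function(w) {
            stop(conditionMessage(w), call. = FALSE)
        }
    )
    ## download.file returns 0 on success and non-zero otherwise.
    if (status != 0L) {
        stop(sprintf("Download failed with status %d.", status), call. = FALSE)
    }
    invisible(destfile)
}

## Usage with a local file URL, so no network access is needed.
src <- tempfile(fileext = ".txt")
writeLines("example", con = src)
dest <- tempfile(fileext = ".txt")
downloadSketch(url = paste0("file://", src), destfile = dest)
```

Restoring the previous timeout in on.exit keeps the wrapper from changing the caller's session settings.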
Minor changes:
- Updated dependency versions.
- export: Added append option for character method. Also relaxed checks on character method, allowing for exporting of empty vectors.
Minor changes:
- transmit: Added download argument support, to optionally return matching URLs without downloading. This is useful for handing off to cacheURL function for caching files inside of packages with BiocFileCache.
Minor changes:
- Bug fix for breaking change in readr v1.4 release. In the write_* functions, including write_lines and write_csv, the path argument has been renamed to file. Now requiring readr v1.4+ in pipette.
New functions:
- Migrated transmit here from basejump.
New functions:
- getURLDirList: Migrated this function from previous definition in basejump, so we can inherit inside new AcidGenomes package.
Minor changes:
- cacheURL: Added package argument, so other packages that use this function will automatically inherit the current package, as expected.
New functions:
- cacheURL: Utility function for easy package file caching using BiocFileCache package internally.
Minor changes:
- Updated Acid Genomics package dependencies.
Minor changes:
- sanitizeNA: Added support for "N/A" string, which is present in some Excel spreadsheets.
Minor changes:
- export: Now dropping non-atomic columns (e.g. Entrez ID list column) from data frames automatically prior to export. Previously, the allAreAtomic assert check was called automatically and would error on non-atomics.
Minor changes:
- import: Improved messages to always resolve full path to import directory.
- Bug fix for AppVeyor CI config.
Minor changes:
- export: Ensuring that full directory path is always resolved in message.
- Miscellaneous message improvements, related to internal toString handling.
Minor changes:
- Relaxed name checks for import.
Minor changes:
- Decreased data.table dependency from 1.13.0 back to 1.12.8, so we can build successfully on bioconda.
Minor changes:
- export: Improved messages to include full output path.
- Increased minimum R dependency to 4.0.
Minor changes:
- import: Hardened Excel input to intentionally error on any warnings returned by internal read_excel call, which is too liberal in coercing data types, in my opinion.
- import: Switched XLS parser from gdata (no longer updated) back to readxl, which is more actively developed.
- import: Added makeNames argument, to override default internal handling. This allows the user to apply snake case and/or camel case formatting automatically with this argument.
Minor changes:
- import: Added support for skip argument, which allows the user to skip a certain number of lines in the input.
- import and export of source code lines now use readr package internally (read_lines and write_lines) instead of base readLines and writeLines.
Minor changes:
- import: Now setting delim internally for vroom import call, to handle single column data frame import. Otherwise vroom will warn about failing to detect expected delimiter.
Major changes:
- import and export functions now default to using vroom engine instead of data.table. Internally, these now call vroom and vroom_write. We noticed that the data.table fwrite function in particular can have issues writing many files on AWS EC2 instances, resulting in a stack imbalance. The vroom package seems to be more stable currently.
Minor changes:
- loadData, saveData: Switched to using cli_alert instead of cli_text internally for status messages.
Minor changes:
- import: Bug fix for format argument erroring on some supported file types.
Minor changes:
- droplevels: Ensuring S4 generic variant defined in S4Vectors package gets reexported and masks base S3 generic. This helps avoid a C stack usage issue that has popped up in the latest version of R.
- import: Fix for importing JSON files without extension. Can now declare using the format argument. This fix was needed to import GitHub JSON URLs inside new installGitHub function defined in bb8 package.
Minor changes:
- Switched license from MIT to GPL-3.
- Ensure that coerce method is reexported -- thanks @dpryan79 for catching this issue in basejump.
Major changes:
- Renamed package from brio to pipette, in preparation of CRAN submission.
- export: Reworked internal methods to use new compress and decompress functions defined in acidbase package.
- localOrRemoteFile: Reworked to use new decompress defined in acidbase internally, with improved tempfile handling.
- Migrated coercion methods and other utilities from the now archived transformer package: atomize, coerceToList, droplevels, decode, encode, factorize, matchRowNameColumn, and metadata2.
Minor changes:
- Switched to using cli package for improved messages.
Minor changes:
- import: Added metadata parameter option and improved internal tryCatch handling if call capture fails. This can be the case when nesting the function inside another function, which can cause standardizeCall to fail. Note that match.call doesn't have this problem but doesn't consistently expand the call with default formals as well.
Minor changes:
- export: Removed internal dependencies on as_tibble and as.data.table calls. The tibble package recently changed the default row name handling behavior in as_tibble, which broke the code here. I reworked the internal code to only use base R approaches, so changes in the tidyverse no longer affect the package.
Minor changes:
- Updated package dependencies to require Bioconductor 3.10 release.
Minor changes:
- Improved internal metadata handling using new metadata2 function.
Minor changes:
- NAMESPACE updates to support migration of some low-level functions to the new acidbase package.
New functions:
- getURLDirList: Return a simple character vector of files and subdirectories in a remote directory. Intended for use with FTP servers.
- Also now reexporting the getURL function from RCurl.
Minor changes:
- import: Default format argument has been renamed from "none" to "auto".
- localOrRemoteFile: Improved handling for remote URLs without a file extension.
- pasteURL: Now smartly strips trailing slashes prior to internal paste call.
- naStrings: Reverted back to including only "NA" and "NULL", instead of including empty space strings. Including those results in unwanted messages regarding strip whitespace from data.table fread function.
Major changes:
- Added back internal support for readr package instead of data.table for import and export functions. We have observed stack imbalance and segfault memory dump issues with the latest data.table release (v1.12.4) on multi-core Azure VMs. The engine can now be changed using global options:
  - import: acid.import.engine ("data.table" or "readr").
  - export: acid.export.engine ("data.table" or "readr").

  This new addition is experimental and may be dropped in a future release. We find that readr currently works more reliably for export in some cases for large CSV files, but data.table is generally faster and more robust for data import of CSV and TSV files. We're intentionally keeping these functions simple and not providing a user-facing argument for selecting the internal engine.
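Setting the global options named above would look roughly like this. The sketch only sets and reads the options with base R; whether a given installed release still honors them is not guaranteed, since the entry describes them as experimental.

```r
## Select the readr engine for both directions (option names taken from
## the entry above). Keep the previous values so they can be restored.
oldOpts <- options(
    "acid.import.engine" = "readr",
    "acid.export.engine" = "readr"
)
importEngine <- getOption("acid.import.engine")
exportEngine <- getOption("acid.export.engine")
## Restore the previous settings when done.
options(oldOpts)
```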
Minor changes:
- Converted unnecessary exported global variables into internal globals: compressExtPattern, extPattern, rdataExtPattern, rdataLoadError.
- Updated naStrings to include empty whitespace.
Minor changes:
- Updated data.table and rio dependencies, based on recent data.table 1.12.4 update, which has a lot of changes.
- Fixed internal code to no longer show rownames message, even when rownames = FALSE.
Major changes:
- import: Improved metadata stash approach inside S4 (metadata) and S3 (attributes) return objects. Previously, import metadata was stashed inside "brio", but this has been renamed to "import". Metadata is no longer stashed inside R data objects loaded via import. Simplified the internal code inside import to only stash the call, whereas importer metadata is now handled by an internal .defineImportMetadata function.
Minor changes:
- export: Improved default extension documentation and internal argument matching via match.arg. For matrix files exported simply with dir argument, this will default to CSV format. For sparseMatrix files exported simply with dir argument, this will default to MTX format.
Minor changes:
- import: Improved internal arguments passed to data.table::fread.
- Updated R dependency to 3.6.
Major changes:
- export: Reworked internal code to call data.table::fwrite directly, rather than having to pass to brio::export. Added support for bz2 output for matrix and sparseMatrix classes.
Minor changes:
- No longer warn on MTX import without sidecar rownames and colnames files.
- `import`: Don't attempt to slot attributes for atomic vector return. This applies to source code lines and helps avoid valid name issues when assigning these values to colnames or rownames. I came across this issue while updating the Chromium package to assign names from 10X Genomics sidecar files.
Minor changes:
- Reworked organization and naming of internal importer functions. Now including these in the `import` documentation, for clarity.
- `import`: Added `format` and `setclass` arguments.
- Switched to `data.table::fread` for import of bcbio count matrix. Previously, was using `readr::read_tsv`.
- Import of lines no longer stores brio attributes.
- Reduced number of package dependencies, no longer requiring readr.
Minor changes:
- Added support for `url` calls to `import`, `localOrRemoteFile`, and `loadRemoteData`.
- Improved message consistency.
- Updated basejump dependency versions.
Minor changes:
- Now using acidroxygen package to manage shared roxygen documentation params.
Minor changes:
- `import`: Bug fix for invalid objects (e.g. S4 objects that inherit `SummarizedExperiment`) not returning silently. Had to convert the `try` call to a `tryCatch` call to avoid errors popping up during name checks.
- `import`: Now suppressing `partial match of 'OS' to 'OS.type'` warning for import of XLS files, which is a bug in gdata package.
Bumped version number to reflect changes in basejump dependencies.
Minor changes:
- Improved naming consistency of internal functions.
- Updated package dependency versions.
Minor changes:
- Improvements to Travis Docker and AppVeyor CI checks.
Minor changes:
- `factorize`: Tightened up method support. Now exporting `DataFrame` only.
- Bug fix for `acid.data.frame` global option support.
- `sanitizeNA`: Improved factor return.
- Improved code coverage.
Major changes:
- `loadData`, `loadDataAsName`, and `loadRemoteData` now support overwrite argument. The default behavior of these functions has changed to allow overwriting into the environment by default, matching the base `load` function conventions. If this behavior is undesired, set `options(acid.overwrite = FALSE)` and this will be inherited in all calls.
- Renamed `acid.export.overwrite` and `acid.save.overwrite` to simply use `acid.overwrite` global for IO functions. This was modified now that `loadData` also supports the `overwrite` argument.
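For context, the base `load` convention referred to here is that existing objects are replaced without a warning. The sketch below shows that behavior using base R only, followed by the session-wide opt-out named in the entry above. How the package functions react to the option is not demonstrated here.

```r
# Base R: load() silently replaces an existing object of the same name.
x <- 1
path <- tempfile(fileext = ".rda")
save(x, file = path)
x <- 2
load(path)
x  # the saved value has replaced the newer one

# Opt out of overwriting for the whole session, per the changelog entry.
options(acid.overwrite = FALSE)
```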
Major changes:
- `import`: Removed Google Sheets support. The `googlesheets` R package is currently too buggy, and the replacement `googlesheets4` package isn't stable. This functionality may be added back in a future update.
Introducing breaking changes to `export` method. Now using `object` instead of `x` as the primary argument, and defaulting to the use of `ext` and `dir` instead of recommending the use of `file`, as is the convention in the `brio` package. This makes interactive file export quicker and more intuitive, involving less repetitive variable declarations.
Minor changes:
- S4 generic reexport documentation fixes.
Minor changes:
- Backward compatibility fixes/updates to support R 3.4.
Minor changes:
- Bug fix release. Re-importing rio package to ensure `export()` always works on `data.frame` method.
Minor changes:
- `import`: Added GRP file support for GSEA.
- `import`: Improved Google Sheet import support, using new googlesheets4 package from tidyverse.
- Improved code coverage, getting closer to 100%.
New functions:
- Migrated `sanitizePercent` here from basejump package.
Major changes:
- Switched to using "acid" prefix instead of "basejump" for global `options`. This applies in particular to the `loadData` and `saveData` functions, where the `dir` argument can be set globally for an interactive session using this parameter.
New functions:
- Migrated `removeNA` and `sanitizeNA` from basejump here, so these functions can be imported in freerange package.
Minor changes:
- Migrated code to Acid Genomics.
Minor changes:
- `localOrRemoteFile`: Improved error message for Windows users when tempfile can't be removed successfully. This can happen on systems when the user is not running as Administrator, but doesn't happen on macOS or Linux.
Minor changes:
- `import`: Added initial `rownames` and `colnames` parameter support for `data.frame` import. I strongly recommend leaving these enabled by default. However, these are useful in some edge cases when loading data from remote servers (e.g. WormBase).
- `export`: Removed `...` passthrough for `data.frame` method.
Minor changes:
- `import`: Switched to using `gdata::read.xls` to import legacy XLS binary files. `readxl::read_excel` doesn't work consistently for some files and returns `libxls` error. In the meantime, use gdata package, which is slow but works.
- Removed `validObject` check in `import` call.
Minor changes:
- `localOrRemoteFile`: Binary file extension pattern matching bug fix. Applies to files downloaded on Windows. If `download.file` is not called with mode `wb` (write binary) for files on Windows, decompression will fail.
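A minimal sketch of the `mode = "wb"` pattern using base R only. A local `file://` URL stands in for a remote server so the example runs offline; the file names are hypothetical and this is not the package's internal code.

```r
# Create a small gzip-compressed CSV to act as the "remote" file.
src <- tempfile(fileext = ".csv.gz")
con <- gzfile(src, open = "wt")
writeLines(c("a,b", "1,2"), con)
close(con)

# Download in binary mode so the compressed bytes are copied unchanged.
# On Windows, text mode can corrupt the file and break decompression.
dest <- tempfile(fileext = ".csv.gz")
src_url <- paste0("file://", normalizePath(src, winslash = "/"))
download.file(url = src_url, destfile = dest, mode = "wb", quiet = TRUE)

# Decompress and parse the downloaded copy.
df <- read.csv(gzfile(dest))
```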
Major changes:
- `loadData` and `saveData` now support `list` argument, supporting a character vector of object names. This allows for programmatic use of the functions with standard evaluation. For reference, this approach is inspired by the method defined in `save`.
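The base R `save` convention that inspired this looks roughly as follows. This sketch uses only `save` and `load`. That `saveData` and `loadData` accept `list` in the same way is taken from the entry above and is not shown here.

```r
a <- 1:3
b <- letters

# Object names held in a character vector (standard evaluation),
# rather than passed as bare symbols.
object_names <- c("a", "b")

path <- tempfile(fileext = ".rda")
save(list = object_names, file = path)

# Load into a fresh environment; load() returns the restored names.
env <- new.env()
loaded <- load(path, envir = env)
```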
Minor changes:
- `import`: Data provenance metadata is now slotted into `attributes` for S3 return (e.g. `data.frame`) and `metadata` for S4 return (e.g. `DataFrame`).
Minor changes:
- `export`: Tightened up the method support, removing the `ANY` method. Now we're explicitly exporting `data.frame`, `DataFrame`, `matrix`, and `GRanges` methods. This also has the added benefit of making the documentation more readable.
New functions:
- `fileExt`: An improved variation on `tools::file_ext`.
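For reference, this is the baseline behavior of `tools::file_ext` that `fileExt` builds on. Only the base function is shown; the return values of `fileExt` itself are not documented in this entry, so they are not illustrated.

```r
# tools::file_ext() keeps only the final extension component.
ext_compressed <- tools::file_ext("counts.csv.gz")

# With no extension present it returns an empty string, not NA.
ext_missing <- tools::file_ext("README")
```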
Minor changes:
- `basenameSansExt`: Tightened up this function to return `NA` on match failure. This behaves similarly to the new `fileExt` function.
- Switching back from defunct to deprecated for `sanitizeColData`, `sanitizeRowData`, `sanitizeRowRanges`, since these functions are still in use by bcbioRNASeq v0.2.9.
- Improved NEWS file for previous releases.
Minor changes:
- Documentation fixes and website improvements.
This release helps ensure backward compatibility with R 3.4.
Minor changes:
- Bug fix for assert in `transmit`: Need to wrap `isMatchingRegex` in `all` for backward compatibility with R 3.4.
- Miscellaneous CI fixes to Travis CI and AppVeyor CI.
New functions:
These functions have been migrated from basejump here to brio, since they deal specifically with file input/output:
- `atomize`.
- `decode`.
- `encode`.
- `factorize`.
Removed functions:
- Removed `sanitizeColData` and `sanitizeRowData`. May need to add back in a future release, but removed for time being.
Minor changes:
- `realpath`: Removed unnecessary assert check using `allHaveAccess`, since `normalizePath` already checks for this when `mustWork = TRUE`.
- Added initial code coverage support using testthat.
Minor changes:
- Updated imports to reflect renaming of S4Transformer package to simply transformer.
Minor changes:
- Improved `extPattern` to inherit `compressExtPattern`.
- Improved documentation in `import` regarding compressed file handling.
- Split out NAMESPACE imports into `imports.R` file.
- Updated `sanitizeColData` and `sanitizeRowData` to take advantage of exported `atomize` function.
- Improved Travis CI and AppVeyor CI configuration.
Minor changes:
- Added Travis CI and AppVeyor CI support.
- Improved documentation in `import` regarding supported file formats.
- Split out internal importers into separate R files.
- Fixed return value for `pasteURL`.
- Disabled working examples for `transmit`, since they are failing on Travis CI.
- Documentation fixes and miscellaneous tweaks to pass build checks.
Initial release. Migrated input-output (IO) functions from basejump.