h5lite

h5lite is an R package that provides a simple and lightweight interface for reading and writing HDF5 files.

It is designed for R users who want to save and load common R objects (vectors, factors, matrices, data.frames, lists, and NULLs) to an HDF5 file without needing to understand the low-level details of the HDF5 library.

Why use h5lite?

h5lite is "opinionated" software that prioritizes simplicity and safety for the most common use cases. It handles the tricky parts of HDF5 automatically so you can focus on your data.

  • Simple, R-native API: Use familiar functions like h5_read() and h5_write(). No need to learn a complex new syntax.
  • "It Just Works" Philosophy:
    • Dimensions are handled automatically. R matrices are saved as matrices and read back as matrices.
    • Parent groups are created as needed (e.g., writing to "group/data" works even if "group" doesn't exist).
    • Writing to an existing dataset path overwrites it, just like re-assigning a variable.
  • Safe and Efficient:
    • Numeric data is read back as double to prevent integer overflow surprises.
    • Automatic data type selection saves space by default (e.g., 1:100 is stored as an 8-bit integer).
    • Built-in, easy-to-use compression (compress = TRUE).
  • Easy Installation: h5lite bundles its HDF5 dependency, so installation is a simple install.packages("h5lite"). No need to manage system libraries.
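
The behaviours listed above can be seen in a short session. This is a minimal sketch that assumes the h5_write(file, name, data) and h5_read(file, name) call shapes used in the Quick Start; the dataset paths are made up for illustration, and results are printed rather than asserted.

```r
library(h5lite)

file <- tempfile(fileext = ".h5")

# Parent groups are created on demand: "results" does not exist yet.
h5_write(file, "results/run1/scores", c(0.5, 0.7, 0.9))
print(h5_ls(file, recursive = TRUE))

# Writing to the same path again replaces the dataset.
h5_write(file, "results/run1/scores", c(1, 2))
scores <- h5_read(file, "results/run1/scores")
print(length(scores))

# Integer input is expected to come back as double on read.
h5_write(file, "counts", 1:100)
print(typeof(h5_read(file, "counts")))

unlink(file)
```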

This package uses the HDF5 library developed by The HDF Group (https://www.hdfgroup.org/).

Installation

You can install the released version of h5lite from CRAN:

install.packages("h5lite")

Alternatively, you can install the development version from GitHub:

# install.packages("devtools")  
devtools::install_github("cmmr/h5lite")

Quick Start

The API is designed to be simple and predictable.

library(h5lite)

file <- tempfile(fileext = ".h5")

1. Write Data

Use h5_write() to save R objects. It automatically handles dimensions and chooses an efficient on-disk data type.

# Write a vector
h5_write(file, "data/vector", 1:10)

# Write an R matrix
mat <- matrix(c(1.1, 2.2, 3.3, 4.4), nrow = 2, ncol = 2)
h5_write(file, "data/matrix", mat)

# Write a 3D array
arr <- array(1L:24L, dim = c(2, 3, 4))
h5_write(file, "data/array", arr)

# Write a scalar (I() marks the value as a scalar rather than a length-1 vector)
h5_write(file, "scalar_string", I("Hello!"))

# Write a factor
fac <- as.factor(c("a", "b", "a", "c"))
h5_write(file, "factor_data", fac)

# Write a data.frame
h5_write(file, "mtcars", mtcars)

2. List Contents

Use h5_ls() to see the file structure.

# List all objects recursively
h5_ls(file, recursive = TRUE)
#> "mtcars"       "data"         "data/array"      "data/matrix"    
#> "data/vector"  "factor_data"  "scalar_string"  

# List only the top level
h5_ls(file, recursive = FALSE)
#> "mtcars"  "data"  "factor_data"  "scalar_string"

3. Read Data

Use h5_read() to read data back into R. The function automatically restores the correct dimensions.

# Read the matrix
mat_in <- h5_read(file, "data/matrix")
print(mat_in)
#>      [,1] [,2]
#> [1,]  1.1  3.3
#> [2,]  2.2  4.4

# Verify dimensions
print(dim(mat_in))
#> 2 2

all.equal(mat, mat_in)
#> TRUE
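
The same round trip can be tried with the other object types written in step 1. The following is a hedged sketch: it re-creates a file so that it runs on its own, and it prints the class and dimensions of each result instead of assuming a particular return value.

```r
library(h5lite)

file <- tempfile(fileext = ".h5")

fac <- as.factor(c("a", "b", "a", "c"))
arr <- array(1L:24L, dim = c(2, 3, 4))

# Write a factor, a 3D array, and a data.frame.
h5_write(file, "factor_data", fac)
h5_write(file, "data/array", arr)
h5_write(file, "mtcars", mtcars)

# Read each object back.
fac_in <- h5_read(file, "factor_data")
arr_in <- h5_read(file, "data/array")
df_in  <- h5_read(file, "mtcars")

# Inspect what came back.
print(class(fac_in))
print(dim(arr_in))
print(class(df_in))
print(dim(df_in))

unlink(file)
```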

4. Attributes

You can easily read and write metadata using attributes.

# Write attributes to the "data/matrix" dataset
h5_write_attr(file, "data/matrix", "units", I("meters"))
h5_write_attr(file, "data/matrix", "scale", c(1.0, 1.0))

# List attributes
h5_ls_attr(file, "data/matrix")
#> "scale" "units"

# Read an attribute
units <- h5_read_attr(file, "data/matrix", "units")
print(units)
#> "meters"

See Also

For more detailed guides on specific topics, see the package vignettes. You can access them from within R using browseVignettes("h5lite").

When to Use Another HDF5 Package

h5lite is intentionally simple. If you need advanced control over HDF5 features, you should use a more comprehensive package like rhdf5 (from Bioconductor) or hdf5r (from CRAN).

| Feature          | h5lite                                     | rhdf5 / hdf5r                                           |
|------------------|--------------------------------------------|---------------------------------------------------------|
| Primary Goal     | Simplicity and safety for common R objects | Comprehensive control over all HDF5 features            |
| API Style        | Simple R functions (h5_read, h5_write)     | Wrappers for the complete HDF5 C-API                    |
| Ease of Use      | High. Designed for R users.                | Medium. Requires some HDF5 knowledge.                   |
| Installation     | Easy. Bundled HDF5 library.                | Can be complex (hdf5r requires system library).         |
| Dimension Order  | Automatic. Transposes for you.             | Manual. User must manage C vs. R order.                 |
| Numeric Safety   | Safe. Reads numbers as double.             | User's choice. Can read as integers (risk of overflow). |
| Object Overwrite | Automatic.                                 | Manual. Requires check/delete first.                    |
| Compression      | Simple on/off (compress = TRUE).           | Full control over filters, chunking, etc.               |

Use rhdf5 or hdf5r if you need to:

  • Work with complex or custom HDF5 data types not supported by h5lite (e.g., bitfields, references).
  • Have fine-grained control over file properties, chunking, or compression filters.
  • Perform partial I/O (i.e., read or write a small slice of a very large on-disk dataset).
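
For comparison, partial I/O with rhdf5 looks roughly like the following. This sketch assumes rhdf5 is installed from Bioconductor and uses its h5createFile(), h5write(), and h5read() functions; the dataset name "big" and the matrix size are placeholders chosen for illustration.

```r
library(rhdf5)

file <- tempfile(fileext = ".h5")

# Create the file and store a 4 x 5 matrix in it.
h5createFile(file)
mat <- matrix(1:20, nrow = 4, ncol = 5)
h5write(mat, file, "big")

# Read only rows 1 and 2; NULL selects everything along that dimension.
slice <- h5read(file, "big", index = list(1:2, NULL))
print(dim(slice))

h5closeAll()
unlink(file)
```

Only the requested rows are read from disk, which is the kind of slicing that h5lite leaves to these packages.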

Use h5lite if you want to:

  • Quickly and safely get data into or out of a file.
  • Avoid thinking about low-level details.

License

MIT (see LICENSE.md). The repository also contains a separate LICENSE file.