h5lite is an R package that provides a simple and lightweight interface for reading and writing HDF5 files.
It is designed for R users who want to save and load common R objects (vectors, factors, matrices, data.frames, lists, and NULLs) to an HDF5 file without needing to understand the low-level details of the HDF5 library.
h5lite is "opinionated" software that prioritizes simplicity and safety for the most common use cases. It handles the tricky parts of HDF5 automatically so you can focus on your data.
- Simple, R-native API: Use familiar functions like `h5_read()` and `h5_write()`. No need to learn a complex new syntax.
- "It Just Works" Philosophy:
  - Dimensions are handled automatically. R matrices are saved as matrices and read back as matrices.
  - Parent groups are created as needed (e.g., writing to `"group/data"` works even if `"group"` doesn't exist).
  - Writing to an existing dataset path overwrites it, just like re-assigning a variable.
- Safe and Efficient:
  - Numeric data is read back as `double` to prevent integer overflow surprises.
  - Automatic data type selection saves space by default (e.g., `1:100` is stored as an 8-bit integer).
  - Built-in, easy-to-use compression (`compress = TRUE`).
- Easy Installation: `h5lite` bundles its HDF5 dependency, so installation is a simple `install.packages("h5lite")`. No need to manage system libraries.
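The two "Safe and Efficient" points above can be illustrated with a small base-R sketch. It makes no h5lite calls, and `smallest_int_type()` is a hypothetical helper written for this illustration; it is not h5lite's actual selection logic, which may differ (for example, in whether it prefers signed or unsigned types).

```r
# Illustration only: base R, no h5lite calls.

# 1. Why reading as double avoids surprises: R integers are 32-bit,
#    so arithmetic past .Machine$integer.max yields NA (with a warning).
big_int    <- .Machine$integer.max
overflowed <- suppressWarnings(big_int + 1L)  # integer overflow
as_double  <- as.numeric(big_int) + 1         # exact as a double

# 2. Hypothetical helper: pick the narrowest integer width that can
#    hold every value in x, falling back to a 64-bit float.
smallest_int_type <- function(x) {
  stopifnot(is.numeric(x), !anyNA(x), all(x == round(x)))
  r <- range(x)
  if (r[1] >= 0) {
    # Unsigned candidates: compare the maximum against each upper limit.
    limits <- c(uint8 = 2^8 - 1, uint16 = 2^16 - 1, uint32 = 2^32 - 1)
    fits   <- r[2] <= limits
  } else {
    # Signed candidates: both ends of the range must fit.
    limits <- c(int8 = 2^7, int16 = 2^15, int32 = 2^31)
    fits   <- r[1] >= -limits & r[2] <= limits - 1
  }
  if (any(fits)) names(limits)[which(fits)[1]] else "float64"
}

smallest_int_type(1:100)       # an 8-bit type is enough
smallest_int_type(c(-5, 300))  # needs a wider, signed type
```

The point of the sketch is the trade-off: narrow types on disk save space, while reading back as `double` means later arithmetic in R cannot silently overflow.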
This package uses the HDF5 library developed by The HDF Group (https://www.hdfgroup.org/).
You can install the released version of h5lite from CRAN:
install.packages("h5lite")

Alternatively, you can install the development version from GitHub:
# install.packages("devtools")
devtools::install_github("cmmr/h5lite")

The API is designed to be simple and predictable.
library(h5lite)
file <- tempfile(fileext = ".h5")

Use `h5_write()` to save R objects. It automatically handles dimensions and chooses an efficient on-disk data type.
# Write a vector
h5_write(file, "data/vector", 1:10)
# Write an R matrix
mat <- matrix(c(1.1, 2.2, 3.3, 4.4), nrow = 2, ncol = 2)
h5_write(file, "data/matrix", mat)
# Write a 3D array
arr <- array(1L:24L, dim = c(2, 3, 4))
h5_write(file, "data/array", arr)
# Write a scalar
h5_write(file, "scalar_string", I("Hello!"))
# Write a factor
fac <- as.factor(c("a", "b", "a", "c"))
h5_write(file, "factor_data", fac)
# Write a data.frame
h5_write(file, "mtcars", mtcars)

Use `h5_ls()` to see the file structure.
# List all objects recursively
h5_ls(file, recursive = TRUE)
#> "mtcars" "data" "data/array" "data/matrix"
#> "data/vector" "factor_data" "scalar_string"
# List only the top level
h5_ls(file, recursive = FALSE)
#> "mtcars" "data" "factor_data" "scalar_string"

Use `h5_read()` to read data back into R. The function automatically restores the correct dimensions.
# Read the matrix
mat_in <- h5_read(file, "data/matrix")
print(mat_in)
#>      [,1] [,2]
#> [1,]  1.1  3.3
#> [2,]  2.2  4.4
# Verify dimensions
print(dim(mat_in))
#> 2 2
all.equal(mat, mat_in)
#> TRUE

You can easily read and write metadata using attributes.
# Write attributes to the "data/matrix" dataset
h5_write_attr(file, "data/matrix", "units", I("meters"))
h5_write_attr(file, "data/matrix", "scale", c(1.0, 1.0))
# List attributes
h5_ls_attr(file, "data/matrix")
#> "scale" "units"
# Read an attribute
units <- h5_read_attr(file, "data/matrix", "units")
print(units)
#> "meters"

For more detailed guides on specific topics, see the package vignettes:
- Getting Started with h5lite: A general introduction.
- Working with Atomic Vectors: Details on vectors and scalars.
- Working with Matrices and Arrays: Handling multi-dimensional data.
- Working with Data Frames: Using compound datasets.
- Data Organization: Using groups and lists to structure files.
- Attributes In-Depth: A deep dive into metadata handling.
- Using h5lite with Parallel Processing: Guide for multi-threaded and multi-process access.
You can also access these vignettes from within R using browseVignettes("h5lite").
h5lite is intentionally simple. If you need advanced control over HDF5 features, you should use a more comprehensive package like rhdf5 (from Bioconductor) or hdf5r (from CRAN).
| Feature | `h5lite` | `rhdf5` / `hdf5r` |
|---|---|---|
| Primary Goal | Simplicity and safety for common R objects | Comprehensive control over all HDF5 features |
| API Style | Simple R functions (`h5_read`, `h5_write`) | Wrappers for the complete HDF5 C-API |
| Ease of Use | High. Designed for R users. | Medium. Requires some HDF5 knowledge. |
| Installation | Easy. Bundled HDF5 library. | Can be complex (`hdf5r` requires a system library). |
| Dimension Order | Automatic. Transposes for you. | Manual. User must manage C vs. R order. |
| Numeric Safety | Safe. Reads numbers as `double`. | User's choice. Can read as integers (risk of overflow). |
| Object Overwrite | Automatic. | Manual. Requires check/delete first. |
| Compression | Simple on/off (`compress = TRUE`). | Full control over filters, chunking, etc. |
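The "Dimension Order" row refers to the mismatch between R's column-major layout and the row-major (C) layout used on disk. The sketch below shows that mismatch in base R only. It does not touch any HDF5 library and is not a description of how either package implements the conversion.

```r
# Illustration only: base R, no HDF5 library involved.
m <- matrix(1:6, nrow = 2, ncol = 3)

# R stores matrices column-major: the memory order is 1 2 3 4 5 6.
r_order <- as.vector(m)

# A row-major (C-order) writer lays the same matrix out row by row.
c_order <- as.vector(t(m))

# Reading that row-major buffer naively with R's dims scrambles the values ...
naive <- matrix(c_order, nrow = 2, ncol = 3)

# ... so the reader has to reverse the dims and transpose to restore it.
restored <- t(matrix(c_order, nrow = 3, ncol = 2))

identical(naive, m)
identical(restored, m)
```

With a lower-level interface this bookkeeping is the caller's job; the table's claim is that h5lite does it for you.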
Use rhdf5 or hdf5r if you need to:
- Work with complex or custom HDF5 data types not supported by `h5lite` (e.g., bitfields, references).
- Have fine-grained control over file properties, chunking, or compression filters.
- Perform partial I/O (i.e., read or write a small slice of a very large on-disk dataset).
Use h5lite if you want to:
- Quickly and safely get data into or out of a file.
- Avoid thinking about low-level details.
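
As a minimal sketch of that workflow, the round trip below uses only `h5_write()` and `h5_read()` with the argument order shown earlier in this README. The path `"results/scores"` is a made-up name for illustration, and the comments restate behavior claimed in the feature list rather than independently verified output.

```r
library(h5lite)

file <- tempfile(fileext = ".h5")

# "results" does not exist yet; per the feature list, parent groups
# are created as needed.
h5_write(file, "results/scores", 1:5)

# Writing to the same path again replaces the earlier dataset,
# much like re-assigning a variable.
h5_write(file, "results/scores", c(10, 20, 30))

# Read the current contents back into R and inspect the storage type.
scores <- h5_read(file, "results/scores")
print(scores)
print(typeof(scores))

# Clean up the temporary file.
unlink(file)
```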
