Merged
43 commits
667724a
Optimize VDS operations with r-tree
mattjala Sep 16, 2025
8d10f8c
Changes for reworked tree interface
mattjala Sep 17, 2025
af9e0ff
Clean up search results on failure
mattjala Sep 18, 2025
9d14f09
Committing clang-format changes
github-actions[bot] Sep 18, 2025
cda3af8
Move release note
mattjala Sep 18, 2025
7a8e363
Add threshold for r-tree usage
mattjala Sep 22, 2025
217be34
Add Fortran API test
mattjala Sep 23, 2025
9a84834
Delay DAPL check
mattjala Sep 23, 2025
391b001
Convert error check to assert
mattjala Sep 23, 2025
c37bad5
Convert should_insert helper to macro
mattjala Sep 23, 2025
6041439
Convert record to void pointer
mattjala Sep 23, 2025
2487776
Update results handling
mattjala Sep 23, 2025
e59e47e
Rework is_in_tree to not_in_tree_list
mattjala Sep 23, 2025
1de7b6e
Update results handling
mattjala Sep 23, 2025
f6eef67
Add bullet point to CHANGELOG executive summary
mattjala Sep 25, 2025
eab9a6f
Move tree creation higher up
mattjala Sep 26, 2025
f59f054
Limit number of H5D__virtual_read_one calls
mattjala Sep 26, 2025
e2dfe89
Limit number of operations in post_io
mattjala Sep 26, 2025
186c7b5
Limit number of write calls
mattjala Sep 26, 2025
5688368
Add TODO notes
mattjala Sep 26, 2025
9f809da
Clang-format
mattjala Sep 26, 2025
2805ba8
Add rtree R/W test
mattjala Sep 29, 2025
391b6e8
Factor out test check to helper
mattjala Sep 29, 2025
e6d689a
Add write init test cases
mattjala Sep 29, 2025
1df23b5
Remove extent check for src dataspaces
mattjala Sep 29, 2025
5727f41
Cleanup
mattjala Sep 30, 2025
d5dd171
Expand java documentation
mattjala Sep 30, 2025
c8abd88
Streamline leaf validity check
mattjala Sep 30, 2025
8b3dcb0
Optimize allocation in layout copy
mattjala Sep 30, 2025
e0a7fd4
Add func description for H5D__virtual_pre_io_process_mapping
mattjala Sep 30, 2025
93d64f2
Add function end comments
mattjala Sep 30, 2025
358f39c
Grow not_in_tree list allocation by powers of 2
mattjala Sep 30, 2025
debe751
Add file end newline
mattjala Oct 1, 2025
1f8070e
Expand Fortran documentation
mattjala Oct 1, 2025
d1450f8
Update C documentation
mattjala Oct 1, 2025
039c90c
Rename DAPL functions
mattjala Oct 2, 2025
0f72161
Remove un-needed defines
mattjala Oct 2, 2025
ca2c3fa
Committing clang-format changes
github-actions[bot] Oct 2, 2025
8a45e21
Correct un-renamed functions in java/fortran wrappers
mattjala Oct 2, 2025
1a4ed86
Remove git conflict markers from CHANGELOG
mattjala Oct 3, 2025
b1b43be
H5Pset/get_virtual_dset_use_spatial_tree() -> H5Pset/get_virtual_spat…
mattjala Oct 3, 2025
cd374d9
Committing clang-format changes
github-actions[bot] Oct 3, 2025
86624f9
Attempt to close all mapping dataspaces even during failure
mattjala Oct 6, 2025
4 changes: 4 additions & 0 deletions doxygen/examples/tables/propertyLists.dox
@@ -616,6 +616,10 @@ encoding for object names.</td>
<td>#H5Pset_virtual_view/#H5Pget_virtual_view</td>
<td>Sets/gets the view of the virtual dataset (VDS) to include or exclude missing mapped elements.</td>
</tr>
<tr>
<td>#H5Pset_virtual_spatial_tree/#H5Pget_virtual_spatial_tree</td>
<td>Sets/gets the flag to use spatial trees when searching many VDS mappings.</td>
</tr>
</table>
//! [dapl_table]
*
90 changes: 90 additions & 0 deletions fortran/src/H5Pff.F90
@@ -4365,6 +4365,96 @@ END FUNCTION h5pget_chunk_cache_c

END SUBROUTINE h5pget_chunk_cache_f

!>
!! \ingroup FH5P
!!
!! \brief Retrieves the flag that controls whether a spatial tree is used
!! during mapping operations on a Virtual Dataset. The default value is true.
!!
!! Use of a spatial tree will accelerate the process of searching through mappings
!! to determine which contain intersections with the user's selection region.
!! With the tree disabled, all mappings will simply be iterated through and
!! checked directly.
!!
!! Certain workflows may find that tree creation overhead outweighs the time saved
!! on reads. In this case, disabling this property will lead to a performance improvement,
!! though it is expected that almost all cases will benefit from the tree on net.
!!
!! \param dapl_id Target dataset access property list identifier.
!! \param use_tree Current value of the flag: .TRUE. if a spatial tree is used, .FALSE. otherwise.
!! \param hdferr \fortran_error
!!
!! See C API: @ref H5Pget_virtual_spatial_tree()
!!
SUBROUTINE h5pget_virtual_spatial_tree_f(dapl_id, use_tree, hdferr)
IMPLICIT NONE
INTEGER(HID_T) , INTENT(IN) :: dapl_id
LOGICAL , INTENT(OUT) :: use_tree
INTEGER , INTENT(OUT) :: hdferr
LOGICAL(C_BOOL) :: c_use_tree

INTERFACE
INTEGER(C_INT) FUNCTION H5Pget_virtual_spatial_tree_c(dapl_id, use_tree) &
BIND(C, NAME='H5Pget_virtual_spatial_tree')
IMPORT :: C_INT, HID_T, C_BOOL
IMPLICIT NONE
INTEGER(HID_T), INTENT(IN), VALUE :: dapl_id
LOGICAL(C_BOOL), INTENT(OUT) :: use_tree
END FUNCTION H5Pget_virtual_spatial_tree_c
END INTERFACE

hdferr = INT(H5Pget_virtual_spatial_tree_c(dapl_id, c_use_tree))

! Transfer value of C C_BOOL type to Fortran LOGICAL
use_tree = c_use_tree

END SUBROUTINE h5pget_virtual_spatial_tree_f

!>
!! \ingroup FH5P
!!
!! \brief Sets whether the dapl uses a spatial tree
!! during mapping operations on a Virtual Dataset. The default value is true.
!!
!! Use of a spatial tree will accelerate the process of searching through mappings
!! to determine which contain intersections with the user's selection region.
!! With the tree disabled, all mappings will simply be iterated through and
!! checked directly.
!!
!! Certain workflows may find that tree creation overhead outweighs the time saved
!! on reads. In this case, disabling this property will lead to a performance improvement,
!! though it is expected that almost all cases will benefit from the tree on net.
!!
!! \param dapl_id Target dataset access property list identifier.
!! \param use_tree New value of the flag: .TRUE. to use a spatial tree, .FALSE. to disable it.
!! \param hdferr \fortran_error
!!
!! See C API: @ref H5Pset_virtual_spatial_tree()
!!
SUBROUTINE h5pset_virtual_spatial_tree_f(dapl_id, use_tree, hdferr)
IMPLICIT NONE
INTEGER(HID_T) , INTENT(IN) :: dapl_id
LOGICAL , INTENT(IN) :: use_tree
INTEGER , INTENT(OUT) :: hdferr
LOGICAL(C_BOOL) :: c_use_tree

INTERFACE
INTEGER(C_INT) FUNCTION h5pset_virtual_spatial_tree_c(dapl_id, use_tree) &
BIND(C, NAME='H5Pset_virtual_spatial_tree')
IMPORT :: C_INT, HID_T, C_BOOL
IMPLICIT NONE
INTEGER(HID_T), INTENT(IN), VALUE :: dapl_id
LOGICAL(C_BOOL), INTENT(IN), VALUE :: use_tree
END FUNCTION h5pset_virtual_spatial_tree_c
END INTERFACE

! Transfer value of Fortran LOGICAL to C C_BOOL type
c_use_tree = use_tree

hdferr = INT(h5pset_virtual_spatial_tree_c(dapl_id, c_use_tree))

END SUBROUTINE h5pset_virtual_spatial_tree_f

#ifdef H5_DOXYGEN
!>
!! \ingroup FH5P
2 changes: 2 additions & 0 deletions fortran/src/hdf5_fortrandll.def.in
@@ -420,6 +420,8 @@ H5P_mp_H5PGET_VIRTUAL_VSPACE_F
H5P_mp_H5PGET_VIRTUAL_SRCSPACE_F
H5P_mp_H5PGET_VIRTUAL_FILENAME_F
H5P_mp_H5PGET_VIRTUAL_DSETNAME_F
H5P_mp_H5PGET_VIRTUAL_SPATIAL_TREE_F
H5P_mp_H5PSET_VIRTUAL_SPATIAL_TREE_F
H5P_mp_H5PGET_DSET_NO_ATTRS_HINT_F
H5P_mp_H5PSET_DSET_NO_ATTRS_HINT_F
H5P_mp_H5PSET_VOL_F
35 changes: 35 additions & 0 deletions fortran/test/tH5P.F90
@@ -777,8 +777,10 @@ SUBROUTINE test_misc_properties(total_error)
INTEGER, INTENT(INOUT) :: total_error

INTEGER(hid_t) :: fapl_id = -1 ! Local fapl
INTEGER(hid_t) :: dapl_id = -1 ! Local dapl
LOGICAL :: use_file_locking ! (H5Pset/get_file_locking_f)
LOGICAL :: ignore_disabled_locks ! (H5Pset/get_file_locking_f)
LOGICAL :: use_spatial_tree ! (H5Pset/get_virtual_spatial_tree_f)
INTEGER :: error

! Create a default fapl
@@ -826,6 +828,39 @@ SUBROUTINE test_misc_properties(total_error)
CALL H5Pclose_f(fapl_id, error)
CALL check("H5Pclose_f", error, total_error)

! Create a dataset access property list
CALL H5Pcreate_f(H5P_DATASET_ACCESS_F, dapl_id, error)
CALL check("H5Pcreate_f", error, total_error)

! Test H5Pset/get_virtual_spatial_tree_f
! true value
use_spatial_tree = .TRUE.
CALL h5pset_virtual_spatial_tree_f(dapl_id, use_spatial_tree, error)
CALL check("h5pset_virtual_spatial_tree_f", error, total_error)
use_spatial_tree = .FALSE.
CALL h5pget_virtual_spatial_tree_f(dapl_id, use_spatial_tree, error)
CALL check("h5pget_virtual_spatial_tree_f", error, total_error)
if(use_spatial_tree .neqv. .TRUE.) then
total_error = total_error + 1
write(*,*) "Got wrong use_spatial_tree flag from h5pget_virtual_spatial_tree_f"
endif

! false value
use_spatial_tree = .FALSE.
CALL h5pset_virtual_spatial_tree_f(dapl_id, use_spatial_tree, error)
CALL check("h5pset_virtual_spatial_tree_f", error, total_error)
use_spatial_tree = .TRUE.
CALL h5pget_virtual_spatial_tree_f(dapl_id, use_spatial_tree, error)
CALL check("h5pget_virtual_spatial_tree_f", error, total_error)
if(use_spatial_tree .neqv. .FALSE.) then
total_error = total_error + 1
write(*,*) "Got wrong use_spatial_tree flag from h5pget_virtual_spatial_tree_f"
endif

! Close the dapl
CALL H5Pclose_f(dapl_id, error)
CALL check("H5Pclose_f", error, total_error)

END SUBROUTINE test_misc_properties

!-------------------------------------------------------------------------
53 changes: 53 additions & 0 deletions java/src/hdf/hdf5lib/H5.java
@@ -10675,6 +10675,59 @@ public synchronized static native void H5Pset_virtual_prefix(long dapl_id, Strin
public synchronized static native void H5Pset_efile_prefix(long dapl_id, String prefix)
throws HDF5LibraryException, NullPointerException;

/**
* @ingroup JH5P
*
* H5Pget_virtual_spatial_tree retrieves the flag that controls whether a spatial tree is used
* during mapping operations on a Virtual Dataset. The default value is true.
*
* Use of a spatial tree will accelerate the process of searching through mappings
* to determine which contain intersections with the user's selection region.
* With the tree disabled, all mappings will simply be iterated through and
* checked directly.
*
* Certain workflows may find that tree creation overhead outweighs the time saved
* on reads. In this case, disabling this property will lead to a performance improvement,
* though it is expected that almost all cases will benefit from the tree on net.
*
* @param dapl_id
* IN: Dataset access property list
*
* @return true if the given dapl is set to use a spatial tree, false if not.
*
* @exception HDF5LibraryException
* Error from the HDF5 Library.
**/
public synchronized static native boolean H5Pget_virtual_spatial_tree(long dapl_id)
throws HDF5LibraryException;

/**
* @ingroup JH5P
*
* H5Pset_virtual_spatial_tree sets whether the dapl uses a spatial tree
* during mapping operations on a Virtual Dataset. The default value is true.
*
* Use of a spatial tree will accelerate the process of searching through mappings
* to determine which contain intersections with the user's selection region.
* With the tree disabled, all mappings will simply be iterated through and
* checked directly.
*
* Certain workflows may find that tree creation overhead outweighs the time saved
* on reads. In this case, disabling this property will lead to a performance improvement,
* though it is expected that almost all cases will benefit from the tree on net.
*
* @param dapl_id
* IN: Dataset access property list
*
* @param use_tree
* IN: true to use a spatial tree, false to disable it
*
* @exception HDF5LibraryException
* Error from the HDF5 Library.
**/
public synchronized static native void H5Pset_virtual_spatial_tree(long dapl_id, boolean use_tree)
throws HDF5LibraryException;

// public synchronized static native void H5Pset_append_flush(long plist_id, int ndims, long[] boundary,
// H5D_append_cb func, H5D_append_t udata) throws HDF5LibraryException;

45 changes: 45 additions & 0 deletions java/src/jni/h5pDAPLImp.c
@@ -312,6 +312,51 @@ H5D_append_cb(hid_t dataset_id, hsize_t *cur_dims, void *cb_data)
return (herr_t)status;
} /* end H5D_append_cb */

/*
* Class: hdf_hdf5lib_H5
* Method: H5Pset_virtual_spatial_tree
* Signature: (JZ)V
*/
JNIEXPORT void JNICALL
Java_hdf_hdf5lib_H5_H5Pset_1virtual_1spatial_1tree(JNIEnv *env, jclass clss, jlong dapl_id, jboolean use_tree)
{
bool use_tree_val;
herr_t retVal = FAIL;

UNUSED(clss);

use_tree_val = (JNI_TRUE == use_tree) ? true : false;

if ((retVal = H5Pset_virtual_spatial_tree((hid_t)dapl_id, (bool)use_tree_val)) < 0)
H5_LIBRARY_ERROR(ENVONLY);

done:
return;
} /* end Java_hdf_hdf5lib_H5_H5Pset_1virtual_1spatial_1tree */

/*
* Class: hdf_hdf5lib_H5
* Method: H5Pget_virtual_spatial_tree
* Signature: (J)Z
*/
JNIEXPORT jboolean JNICALL
Java_hdf_hdf5lib_H5_H5Pget_1virtual_1spatial_1tree(JNIEnv *env, jclass clss, jlong dapl_id)
{
bool use_tree = false;
jboolean bval = JNI_FALSE;

UNUSED(clss);

if (H5Pget_virtual_spatial_tree((hid_t)dapl_id, (bool *)&use_tree) < 0)
H5_LIBRARY_ERROR(ENVONLY);

if (use_tree == true)
bval = JNI_TRUE;

done:
return bval;
} /* end Java_hdf_hdf5lib_H5_H5Pget_1virtual_1spatial_1tree */

#ifdef __cplusplus
} /* end extern "C" */
#endif /* __cplusplus */
14 changes: 14 additions & 0 deletions java/src/jni/h5pDAPLImp.h
@@ -89,6 +89,20 @@ JNIEXPORT void JNICALL Java_hdf_hdf5lib_H5_H5Pset_1virtual_1printf_1gap(JNIEnv *
*/
JNIEXPORT jlong JNICALL Java_hdf_hdf5lib_H5_H5Pget_1virtual_1printf_1gap(JNIEnv *, jclass, jlong);

/*
* Class: hdf_hdf5lib_H5
* Method: H5Pset_virtual_spatial_tree
* Signature: (JZ)V
*/
JNIEXPORT void JNICALL Java_hdf_hdf5lib_H5_H5Pset_1virtual_1spatial_1tree(JNIEnv *, jclass, jlong, jboolean);

/*
* Class: hdf_hdf5lib_H5
* Method: H5Pget_virtual_spatial_tree
* Signature: (J)Z
*/
JNIEXPORT jboolean JNICALL Java_hdf_hdf5lib_H5_H5Pget_1virtual_1spatial_1tree(JNIEnv *, jclass, jlong);

#ifdef __cplusplus
} /* end extern "C" */
#endif /* __cplusplus */
21 changes: 21 additions & 0 deletions release_docs/CHANGELOG.md
@@ -25,6 +25,7 @@ For releases prior to version 2.0.0, please see the release.txt file and for mor

## Performance Enhancements:

- Up to [2500% faster](https://github.com/HDFGroup/hdf5/blob/develop/release_docs/CHANGELOG.md#rtree) Virtual Dataset read/write operations.
- [30% faster opening](https://github.com/HDFGroup/hdf5/blob/develop/release_docs/CHANGELOG.md#layoutcopydelay) and [25% faster closing](https://github.com/HDFGroup/hdf5/blob/develop/release_docs/CHANGELOG.md#fileformat) of virtual datasets.
- [Reduced memory overhead](https://github.com/HDFGroup/hdf5/blob/develop/release_docs/CHANGELOG.md#fileformat) via shared name strings and optimized spatial search algorithms for virtual datasets.

@@ -461,6 +462,26 @@ Simple example programs showing how to use complex number datatypes have been ad

This layout copy is now delayed until either a user requests the DCPL, or until the start of an operation that needs to read the layout from the DCPL.

### Virtual datasets now use a spatial tree to optimize searches<a name="rtree"></a>

Virtual dataset operations with many (>1,000) mappings were much slower than
corresponding operations on normal datasets. This was due to the need
to iterate through every source dataset's dataspace and check for an intersection
with the user-selected region for a read/write in the virtual dataset.

Virtual datasets with many mappings now use an r-tree (defined in H5RT.c) to
perform a spatial search. This allows the dataspaces that intersect the
user's selection to be found with, in most cases, far fewer intersection checks,
improving the speed of VDS read/write operations.
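
As a rough illustration of why a spatial index reduces the number of checks, the sketch below compares a linear scan over mapping bounding boxes with a scan that first tests one enclosing box per group of mappings. It is a simplified, single-level stand-in and not the r-tree in H5RT.c; the names `bbox_t`, `group_t`, `bbox_intersects`, `count_hits_linear`, and `count_hits_grouped` are hypothetical, and the fixed rank of 2 is an assumption made for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RANK 2 /* assumption: 2-D selections, for brevity */

/* Axis-aligned bounding box with inclusive bounds (hypothetical type) */
typedef struct {
    long min[RANK];
    long max[RANK];
} bbox_t;

/* A group of mappings plus a box that encloses all of them (hypothetical type) */
typedef struct {
    bbox_t        bounds;
    const bbox_t *maps;
    size_t        n;
} group_t;

/* Two boxes intersect iff their extents overlap in every dimension */
static bool
bbox_intersects(const bbox_t *a, const bbox_t *b)
{
    for (int d = 0; d < RANK; d++)
        if (a->max[d] < b->min[d] || b->max[d] < a->min[d])
            return false;
    return true;
}

/* Linear search: exactly one intersection check per mapping */
static size_t
count_hits_linear(const bbox_t *maps, size_t n, const bbox_t *sel, size_t *checks)
{
    size_t hits = 0;

    assert(maps && sel && checks);
    *checks = 0;
    for (size_t i = 0; i < n; i++) {
        (*checks)++;
        if (bbox_intersects(&maps[i], sel))
            hits++;
    }
    return hits;
}

/* Grouped search: a miss on the enclosing box skips the whole group */
static size_t
count_hits_grouped(const group_t *groups, size_t ngroups, const bbox_t *sel, size_t *checks)
{
    size_t hits = 0;

    assert(groups && sel && checks);
    *checks = 0;
    for (size_t g = 0; g < ngroups; g++) {
        (*checks)++;
        if (!bbox_intersects(&groups[g].bounds, sel))
            continue; /* pruned: none of this group's mappings can intersect */
        for (size_t i = 0; i < groups[g].n; i++) {
            (*checks)++;
            if (bbox_intersects(&groups[g].maps[i], sel))
                hits++;
        }
    }
    return hits;
}
```

Both functions return the same number of intersecting mappings; the grouped version simply performs fewer checks when whole groups lie outside the selection. A real r-tree applies the same pruning idea recursively over several levels.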

Virtual datasets will use the r-tree by default, since the majority of use cases
should see improvements from use of the tree. However, because some workflows may
find that the overhead of the tree outweighs the time saved on searches, there is
a new Dataset Access Property List (DAPL) property to control use of the spatial tree.

This property can be set or queried with the new API functions
H5Pset_virtual_spatial_tree()/H5Pget_virtual_spatial_tree().

## Parallel Library

### Added H5FDsubfiling_get_file_mapping() API function for subfiling VFD
4 changes: 4 additions & 0 deletions src/H5Dprivate.h
@@ -54,6 +54,7 @@
#define H5D_ACS_VDS_PREFIX_NAME "vds_prefix" /* VDS file prefix */
#define H5D_ACS_APPEND_FLUSH_NAME "append_flush" /* Append flush actions */
#define H5D_ACS_EFILE_PREFIX_NAME "external file prefix" /* External file prefix */
#define H5D_ACS_USE_TREE_NAME "tree" /* Whether to use spatial tree */

/* ======== Data transfer properties ======== */
#define H5D_XFER_MAX_TEMP_BUF_NAME "max_temp_buf" /* Maximum temp buffer size */
@@ -124,6 +125,9 @@
/* Default virtual dataset list size */
#define H5D_VIRTUAL_DEF_LIST_SIZE 8

/* Threshold for use of a tree for VDS mappings */
#define H5D_VIRTUAL_TREE_THRESHOLD 50

#ifdef H5D_MODULE
#define H5D_OBJ_ID(D) (((H5D_obj_create_t *)(D))->dcpl_id)
#else /* H5D_MODULE */
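
The diff shown here defines the threshold but not the code that consults it. The sketch below is one plausible reading, assuming the tree is built only when the DAPL flag is set and the mapping count reaches `H5D_VIRTUAL_TREE_THRESHOLD`. The helper name `should_use_tree` and the use of `>=` rather than `>` are assumptions, not taken from the library source.

```c
#include <stdbool.h>
#include <stddef.h>

/* Value copied from H5Dprivate.h in this change */
#define H5D_VIRTUAL_TREE_THRESHOLD 50

/* Hypothetical helper: decide whether to build a spatial tree for a VDS.
 * Assumption: both conditions must hold, and the comparison is inclusive. */
static bool
should_use_tree(size_t num_mappings, bool dapl_use_tree)
{
    return dapl_use_tree && num_mappings >= H5D_VIRTUAL_TREE_THRESHOLD;
}
```

Under this reading, a virtual dataset with only a handful of mappings always takes the linear path, so small datasets do not pay the tree construction cost.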