HDF5 Working Group
This meeting series has been temporarily cancelled. Please watch the forum for details on the new schedule.
The HDF5 Working Group meets weekly, on Thursdays at 10 am Central time. This meeting is for HDF5 library developers and anyone is welcome to attend. The purpose of this meeting is to discuss HDF5 library development. It is NOT intended for providing technical support.
Zoom link: https://us06web.zoom.us/j/89601195963
The agenda and any cancellations will be posted below and on the forum, usually on Monday. If you have any action items to discuss, please email derobins at hdfgroup dot org to get them added to the agenda. If there are no pressing issues by Monday, the meeting may be cancelled.
Review threadsafe H5FL package changes and testing (Quincey)
- No meeting 26 December
- Threadsafe object locking
- Hosted by Neil Fortner; Dana is at AGU24 this week.
- Please feel free to add your agenda items here!
- Multithreaded collective I/O operations (Quincey)
- HDF5 2.0 goings on
- Autotools delete pending
- The C++ wrappers will survive for another day (useful for RAII)
- zlib default behavior?
- On or off?
- Fail or not?
- Do we need H5Eprint3()? GH #4698 (https://github.com/HDFGroup/hdf5/issues/4698)
- Discuss metadata performance issues:
- Metadata cache config defaults
- Maximum metadata cache size
- Maximum "est_num_entries" size
- Cache image disallowed in read only mode
- Current behavior:
- Default ON
- Not a configure failure if not found
- If you rely on the default ON, and the zlib isn't found, you will create an HDF5 library with no compression
- If we make the default OFF, people who rely on the defaults will create HDF5 libraries with no compression
- Decisions:
- CMake will fail if a selected option cannot be built
- We'll leave zlib ON and try to do the best we can on platforms like Windows
- Nobody complained, we'll version
- 100%
- Could be different in Windows/POSIX
- Still issues with Windows Debug/Release
- Problem with max number of external links in EFC (16-bit limit?)
- Library complains when more than 64k, Werner has 140k
- HDF5 should not arbitrarily have limits, even if high values will typically make a file system sad
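To illustrate why a 16-bit count would produce a failure at roughly 64k entries, here is a small C sketch. It assumes, as the notes only speculate, that the limit comes from a 16-bit field. The helper names `fits_in_u16` and `truncate_to_u16` are hypothetical and are not part of HDF5.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not an HDF5 function): reports whether a count
 * can be stored in a 16-bit unsigned field without truncation. */
static bool fits_in_u16(uint64_t count)
{
    return count <= UINT16_MAX; /* UINT16_MAX is 65535 */
}

/* What a silent narrowing conversion would store instead:
 * only the low 16 bits of the count survive. */
static uint16_t truncate_to_u16(uint64_t count)
{
    return (uint16_t)count;
}
```

Under that assumption, 140k external links cannot be represented: the check fails, and a silent conversion would keep only the low 16 bits of the count.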
- Should change the library MD cache size limits
- Definitely should try paged aggregation and page buffering
- Are there crashes when you set the default cache size to the max?
- John said there may be issues with going more than 32 MiB (max is 64 MiB?)
- Should bump MD cache sizes (memory size, element counts)
- John suggests that we may want to make the cache more aggressive about adapting to larger size working sets
No Meeting - Happy Thanksgiving! 🦃
This session is cancelled due to staff and community members attending SC24.
- Sparse data API changes (RFC link: https://github.com/LifeboatLLC/SparseHDF5/blob/main/design_docs/RFC-HDF5-Model-API-Sparse-2024-10-23.docx)
- Discuss metadata performance issues:
- Metadata cache config defaults
- Maximum metadata cache size
- Maximum "est_num_entries" size
- Cache image disallowed in read only mode
- HDF5 2.0 goings on (recent merges, etc.)
- HDF5 2.0 planning (wiki, parent/child issues)
- PR #5015 (concurrency feature)
- Send any feedback about 2.0 decisions ASAP
- We turned on a few branch protection rules:
- Dismiss stale pull request approvals when new commits are pushed
- Require branches to be up to date before merging
- We'll turn this branch protection rule on soon:
- Require signed commits
- Adds support for future concurrency
- Everyone should review this so we can merge it before the next meeting
- Note: CMake has trouble with C11 threads (see #5034)
- HDF5 2.0 goings on (recent merges, etc.)
- HDF5 2.0 planning (wiki, parent/child issues)
- Sparse data API changes (RFC link: https://github.com/LifeboatLLC/SparseHDF5/blob/main/design_docs/RFC-HDF5-Model-API-Sparse-2024-10-23.docx)
- PR #5015
- Version numbers, branch names, etc. (https://github.com/HDFGroup/hdf5/wiki/HDF5-Version-Numbers-and-Branch-Strategy)
- Sparse data API changes (RFC link: https://github.com/LifeboatLLC/SparseHDF5/blob/main/design_docs/RFC-HDF5-Model-API-Sparse-2024-10-23.docx)
- General agreement that the
- Quincey suggests:
- That we separate sparsity from the structured chunk implementation so other VOL connectors can do their own special sparse thing. Proposed an H5Pset_density() API call.
- H5Dget_defined() should also work for any dataset (e.g., think of missing chunks)
- Ditto H5Derase()
- Native VOL connector API calls should be namespaced (e.g., H5Dnative_foo())
- HDF5 1.16 --> HDF5 2.0
- Go over this PR: https://github.com/HDFGroup/hdf5/pull/4942
- The direct VFD may not be SWMR-compatible due to direct I/O page machinations under the hood
- Everyone should look over the sparse RFC for next week
- Especially think about namespacing native VOL connector API calls
- C++ wrappers, in general, would NOT be thread-safe (either ours or HighFive, etc.)
- Elena says that Fortran should be thread-safe
- General agreement with the slides shown at Oct 15 CtD
- Surprisingly little pushback on removing the Autotools
- Should check with the BlueBrain folks about C++ header implementation support
- We can put HighFive in our CI
- Might take it over and seek funding if it turns into a community project
- Need to clearly describe what our new versioning scheme means for the file format
Go over this PR: https://github.com/HDFGroup/hdf5/pull/4942
- Changes a lot of heap allocations to stack allocations
- Cleanup fixes several problems
- Pulls H5FL out of the startup code, handy for initialization
- Fixes a deficiency in the VOL API where context pointers were missing
10 October 2024 - Plugin working group
- 1.14.5 release / DMG issues
- The 1.16.0 branch has been created
- VOL cleanup PR - https://github.com/HDFGroup/hdf5/pull/4856
- Tentative VFD initialization changes
- Do we still support 32-bit operating systems
VOL cleanup PR - https://github.com/HDFGroup/hdf5/pull/4856
- This should be ready to merge now
- Discussed pro/con of init-at-once v. lazy, refactoring, other changes (qkoziol/refactor_h5fd_and_packages)
- New private files for internal things in "demo" VFDs that use public API calls only
- Definitely needs performance testing
- WIP - will revisit when ready for a PR
Dana is out at NOBUGS 2024 this week so Neil Fortner will host this meeting.
- 1.14.5 is available for testing. See forum post.
- Quincey Koziol and Matt Larson: Thread-safety shutdown issues: threads that have interacted with the library and acquired resources may still be executing when the library attempts to shut down.
- What if a thread is executing and holds file-related resources (assume it holds dirty metadata; there are lots of special cases)? If the process dies, we'll have a corrupted file. How do we get the data to the file safely so it doesn't corrupt?
- Tell user: "don't do that."
- See documentation in develop: https://github.com/HDFGroup/hdf5/blob/develop/doc/threadsafety-warning.md
- Goes in 1.14.5, needs to be in release notes.
- Sparse chunk API - Elena and Quincey have discussion items. Need to firm up schedule. Neil will follow-up.
- File Format - Elena will have discussion items and will work with Dana to add to agenda.
- Plug-in Working Group meeting on 3rd Thursday - October 17
This was originally intended as an HDF5 Plugin Working Group meeting, though there is only one action item
Agenda here: https://github.com/HDFGroup/hdf5_plugins/wiki/HDF5-Plugin-Working-Group
So we'll also discuss the 1.14.5 release
- Any show-stoppers for the release?
- Quincey wanted to discuss this PR: https://github.com/HDFGroup/hdf5/pull/4856
- Should we replace our ancient Perl scripts with Python?
- Not aware of any 1.14.5 issues
- There was a minor Windows issue and this was fixed (still needs to go into 1.14 and 1.14.5)
- ttsafe still has problems on slow machines, even with the latest patch
- Some minor issues, but nothing that would hold the release (will be a release note)
- PR #4856
- Some nice VOL cleanup, very reasonable, needs review
- Other PRs
- Neil's #4843 - adapts the external dataset code to use the sec2 VFD trick for dealing with torn I/O
- Python vs Perl
- This is fine if someone wants to donate their time
- It'd be nice to also replace the shell scripts with Python (assuming we keep the Autotools)
- Misc issues
- NVHPC has dt_arith problems with long doubles (Quincey/Nvidia will investigate)
- Java tests may be encountering uncaught failures
- Quincey says he's seen problems with gcc 14 and MacOS
- 1.14.5 Release
- Docs are a mess. Need to fix.
- Discuss H5Tset_size() behavior in complex number PR
- Decision: Disable H5Tset_size() for array, complex, and variable-length datatypes.
- 1.14.5 release
- Structured chunk discussion (follow-up from 22 Aug meeting)
- H5Tset_size discussion
- Still targeting late September
- Any changes need to be in by Sep 13th (NEXT FRIDAY)
- Pre-release the following week - PLEASE TEST!
- There are some issues with MacOS signing w/ plugins, but these should be resolved before the release
- Allen says HDF4 dmg files work and people should test
- ttsafe failures are develop only and will not move to 1.14
- Elena says:
- Broken links
- Missing docs (subfiling, etc.)
- Don't bump B-tree, etc. versions (only messages change)
- There is no new layout. Storage is still chunked. Only need to revise the chunked storage.
- "Typo" in that final chunk dimension is datatype size (NEEDS TO BE DOCUMENTED)
- Chunked storage property needs a version number field
- Probably need to think more about the layout property
- Some discussion about locality of sparse chunk descriptions in the file format doc (Quincey argues it needs to be closer to the structured chunk property section)
- Much discussion about checksums, mandatory vs optional, metadata vs data, etc. - needs more discussion
- Next discussion on October 10th
- 1.14.5 release
- Structured chunk "working group"
- Improved docs/website
- oss-fuzz action discussion (#4784)
- CMake examples threads detection
- Mac OS dmg construction CI issue
- Still targeting late September
- Any changes need to be in by Sep 13th
- Pre-release the following week - PLEASE TEST!
- There are some issues with MacOS signing, but these should be resolved before the release
- ttsafe failures are develop only and will not move to 1.14
- Small working group will meet next week to discuss some ideas, will present findings during next week's HDF5 WG
- Should be ready to go today (please review #4718)
- Testing sanitizers in CI is a great idea
- Weird that we have known sanitizer errors, but the sanitizer checks pass
- Should not run oss-fuzz as a part of CI
- PR #4746
- Confusing logic
- Spirited discussion, but we should just remove this for now since there are no examples with threads
- Should also rename HDF_ variables to HDF5_EXAMPLES_ for clarity
- Probably due to concurrent detach of dmg volume & dmg action finishing up
- Can possibly replace dmg tool with a script that retries
- https://gitlab.kitware.com/cmake/cmake/-/issues/19517
- Anything critical for the 1.14.5 release?
- Introduce changes for structured chunks (Lifeboat LLC) https://gamma.hdfgroup.org/ftp/pub/outgoing/vchoi/SPARSE/H5.format.html#ChangesForStructChunk
- Still targeting late September
- Any changes need to be in by Sep 13th
- Pre-release the following week - PLEASE TEST!
- There are some issues with MacOS signing, but these should be resolved before the release
- Everyone should look at the structured chunk RFC, RM entries, file format changes, etc. (link above)
- We'll discuss in two weeks (Sept 5)
18 July 2024 - Plugin working group
This is a holiday in the US.
- Filter plugin working group
- Git submodules - okay or no? (Dana)
- A PEP-like process for library modifications (Dana)
- Safely initializing global variables in MT-HDF5 (Quincey)
- The 2nd HDF5 WG meeting of the month will be the Filter Plugin Working Group (no other business that meeting)
- Starts July 11th
- Will announce on the forum and my Call the Doctor session next week
- I'll email anyone who expressed interest at the last HUG (sadly, that session was not recorded)
- Dana will send a link to the original filter working group whitepaper from 10+ years ago
- Let's change the name to External Plugin Working Group and cover VOL connectors, VFDs, and filters
- PR #4604 would bring in Doxygen Awesome and the recommended way to do this is via a submodule
- Submodules can be difficult for people new to them
- We probably wouldn't ever modify the code in the submodule so it'd be less of a burden
- The only thing we'd ever have to do is update the target to point to newer versions (and there might be actions that keep it in sync, like dependabot)
- Definitely need to update the docs and alert/educate developers
- (Jordan) CMake can pull things for you. Best for things that rarely change.
- DECISION: Let's just copy the files over. There are not a lot.
- PEP 1 describes the process (https://peps.python.org/pep-0001/)
- They also have a coding style PEP (https://peps.python.org/pep-0008/)
- They have a lot of infrastructure in their peps repo that we could probably modify (https://github.com/python/peps)
- I mainly want to use the PEP concept and infrastructure, but modify the governance to suit our project (so no intention to bring over unmodified)
- I want to announce this at the HUG, so we have a month to argue about specifics
- Aleksandar suggests we use Markdown instead of rst (there are also fancier versions of Markdown)
- We'll need to see how the PEP docs handle more complicated documents (is rst good enough?)
- John Mainzer suggests we talk to the PEP guys to see what works and what does not (Aleksandar has some experience here)
- Poisoning errors discussion (Nvidia)
- Wrapping user callbacks (Nvidia)
- NOTE: This is all tentative and work in progress
- Would allow assert-like behavior in production builds
- Would add "poisoned" checks to public API calls via the FUNC_ENTER macros
- Would prevent library state change when poisoned
- Would simply return the error value (with no error stack, since creating error stacks changes library state)
- Might want a mechanism to switch between abort() and returning an error
- HPOISON; macro would poison the library
- H5is_library_poisoned() call would return poison/not poisoned state w/o changing library state (just returns a global var)
- Is this really necessary?
- Might be useful when we can't rebuild the library
- No obvious driver for this
- Adds complexity
- It would be nice to see other examples of poisoning in a library or have pointers to other discussions
- Would probably be difficult to apply consistently in the library
- Needed when a callback can leave the library (not internal callbacks)
- Hard to differentiate when we are doing library vs external calls
- Adds a performance hit when we are doing internal calls
- e.g., VOL callbacks
- Add H5_BEFORE/AFTER_USER_CB() macros
- No-ops when not a concurrency-safe library
- Add H5TS_user_cb_prepare()
- ~150 places "callback points"
- ~300 places where we need the macros
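The callback-wrapping idea above can be sketched as follows. This is a minimal illustration, not the HDF5 implementation: the `SKETCH_*` macro names, the `SKETCH_CONCURRENCY_SAFE` switch, and the per-thread depth counter are all assumptions, since the notes only say that the real macros are no-ops when the library is not concurrency-safe.

```c
#include <stddef.h>

/* Hypothetical sketch; these are not the HDF5 macros themselves.
 * Define SKETCH_CONCURRENCY_SAFE to enable the bookkeeping. */
#ifdef SKETCH_CONCURRENCY_SAFE
static _Thread_local int user_cb_depth = 0; /* >0 while inside user code */
#define SKETCH_BEFORE_USER_CB() (++user_cb_depth)
#define SKETCH_AFTER_USER_CB()  (--user_cb_depth)
#else
/* Without concurrency support the macros compile to nothing. */
#define SKETCH_BEFORE_USER_CB() ((void)0)
#define SKETCH_AFTER_USER_CB()  ((void)0)
#endif

typedef int (*user_cb_t)(void *udata);

/* A "callback point": the library leaves its own code to run user code. */
static int call_user_cb(user_cb_t cb, void *udata)
{
    int ret;

    SKETCH_BEFORE_USER_CB();
    ret = cb(udata);
    SKETCH_AFTER_USER_CB();
    return ret;
}

/* Example user callback: adds one to the int it is handed. */
static int add_one_cb(void *udata)
{
    int *value = (int *)udata;

    *value += 1;
    return 0;
}
```

Routing every callback point through one wrapper like `call_user_cb` is one way to keep the macro pairs balanced; the notes estimate roughly 150 such points in the library.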
- CMake discussion
- Poisoning discussion
- Should we have an HDF5 1.16.0 release in the fall?
- What do we do about the C++ wrappers?
See: https://forum.hdfgroup.org/t/community-input-hdf5-cmake-overhaul/12364/2
For HDF5 1.14.5, we’re planning on overhauling our CMake build code, to bring it more in line with modern CMake conventions.
We’re still putting together our assessment/plan, but we’d like to:
- Consolidate and reorganize the existing CMake code, to make things easier to understand and find
- Move to “modern CMake”
- Convert macros to functions
- Add additional functions to avoid code duplication
- Ensure parity with the Autotools
- Add verification/testing to CI to ensure builds are correct
- Add documentation
- A brief introduction to modern CMake: https://cliutils.gitlab.io/modern-cmake/
- And here's an example video about modern CMake, in case you prefer to listen/watch: https://www.youtube.com/watch?v=mn1ZnO3MtVk&t=12s&ab_channel=NDCConferences
- Adding a way to "poison" the library so subsequent API calls will just fail and not mutate state
- Helpful for debugging
- Could avoid making problems worse when library state gets corrupted
- Concurrency is the driving force for this
- Would be mid-way in impact between assert/abort and just returning an error code
- Could be implemented easily via a global "poisoned" variable that is checked on API entry
- Consensus is "maybe, but let's think about it"
- Should probably be off by default, at least in release builds
- We should set up an RFC/PIP process for HDF5 so things like this can be documented and discussed
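The "global poisoned variable checked on API entry" idea can be sketched in a few lines of C. This is only an illustration under stated assumptions: every name below (`lib_poison`, `lib_is_poisoned`, `API_ENTER_OR_FAIL`, `lib_increment`) is hypothetical and none of them exist in HDF5. The flag is a C11 atomic because concurrency is the stated driver.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* All names below are hypothetical; none of them exist in HDF5. */

static atomic_bool lib_poisoned_g = false; /* global "poisoned" flag */
static int         lib_state_g    = 0;     /* stand-in for mutable library state */

/* Mark the library as poisoned; every later API call fails fast. */
static void lib_poison(void)
{
    atomic_store(&lib_poisoned_g, true);
}

/* Read-only query: returns the flag without touching other state. */
static bool lib_is_poisoned(void)
{
    return atomic_load(&lib_poisoned_g);
}

/* Entry check placed at the top of each public API call. */
#define API_ENTER_OR_FAIL(err_value)                                         \
    do {                                                                     \
        if (lib_is_poisoned())                                               \
            return (err_value);                                              \
    } while (0)

/* Example public API call: mutates state only when not poisoned. */
static int lib_increment(void)
{
    API_ENTER_OR_FAIL(-1);
    return ++lib_state_g;
}
```

Once `lib_poison()` has been called, `lib_increment()` returns the error value and leaves `lib_state_g` untouched, which is the "fail and do not mutate state" behavior described above.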
- Complex number support requires bumping a datatype version number and that should happen in a major release
- Should we do a major release (probably 1.16.0) in the fall to get complex number support out?
- Tentative 'yes' and we'll start planning (more news at the June 3 Call the Doctor)
- Might bump some library defaults (e.g., to better work in the cloud)
- Could make minor API changes (e.g., #3505 HDoff_t change)
- Should think about bumping the minimum CMake version to get workflows (3.25?)
- Might also think about making some build system changes (drop obsolete things, rename for consistency)
- Probably not move to C11 to make the upgrade path easier, but maybe bump develop to C11 after creating the 1.16 branch
- Would try to make upgrading from 1.14.x as easy as possible
- Would no longer support 1.14.x (so 1.14.4 would be the last 1.14 release)
- These have lagged far behind the library, feature-wise
- Do we update them or deprecate and eventually remove them? There are header-only C++ wrappers that provide a more modern interface (not maintained by HDFG)
- We'll ask the community (forum posts, CtD, state-of-HDF5 talks)
- Clean out PR queue
NOTE: In general, PRs that will never be merged but contain useful ideas should be closed and an issue/discussion created instead. PRs that are a work in progress, are under active development, and should not be merged until important changes have been made should be marked draft while the work takes place. Everything else in the PR list should be ready for review and possible merging.
- 1387 - (H5T optimization) - This PR will never be merged and was being kept as a reminder to investigate the useful bits of the PR. We will close it and create an issue.
- 3505 - (HDoff_t) - This PR or an equivalent will be merged before June. It cannot go to 1.14 since it's an API change (which will be unversioned). Needs better docs and probably an issue to ensure docs are correct in next major release.
- 4171 - (NVHPC update) - Will remain open while we investigate why long double conversions are failing.
- 4266 - (uninit H5T memory) - Close and create an issue. Needs to be fundamentally fixed elsewhere.
- 4315 - (datatype precision overflow) - Leave open. Needs minor tweaks.
- 4347 - (vasprintf) - Leave open. Will be converted to use HD prefix.
- 4469 - (big threading PR) - Needs review. Copyright okay now.
- 4475 - (pause error checking) - Okay to review and merge. Needs an issue to update the docs, especially the VOL connector author guide. Also probably needs VOL connector upgrades. Might be a 2.0 thing.
- 4487 - (zlib-ng) - Can remain open while we investigate the test failure.
- 4488 - (URLs) - Can remain open as a draft but needs a plan for resolution.
- 4500 - (CMake UNITY_BUILD) - Review and merge.
- Make sure everyone can connect to Zoom
- Complex number datatype creation API
- Does it make sense to allow specifying member names for real and imaginary part of complex numbers and store that in the file?
- Should we design the API to allow different datatypes for real and imaginary part, based on forum discussion?
- When/if C11 is moved to, alignment checking of C types in library can be replaced with _Alignof(type) / alignof(type) (latter removed in C23)
- Additional locking discussion (Quincey Koziol)
- PRs and such
- Does it make sense to allow specifying member names for real and imaginary part of complex numbers and store that in the file?
- Use case is mainly display
- Could be contentious
- Might be better to push this to the tools
- Should we design the API to allow different datatypes for real and imaginary part, based on forum discussion?
- Is there a compelling use case for this?
- Jordan will push the feature to a branch soon
- Could make complex numbers a datatype class and select between cartesian/polar
- Alternatively, just support C complex type (like how the rest of the library basically works)
When/if C11 is moved to, alignment checking of C types in library can be replaced with _Alignof(type) / alignof(type) (latter removed in C23)
- It makes sense to allow this when we move to C11 (in next major version?)
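As a rough illustration of what moving to `_Alignof` would replace, the sketch below compares a common pre-C11 idiom with the C11 operator. The `offsetof`-based macro is a generic technique shown for comparison; the notes do not say how the library currently checks alignment, and both macro names are hypothetical.

```c
#include <stddef.h>

/* Pre-C11 idiom: the offset of a member placed after a single char
 * equals that member's alignment requirement on typical compilers. */
#define ALIGN_BY_OFFSETOF(type) offsetof(struct { char c; type member; }, member)

/* C11 replacement: ask the compiler directly. */
#define ALIGN_BY_C11(type) _Alignof(type)
```

On mainstream compilers the two agree for basic types such as `int`, so the C11 form mainly removes the need for the helper struct.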
- Switched to Zarr from HDF5 due to lack of SWMR
- LZ4 filter function issues - chunk can't be uncompressed directly
- Need to address MT, etc. concerns
- We should support FP8, which Zarr does not
- Sent along a doc (in chat)
- Move to Zoom (starting next week)
- Governance!
- PRs and such
- Let's chat about removing Autotools support
- Is it okay to allocate file space when there is a null dataspace (https://forum.hdfgroup.org/t/getting-error-invalid-dataset-size-likely-file-corruption/12246)?
- Discuss locking protocol for multithreaded concurrency
- Moving to Zoom so it's easier for outside people to join
- Agenda will be listed here and announced on the forum on Mondays
- I'll send a Zoom link to everyone who is on the existing Teams invite list
- Might use one of the meetings per month for the filter working group
- Will be looking at other projects to get some ideas for how to better manage the HDF5 community
- Do we check in generated files? (#4453, which commits newly-added, generated files) - Quincey suggests having a script to generate the files as a convenience
- Should the codestack removal go to 1.14? (#4454) - No strong feelings, definitely broken and nobody complained, just a configure change, let's remove
- Supporting two build systems is a lot of work
- libtool is a mess and requires sed hacks to fix linker options (#4448)
- How much screaming will there be if we drop the Autotools in the next major version of the library?
- What about Ubuntu, etc.?
- Look for complaints in the forum
- An oversight, nothing gets allocated at zero bytes
- Need to update the docs
- PRs and such
- Property lists (Lifeboat)
- FUNC_ENTER has some controversies and needs investigation
- Others can go in after review requirements are met
- Do we need to support modifying a property list class after properties exist? (yes)
- Do we need to support deleting properties from a class? (maybe)
- Existing library is buggy in that properties can be deleted from a property list class and this will affect existing property lists (John Mainzer will file an issue)
- Yay, release
- C11 in develop
- Issues, etc.
- Revisions to the error package (H5E)
- It's out. Yay!
- Is it okay to require C11 in develop?
- Tentative yes, but we need a policy
- Also need to figure out where we are going to test and debug big-endian code
- Not a lot going on
- Let's get the H5Tconv refactoring in
- NVidia thread-safety changes passing CI, PRs imminent
- Trying to reduce H5I usage from H5E
- H5E uses H5I for its data management, should revamp to store things in local data structures
- Legit use cases for creating custom ID things (e.g., VOL connectors)
- Will still use IDs, but only for public new classes that will be immediately converted to the internal things
- So the public API will not change
- Library internals should be refactored to avoid ID use internally, in general
- Lifeboat says this is fine wrt what they are doing
- Release goings on
- (Pseudo-)random numbers in HDF5
- What do we do with the last HDfoo() functions?
- Do we keep the 'getting started' guide and similar docs in the GitHub wiki or with the code?
- H5TRACE scheme removal
- H5E changes && "have threads" vs "have threadsafe" macros (Quincey Koziol)
- Release is TODAY 🎉
- See this PR: https://github.com/HDFGroup/hdf5/pull/4338
- Nobody had complaints
- See this PR: https://github.com/HDFGroup/hdf5/pull/4347
- We already demand C99, so those can be undecorated (Windows functions that take longs will be ifdef'd)
- POSIX will keep HD prefixes
- This PR: https://github.com/HDFGroup/hdf5/pull/4339
- Moved the 'getting started' guide here: https://github.com/HDFGroup/hdf5/wiki/Getting-Started-with-HDF5-Development
- Quincey says wiki is okay
- Nobody else had comments
- See this PR: https://github.com/HDFGroup/hdf5/pull/4341
- No complaints about removing H5TRACE
- Quincey wants to not incr/decr library error IDs
- Maybe cache error IDs in a table?
- Will require further investigation
- H5_HAVE_THREADS --> Has threads
- H5_HAVE_THREADSAFE --> Has thread-safety (i.e., global lock)
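A short sketch of how code might branch on the two macros is shown below. It assumes the macros are supplied by the library's generated configuration header or on the compiler command line; the function `threading_mode` is hypothetical and exists only for illustration.

```c
/* In a real build these macros would come from the library's
 * configuration; for this sketch they can be set with, e.g.,
 * -DH5_HAVE_THREADS or -DH5_HAVE_THREADSAFE. */

/* Hypothetical helper: describes the threading configuration. */
static const char *threading_mode(void)
{
#if defined(H5_HAVE_THREADSAFE)
    return "thread-safe (global lock)";
#elif defined(H5_HAVE_THREADS)
    return "threads available, library not thread-safe";
#else
    return "no threading support";
#endif
}
```

Checking the thread-safe macro first reflects the distinction in the notes: having a threads implementation available is separate from the library itself being protected by a global lock.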
- Release goings on
- Locking protocols discussion (Quincey Koziol)
- Release status
- #3505: off_t --> HDoff_t
- Quincey's threadsafety wrapper work
- Do we mandate C11? (at least in develop)
- PR/issue highlights
- Crashproofing RFC
- Delayed until early/mid-April (waiting on a security patch from Amazon)
- We did not version anything when we went from hid_t int --> int64_t in 1.8/10
- Let's merge this PR (can't go to 1.14)
- This is fine
- Quincey sees off_t issues on his Mac w/ gcc 13
- Generally supported in modern compilers
- Generics would be handy (mandatory)
- Threads would be handy (optional)
- Atomics would be handy (optional)
- Other things? Unicode?
- Tentative: yes, but we should think about the implications of this
- What about things that were mandatory in C99 but optional in C11?
- MSVC: yes (2022, some is preliminary); XLC: yes (LLVM-based 17.x) --> threads/atomics/generics
- https://devblogs.microsoft.com/cppblog/c11-threads-in-visual-studio-2022-version-17-8-preview-2/
- https://www.ibm.com/docs/en/openxl-c-and-cpp-aix/17.1.0?topic=infrastructure-enhanced-language-standard-support
- https://clang.llvm.org/c_status.html
- https://gcc.gnu.org/wiki/C11Status
- Big set of changes from Quincey / AWS has been merged (thanks!)
- NOTE: These will be filed as Mitre CVEs
- https://github.com/qkoziol/hdf5/tree/threading_updates
- Nice improvements, cleans up a lot of subfiling's threading code
- Would be even nicer if we mandated C11 (see above)
- Will have a PR soon
- Will hold off on merging until 1.14.4 is finalized
- John Mainzer insists that VFD SWMR has many of these features
- Will send out the RFC to meeting attendees
- Will schedule a follow-up meeting to discuss
- Neil will present at next week's call the doctor meeting
- Upcoming release
- PR/issue highlights
- Try to get your code in by Friday (March 22)
- Release will be next Thursday (March 28)
- Issue #108 - Needs a file format bump, might want to boost to 64-bit chunks, could do as a part of sparse work
- PR #4166 - Jordan will close, we'll revisit for 1.14.5 (possibly a doc change)
- float16 goings on
- PR/issue highlights
- Will merge PR on Friday
- Not much exciting going on
- John may have H5I lock-free code for us to look at in April
- Mark can help us with HDF Java --> Maven in April
- Quincey talked about the need for H5I locks even with lock-free data structures
- float16 goings on
- The TRACE macros and related
- PR/issue highlights
- Going pretty well
- Dealing with quirky platforms/compilers
- Scot says there are some Fortran issues with Flang (REAL16 support is in Fortran 2023 - Flang is the only compiler that supports this (currently))
- Discussion
- Thoughts?
- Jordan: Never found it to be useful at all
- Decision is to merge existing PR, then tag in GH, and then create a PR to yank
- (#4070 - BE examples fail) What do we do about examples?
- Jordan says h5ls and h5dump have opposite ideas about printing types (we should fix this)
- (#4064 - ros3 VFD secret length) Can go in. Dana will fix the stack allocations later.
- (#4062 - static CRT link) Can go in. Dana will update the PR to have a "don't use this!" comment
- (#4060 - long double test) Can go in
- PR #4017 (improves flush performance) <-- Click merge button if no objections
- float16 goings on
- Other library happenings
- PR #4017 (improves flush performance)
- Other library goings on
- Looks good at a first glance
- We'll discuss next week and merge after the HDF5WG
- FAPL changes will propagate to sub-files in things like VDS, external links
- Should change the behavior of H5Pset_elink_fapl() so you get the propagating behavior and only have to use the API call if you want something different
- ~20 project ideas regarding GDS, etc.
- Float16 RFC update
- Chat about chunk cache meeting
- Brief discussion of complex datatypes, will keep on working on the RFC
- Will kick this down the road until there are phase II funds (April?)
- Need to have ABI checks when merging to 1.14
- Need to be able to easily find ABI check results for develop, 1.14
- Are build-only checks running tests?
- Do the Autotools + gcc strip -g (or not pass it) when invoking the compiler, linker, etc.? Is -s getting passed down and superseding -g? John M. will hassle Dana if --enable-symbols doesn't work.
- Float16 RFC update
- Citing HDF5 PR
- HDF5 goings on
- Debug API?
- Do we need a recap of type conversion in the RFC?
- We should definitely have examples in the docs for when a particular type is not supported
- What about quad precision floats (128-bit; already supported by GNU compilers, numpy, and Fortran; there's an IEEE standard)?
- Merged
- Nothing too exciting
- General agreement that some low-level API calls could go in a special debug header that would not be a formal part of the 'public API'
- Discuss propagating group creation properties to intermediate groups (https://github.com/HDFGroup/hdf5/issues/3945)
- Discuss float16 RFC
- We'll propagate the properties to the intermediate groups
- Might require some finagling to deal with non-group creation
- We should look at FP8 as well, possibly other ML/DL convenience reduced types
- Should also consider adding Boolean support from Jerome's RFC
- Need to look at NVidia's compilers
- NATIVE macros will map to H5I_INVALID_HID when there is no compiler support for a type
- Complex should be a real atomic type, not a struct of two doubles or whatnot
- Might add some conversions to common current complex number hacks, like a struct of two doubles, etc.
- How will complex numbers be printed in h5dump, etc.?
- Discuss how the chunk cache discussion went
- Library PRs, issues, etc.
- John Mainzer / Lifeboat will be reworking the chunk cache for sparse data and thread-safety
- The chunk cache used to be per-file (currently per-dataset - performance was poor w/ per-file)
- The per-file --> per-dataset change was in 1.6
- John wants to go back to one cache per file or even one cache for many files
- John is exploring data structures and algorithms (lock-free hash tables suitable?)
- Actual work would only take place if Lifeboat gets a phase II grant (end of Feb)
- Jordan's type conversion fix is super important and needs to get into 1.14.4. Quincey will review.
- The -shared CMake extensions for tools, libraries, etc. will be removed by default in 1.14.4. CMake will work more like the Autotools, though compatibility options will be provided if you want the old behavior.
- Discuss https://github.com/HDFGroup/hdf5/pull/3927
- Determine 2024 / 1.14.4 priorities
- Quincey's warnhist improvements
- John Mainzer has a question about chunk cache documentation
- Elena Pourmal wants to discuss behavior when a compressed buffer is bigger after "compression"
- We need an updated guide for updating across major/minor releases
- Always need to reclaim vlen memory, even when using strings?
- Tentative okay on #3927, but we should look into how other software does things. Is this the way people will do this in the future?
- Do we have a DOI for HDF5? That should be a thing we add.
- Unicode
- MinGW (especially w/ MSYS2 - AND DOCUMENT AND TEST)
- Update to handle recent MPI session changes
- Support modern C and ML/DL data types (float16, bool, complex, etc.)
- Existing behavior: Larger buffer is written out as "compressed"
- Responsibility lies with filter
- Need to fix the zlib deflate filter (Elena will file a bug report, may also affect szip)
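One way a filter could take responsibility for the "bigger after compression" case is to compare sizes and report failure when nothing was gained. The sketch below is self-contained and does not use the HDF5 filter API: the toy run-length encoder and both function names are hypothetical, and the convention of returning 0 to signal "do not use this output" is an assumption chosen for the illustration.

```c
#include <stddef.h>

/* Toy run-length encoder (illustration only): writes (count, byte)
 * pairs. Returns the encoded size, or 0 if dst is too small. */
static size_t rle_encode(const unsigned char *src, size_t n,
                         unsigned char *dst, size_t cap)
{
    size_t out = 0;
    size_t i   = 0;

    while (i < n) {
        size_t run = 1;

        /* Extend the run while the next byte matches (max 255 per pair) */
        while (i + run < n && src[i + run] == src[i] && run < 255)
            run++;
        if (out + 2 > cap)
            return 0;
        dst[out++] = (unsigned char)run;
        dst[out++] = src[i];
        i += run;
    }
    return out;
}

/* Filter-style wrapper: returns the compressed size only when it is
 * actually smaller than the input; otherwise returns 0 so the caller
 * can keep the original bytes. */
static size_t compress_if_smaller(const unsigned char *src, size_t n,
                                  unsigned char *dst, size_t cap)
{
    size_t out = rle_encode(src, n, dst, cap);

    return (out > 0 && out < n) ? out : 0;
}
```

With this wrapper, highly repetitive input is accepted, while input that would grow under encoding is rejected instead of being written out as "compressed".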
- Now can produce warning density and has more options
- Read the source, Luke
- Rob Matzke may have an old document (location?)
- John wants to discuss creating a more flexible chunk cache that can support sparse data better (will be in Champaign T/W next week)
- Version on web has dead links, is out of date
- h5py not doing a great job at cleaning up after itself, has a memory leak
- h5py needs updating to avoid the leak
- Is h5py doing the right thing with new-style references? <-- New-style references in general probably need better documentation