Release Notes v2022.01.20
Release Notes 2022-01-20
Major updates:
- Issue #786 - Server side complex object support and Symbol Table Type Hierarchy
Server-side objects are now managed as encapsulated complex objects rather than as individual pdarrays. This simplifies the message-passing layer: a complex object previously had to pass one id per component and now passes a single id. The first complex object implemented is SegString, the segmented string array. A basic type hierarchy for complex objects was also introduced, so that a root type is stored in the Symbol Table.
- Added build support for Chapel version 1.25.1, which is now the recommended version (see PR#1027).
- Modular server builds and internal I/O code refactor (see PR#1017 and Issue #1005). Developers can now exclude modules from the server build by commenting them out in ServerModules.cfg (we plan to improve this capability in future releases).
- Issue #963 & #940 - Performance improvements to regex, string search, and related methods.
- Issue #985 (ongoing) - Parquet improvements to error handling, timestamps (Issue #1026), and performance (PRs #1014, #992, #993, #1023, #1024, #1028)
- Issue #930 - Externally generated server tokens are now allowed.
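The move to encapsulated complex objects (Issue #786) can be illustrated with a small sketch. It is written in Python for readability and is not the server's actual Chapel code: the class names, the offsets-plus-bytes layout, and the id format are all assumptions made for illustration. It shows a root entry type in a symbol table and a composite string entry that is registered and looked up under a single id.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AbstractSymEntry:
    """Hypothetical root type: every symbol table entry derives from it."""
    name: str


@dataclass
class ArrayEntry(AbstractSymEntry):
    """Hypothetical stand-in for a plain array entry."""
    values: List[int] = field(default_factory=list)


@dataclass
class SegStringEntry(AbstractSymEntry):
    """Hypothetical complex entry that bundles both components of a
    segmented string array, so they travel together under one id."""
    offsets: List[int] = field(default_factory=list)  # start of each string in data
    data: bytes = b""  # assumed layout: null-terminated strings, concatenated

    def strings(self) -> List[str]:
        # Each string runs from its offset to the next offset (or end of data).
        ends = self.offsets[1:] + [len(self.data)]
        return [
            self.data[start:end].rstrip(b"\x00").decode()
            for start, end in zip(self.offsets, ends)
        ]


class SymbolTable:
    """Maps a single id to an entry of any type in the hierarchy."""

    def __init__(self) -> None:
        self._entries: Dict[str, AbstractSymEntry] = {}

    def add(self, entry: AbstractSymEntry) -> str:
        self._entries[entry.name] = entry
        return entry.name  # the only id a message needs to carry

    def lookup(self, name: str) -> AbstractSymEntry:
        return self._entries[name]


table = SymbolTable()
seg_id = table.add(
    SegStringEntry(name="id_1", offsets=[0, 4], data=b"abc\x00de\x00")
)
entry = table.lookup(seg_id)
```

With separate entries per component, a message would have to carry one id for the offsets and another for the bytes; here the single `seg_id` is enough to recover both.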
Minor fixes:
- Issue #933 - Documentation fix and final removal of Read the Docs in favor of GitHub Pages
- Issue #990 A logic error in Categoricals involving in1d was fixed.
- Apache Arrow version info has been added to the server configuration information (PR#995)
- Server configuration information is now cached instead of being recreated on each call.
- Issue #973 - New benchmarks were added for various data distributions
- Issue #914 was fixed by using aggregators for string writes to HDF5
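The configuration caching change listed above follows a common build-once pattern, sketched here in Python. This is only an illustration, not the server's Chapel implementation: the `ConfigProvider` class, the `serverVersion` key, and the `build_count` counter are hypothetical, and the sketch assumes the configuration does not change while the server is running.

```python
import json
from typing import Optional


class ConfigProvider:
    """Hypothetical holder that builds the config string once and reuses it."""

    def __init__(self) -> None:
        self._cached: Optional[str] = None
        self.build_count = 0  # illustration only: counts how often we rebuild

    def _build(self) -> str:
        # Stand-in for the work of assembling the configuration.
        self.build_count += 1
        return json.dumps({"serverVersion": "v2022.01.20"}, sort_keys=True)

    def get_config(self) -> str:
        # Build on first use, then return the stored string on every later call.
        if self._cached is None:
            self._cached = self._build()
        return self._cached


provider = ConfigProvider()
first = provider.get_config()
second = provider.get_config()
```

Both calls return the same string, and the build step runs only once.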
Auto-generated release notes
- 786 SegString as single entry (commits grouped by type) & Complex Object in Symbol Table by @glitch in #830
- Fix # 786 Adds typing hierarchy regarding SymEntry types to Parquet code. by @glitch in #991
- Add Parquet support information in env file by @bmcdonald3 in #992
- Closes #933 by fixing docs and removing read-the-docs in favor of Github pages by @glitch in #1003
- Refresh akutil by @reuster986 in #996
- Add handling of errors in Parquet code and minor clean up by @bmcdonald3 in #993
- Add Arrow versioning to ak.get_config() call by @bmcdonald3 in #995
- Remove out of date comment about poor sort performance on IB by @ronawho in #998
- Closes #999: adds call to super.init for SegStringSymEntry. by @glitch in #1000
- Add in1d to list of benchmarks to run by @ronawho in #1002
- Cache the server config string instead of recreating it by @ronawho in #1004
- Closes #990: Logic error in Categorical.in1d and other methods by @pierce314159 in #1001
- Data distributions for sort benchmarking #973 by @reuster986 in #977
- update README.md with ArkoudaWeeklyCall by @mhmerrill in #1009
- Optimize Parquet reading by reading batches rather than creating a copy by @bmcdonald3 in #1014
- Drop in1d down to 1 trial by @ronawho in #1021
- Issue #914: Uses aggregator for SegString to HDF5 writes by @glitch in #1016
- Modularize build process by @bmcdonald3 in #1017
- Change Parquet C++ types from int to int64_t and reorganization by @bmcdonald3 in #1023
- 980 allow externally generated server tokens by @hokiegeek2 in #1025
- Add support for reading timestamps in Parquet files by @bmcdonald3 in #1024
- re-add sort distributions benchmark by @reuster986 in #1029
- Recommend Chapel 1.25.1 and use it for CI testing by @ronawho in #1027
- Optimize Parquet file writing with WriteBatch function by @bmcdonald3 in #1028
- Part of Issue #940: Simplify Regex Substring Search Methods by @pierce314159 in #1030
- Issue 1005 file io refactor by @glitch in #1007
Full Changelog: v2021.12.02...v2022.01.20