Releases: Bears-R-Us/arkouda
v2022.02.23
- Support for a few remaining operations on `uint64`
- Improvements and tuning of `in1d`
- Support for listing columns of parquet files via `ak.get_datasets()`
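For readers unfamiliar with `in1d`, its result is an element-wise membership test, modeled on the NumPy function of the same name. The sketch below is plain Python rather than Arkouda code: `in1d_sketch` is a hypothetical name, and the set-based lookup is only an illustration of the semantics, not a description of how the server computes it.

```python
def in1d_sketch(values, test_values, invert=False):
    """Return a list of booleans: True where values[i] occurs in test_values."""
    lookup = set(test_values)  # hash set gives O(1) average membership checks
    # Comparing against `invert` flips every answer when invert=True
    return [(v in lookup) != invert for v in values]


result = in1d_sketch([3, 1, 4, 1, 5], [1, 5, 9])
print(result)
```

The output has the same length as the first argument, which is why the operation is useful as a boolean mask for filtering.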
v2022.02.16
Highlights:
v2022.02.01
Highlights
- Modular build process allows building with external modules
- Optimized `ak.in1d()`
- Improved Parquet read performance
- Fixed precision issue affecting grouped sum on float64 and related bug in grouped AND/OR
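As background on why a grouped sum over float64 values can lose precision, the following plain-Python sketch contrasts naive left-to-right accumulation with `math.fsum`. It illustrates the general class of problem only; it is not the fix applied in Arkouda, and `grouped_sum` is a hypothetical helper.

```python
import math
from collections import defaultdict


def grouped_sum(keys, values, exact=False):
    """Sum values per key; exact=True uses math.fsum to avoid rounding loss."""
    groups = defaultdict(list)
    for k, v in zip(keys, values):
        groups[k].append(v)
    if exact:
        return {k: math.fsum(vs) for k, vs in groups.items()}
    totals = {}
    for k, vs in groups.items():
        acc = 0.0
        for v in vs:  # naive left-to-right accumulation
            acc += v
        totals[k] = acc
    return totals


keys = ["a", "a", "a", "b"]
values = [1e16, 1.0, -1e16, 2.5]
naive = grouped_sum(keys, values)
exact = grouped_sum(keys, values, exact=True)
print(naive["a"], exact["a"])
```

In group `a` the naive loop absorbs the `1.0` into `1e16` and loses it, while the exactly rounded sum keeps it.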
What's Changed
- Update compatibility to numpy - 1.21.5 by @glitch in #1033
- Speed up gasnet unit tests by only using 1 thread per locale by @ronawho in #1035
- Bump `substring_search` problem size back to `10**8` by @ronawho in #1038
- fixes float precision bug #964 by @reuster986 in #1036
- Revert "fixes float precision bug #964" by @mhmerrill in #1043
- Add support for passing server args to runClient/run_benchmarks.py by @ronawho in #1041
- Closes #1031: Remove non-regex substring search by @pierce314159 in #1032
- Optimize small and medium int in1d operations by @ronawho in #1044
- Refactor binary operator function with doBinOp function by @bmcdonald3 in #1034
- Allow modules from outside the Arkouda source dir to be built into the server by @bmcdonald3 in #1047
- Parallelize Parquet file reading on-node by @bmcdonald3 in #1050
- Issue #1049: Adds capability to print server commands from the client. by @glitch in #1052
- Update modularization docs to more clearly specify absolute paths by @bmcdonald3 in #1056
- #964 Fix sum precision for real this time by @reuster986 in #1055
Full Changelog: v2022.01.20...v2022.02.01
Release Notes v2022.01.20
Release Notes 2022-01-20
Major updates:
- Issue #786 - Server side complex object support and Symbol Table Type Hierarchy
  How server side objects are managed has changed from individualized pdarrays to encapsulated complex objects. This reduces complexity in the message passing layer: a complex object previously required passing one id per component and now needs only one. The first implementation of a complex object is the SegString (segmented string array). A basic type hierarchy for complex objects was also introduced, so a root type is stored in the Symbol Table.
- Adding build support for Chapel version 1.25.1, which is now the recommended version (see PR#1027).
- Modular server builds & internal I/O code refactor, see PR#1017 and Issue #1005. Developers can now choose to exclude various modules in the server build process by commenting them out in `ServerModules.cfg` (we plan to improve this capability in future releases).
- Issue #963 & #940 - Performance improvements on regex and string search methods etc.
- Issue #985 (ongoing) - Parquet improvements regarding error handling, timestamps (Issue #1026), and performance (PRs #1014, #992, #993, #1023, #1024, #1028)
- Issue #930 - Externally generated server tokens are now allowed.
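The complex object change in Issue #786 can be pictured with a small Python model. This is a sketch under stated assumptions, not the server's Chapel code: `AbstractSymEntry` and `ArraySymEntry` are hypothetical names, the field layout is invented for illustration, and the strings are assumed to be stored as null-terminated bytes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AbstractSymEntry:
    """Hypothetical root type stored in the symbol table."""
    name: str


@dataclass
class ArraySymEntry(AbstractSymEntry):
    """A plain array entry (stands in for a pdarray)."""
    data: List[int] = field(default_factory=list)


@dataclass
class SegStringSymEntry(AbstractSymEntry):
    """A segmented string array: offsets plus the raw bytes, held together."""
    offsets: List[int] = field(default_factory=list)
    values: bytes = b""

    def strings(self) -> List[str]:
        # Each string runs from its offset up to the next null terminator.
        out = []
        for start in self.offsets:
            end = self.values.index(b"\x00", start)
            out.append(self.values[start:end].decode())
        return out


symbol_table: Dict[str, AbstractSymEntry] = {}
entry = SegStringSymEntry("id_1", offsets=[0, 3], values=b"ab\x00cde\x00")
symbol_table[entry.name] = entry

# The client passes one id; both components are resolved from it.
resolved = symbol_table["id_1"]
print(type(resolved).__name__, resolved.strings())
```

Because every entry shares a root type, the table can hold simple arrays and composite objects side by side, and a message only needs to carry a single name.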
Minor fixes:
- Issue #933 Documentation fix and final removal of Read-the-docs in favor of GitHub pages
- Issue #990 A logic error in Categoricals involving in1d was fixed.
- Apache Arrow version info has been added to the server configuration information (PR#995)
- Server configuration information is now cached instead of being recreated on each call.
- Issue #973 - new benchmarks were added for various data distributions
- Issue #914 was fixed by changing string writes to HDF5 by using aggregators
Auto-generated release notes
- 786 SegString as single entry (commits grouped by type) & Complex Object in Symbol Table by @glitch in #830
- Fix # 786 Adds typing hierarchy regarding SymEntry types to Parquet code. by @glitch in #991
- Add Parquet support information in env file by @bmcdonald3 in #992
- Closes #933 by fixing docs and removing read-the-docs in favor of Github pages by @glitch in #1003
- Refresh akutil by @reuster986 in #996
- Add handling of errors in Parquet code and minor clean up by @bmcdonald3 in #993
- Add Arrow versioning to ak.get_config() call by @bmcdonald3 in #995
- Remove out of date comment about poor sort performance on IB by @ronawho in #998
- Closes #999: adds call to super.init for SegStringSymEntry. by @glitch in #1000
- Add in1d to list of benchmarks to run by @ronawho in #1002
- Cache the server config string instead of recreating it by @ronawho in #1004
- Closes #990: Logic error in Categorical.in1d and other methods by @pierce314159 in #1001
- Data distributions for sort benchmarking #973 by @reuster986 in #977
- update README.md with ArkoudaWeeklyCall by @mhmerrill in #1009
- Optimize Parquet reading by reading batches rather than creating a copy by @bmcdonald3 in #1014
- Drop in1d down to 1 trial by @ronawho in #1021
- Issue #914: Uses aggregator for SegString to HDF5 writes by @glitch in #1016
- Modularize build process by @bmcdonald3 in #1017
- Change Parquet C++ types from int to int64_t and reorganization by @bmcdonald3 in #1023
- 980 allow externally generated server tokens by @hokiegeek2 in #1025
- Add support for reading timestamps in Parquet files by @bmcdonald3 in #1024
- re-add sort distributions benchmark by @reuster986 in #1029
- Recommend Chapel 1.25.1 and use it for CI testing by @ronawho in #1027
- Optimize Parquet file writing with WriteBatch function by @bmcdonald3 in #1028
- Part of Issue #940: Simplify Regex Substring Search Methods by @pierce314159 in #1030
- Issue 1005 file io refactor by @glitch in #1007
Full Changelog: v2021.12.02...v2022.01.20
Release Notes v2021.12.02
Highlights
- Introduces optional support for Parquet, Issue #903
- Official move to Chapel 1.25.0 (with backwards compatibility for Chapel 1.24.x). See Issue #954 and
- General support for HDF5 1.10.x and 1.12.x (Arkouda 1-D pdarray read/write with HDF5) See Issue #975 and #979
- Chapel's `twoArrayRadixSort` is now a runtime sorting option, see #984
Minor changes
Regex functionality and Categorical perf fix
The two major updates in this release are:
- Full implementation of the Python `re` API for regex match/search/split/findall
- Fixed a performance bug where it was taking inordinately long to display the head/tail of a Categorical with a large number of codes
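For reference, these are the four operations as they behave in Python's standard `re` module on a single string. The snippet uses only the standard library; how Arkouda exposes the same operations on string arrays is not shown here.

```python
import re

pattern = re.compile(r"\d+")
text = "order 66, aisle 7"

first_at_start = pattern.match(text)      # None: text does not begin with digits
first_anywhere = pattern.search(text)     # match object for "66"
pieces = pattern.split(text)              # text between the matches
all_matches = pattern.findall(text)       # every non-overlapping match

print(first_at_start, first_anywhere.group(), pieces, all_matches)
```

The key distinction is that `match` anchors at the start of the string while `search` scans for the first occurrence anywhere.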
Additionally, this release saw the complete transition to Chapel 1.25 in both the docs and CI.
2021.10.07 fixes and improvements
Highlights
It's only been a week since the last release, but several important things have happened:
- #941 fixes a bug affecting correctness of the LSD radix sort in a corner case
- #945 implements element-wise bit operations like popcount, clz, and rotations
- #935 greatly improves performance for operations that allocate many small arrays
- #931 speeds up string regex operations by improving how string segments are localized
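The bit operations named in #945 have simple scalar definitions. The sketch below spells them out in plain Python for 64-bit values; the function names are hypothetical and this is not the Arkouda API, which applies such operations element-wise across an array.

```python
MASK64 = (1 << 64) - 1  # keep results within 64 bits


def popcount(x):
    """Number of set bits in the 64-bit value x."""
    return bin(x & MASK64).count("1")


def clz(x):
    """Count of leading zero bits in the 64-bit value x (64 when x == 0)."""
    return 64 - (x & MASK64).bit_length()


def rotl(x, n):
    """Rotate the 64-bit value x left by n bits."""
    n %= 64
    x &= MASK64
    return ((x << n) | (x >> (64 - n))) & MASK64


def rotr(x, n):
    """Rotate the 64-bit value x right by n bits."""
    return rotl(x, 64 - (n % 64))


# Element-wise application over a list stands in for an array operation.
print([popcount(v) for v in [0, 1, 255]])
```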
This is an update of the (now deleted) v2021.10.06 release to incorporate the bug fix in #949. I could not reuse the tag, so I future-dated it.
v2021.09.30
2021-09-30 Release Notes
New Functionality
- Strings regex support (Issues #894, #910, #911, #917) using `re2` library.
  - Adds `findall`, `find_locations`, `match` functions with regex support
  - Adds regex support for `contains`, `endswith`, `startswith`, `flatten`, `peel`, `rpeel`
- Issue #822 : Adds hdf5 save/load functionality for Categoricals
- Issue #919 : Hashing for general arrays
Benchmarking updates
- Results for scalability & sorts benchmark runs were added to the `runs` directory
- Issue #930 speed up answer creation for flatten benchmark
Chapel related support/improvements
- Support for Chapel v1.25.0 by updating GenSymIO to use new `subprocess.exitCode` (see PR#916)
- Use of interleave-memory if available (see PR#913)
- PR#935 memTrack/memThreshold configuration change for performance improvement
Misc
August 20, 2021 Release
Merge pull request #896 from glitch/887-derive-offsets Closes #887 : Implements an option to derive SegString offsets/segments array
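One way to picture deriving an offsets array is to scan the bytes buffer for string terminators. The sketch below assumes the strings are stored back to back as null-terminated bytes; that storage assumption and the `derive_offsets` name are illustrative and are not taken from the pull request itself.

```python
def derive_offsets(values: bytes):
    """Return the start index of each null-terminated string in values."""
    offsets = []
    start = 0
    for i, byte in enumerate(values):
        if byte == 0:          # a null byte ends the current string
            offsets.append(start)
            start = i + 1      # next string begins right after it
    return offsets


buffer = b"ab\x00cde\x00\x00f\x00"
print(derive_offsets(buffer))
```

Note that an empty string still contributes an offset, since it consists of a terminator alone.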
August 06, 2021 Release
Bugfix #883
Faster array transfers and more!