From 464180ff28e6a3f74f7c754ec01ed8a6a2f978df Mon Sep 17 00:00:00 2001 From: Devin Matthews Date: Thu, 19 Dec 2024 11:47:44 -0600 Subject: [PATCH] CHANGELOG update (1.1) --- CHANGELOG | 596 +++++++++++++++++------------------------------------- 1 file changed, 187 insertions(+), 409 deletions(-) diff --git a/CHANGELOG b/CHANGELOG index 76691e13d0..a4e2681ea4 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -1,34 +1,134 @@ -commit c2af113c7ba6d0dcc128ba36ec6e140d89180cf3 (HEAD -> master) +commit c00b9c748dbf271e5719dedd1184383efe662b44 +Author: Devin Matthews +Date: Thu Dec 19 11:47:44 2024 -0600 + + Version file update (1.1) + +commit 623f6b9524204dbf6965f34cda52b23d6a51da96 +Author: Devin Matthews +Date: Thu Dec 19 11:41:26 2024 -0600 + + ReleaseNotes.md update. + +commit a1709671a553bcfab2a16c6ed9f650bc5f1a7e84 +Author: Devin Matthews +Date: Thu Dec 19 11:41:12 2024 -0600 + + CREDITS file update. + +commit d6d2c88adebc99bd4a8720be2b2ab20a85794712 Author: Field G. Van Zee -Date: Mon May 6 13:37:47 2024 -0500 +Date: Thu Aug 8 13:34:37 2024 -0500 - Version file update (1.0) + Fixed out-of-bounds read bug in sup haswell ukr. (#824) + + Details: + - Fixed a bug in the bli_sgemmsup_rd_haswell_asm_1x16n() millikernel. + The kernel was erroneously performing an out-of-bounds read whenever + the singleton edge case loop executed (that is, whenever the k + dimension of the millikernel problem was not a multiple of 8). This + OOB error was the result of a copy-paste bug; when developing the + s1x16n function, I started from a copy of the s2x16n function, but + then failed to delete the instruction that reads the second element + of A in the code that handles the PR loop's edge case. Thanks to + @j-bm for reporting this bug in Issue #821 and helping narrow down + the cause to the rax register. + - CREDITS file update. + +commit e6f7d80c700a253e7c52a74425eb3bef00bcb3fb +Author: Field G. Van Zee +Date: Wed Jun 26 16:18:21 2024 -0500 + + Fix a bug in the piledriver microkernels. 
(#814) + + Details: + - At some point, the piledriver (and bulldozer and excavator) + microkernel tests via SDE had been removed from Travis CI testing. + This PR re-enables them. + - A bug in the piledriver complex gemm microkernels has also been + fixed. The beta*C product was not being correctly added to the A*B + product before writing back out to memory. + - Fixes #811. + - (cherry picked from commit 31ecf820b9eb3368ad907ae6b192bf7397ebc92c) + +commit dce9d2a1f903edb6b9055d269cf28be55ce86527 +Author: Field G. Van Zee +Date: Wed Jun 26 16:14:34 2024 -0500 + + Add ScaLAPACK compatibility mode. (#813) + + Details: + - Add configure options '--enable-scalapack-compat' and + '--disabled-scalapack-compat' (default disabled). + - Add a macro BLIS_{ENABLE,DISABLE}_SCALAPACK_COMPAT to bli_config.h. + - This option and macro control any changes to the API necessary to + maintain compatibility with ScaLAPACK. Currently, this only means + disabling the complex versions of syr, syr2, and symv. In the + future, other changes could be controlled by the same flag. + - Complex syr2 wasn't enabled at the same time that complex syr and + symv were. This is now corrected. + - (cherry picked from commit 415893066e966159799d96166cadcf9bb5535b1c) + + Fixed typo in 4158930; variable renames. (#815) + + Details: + - Fixed a typo in the "./configure --help" output for the ScaLAPACK + compatibility option implemented in 4158930. + - Trivial variable renames. + - (cherry picked from commit 8820f8f91efd32e38e2995e73323656ef767bbd8) -commit 5ab286f61525f8ead35ecc258305a5ccd4ee096b (origin/master, origin/HEAD) +commit 1a6772feb1749faa2b42d30ae720738087ef6967 Author: Field G. Van Zee -Date: Mon May 6 13:14:52 2024 -0500 - - Added a script to help create new rc branches. - - Details: - - Added a new script, build/start-new-rc.sh, which: - 1. Updates the version file with a new version string. - 2. Commits (locally) the version string update. - 3. 
Updates the CHANGELOG file with the output of 'git log'. - 4. Commits (locally) the CHANGLOG file update. - 5. Creates a new branch whose name is equal to "-rc0" where - is the new version string. - 6. Reminds the user to execute some final steps if everything looks - good. - This new script will help in the future when it's time to start a new - release candidate branch/lineage off of 'master'. Note that this - script is based on build/bump-version.sh (which itself may change in - the future due to changes in the way versions/releases will be handled - going forward). - -commit cad51491e8a0b306015a5a02881dc2a9b60dd8d9 +Date: Tue Jun 4 13:47:05 2024 -0500 + + Fix SyntaxWarning messages from python 3.12 (#809) + + Details: + - When using regexes in Python, certain characters need backslash + escaping, e.g.: + + regex = re.compile( '^[\s]*#include (["<])([\w\.\-/]*)([">])' ) + + However, technically escape sequences like `\s` are not valid and + should actually be double-escaped: `\\s`. Python 3.12 now warns about + such escape sequences, and in a later version these warning will be + promoted to errors. See also: + https://docs.python.org/dev/whatsnew/3.12.html#other-language-changes + The fix here is to use Python's "raw strings" to avoid + double-escaping. This issue can be checked for all files in the + current directory with the command: + + python -m compileall -d . -f -q . + + Thanks to @AngryLoki for the fix. + - (cherry picked from commit 729c57c15aa50030145ff702626c31839ded3502) + + Update CREDITS + + - (cherry picked from commit 5cbec6503de335b3b63fa5d4f388fddd3aff2b61) + +commit 49af2243c2a60ed8fedb44f237f4ec100465cd89 Author: Field G. Van Zee -Date: Tue Apr 30 16:46:54 2024 -0500 +Date: Mon May 6 14:07:33 2024 -0500 + + ReleaseNotes.md update. 
+ + Details: + - (cherry picked from commit 06dddf1e51ccff70d77ee8cb731c3217e70eb730) + + CHANGELOG update (1.0) + + Details: + - (cherry picked from commit a876918c8c79a1c3d3d95de1f283350b7249b8ae) + + Version file update (1.0) + + Details: + - (cherry picked from commit c2af113c7ba6d0dcc128ba36ec6e140d89180cf3) + +commit 7d486312c8c04afb81e2e424daf25aa65f758069 +Author: Field G. Van Zee +Date: Tue Apr 30 17:13:15 2024 -0500 Use "-i auto" by default in test/3 drivers. @@ -38,10 +138,11 @@ Date: Tue Apr 30 16:46:54 2024 -0500 script present in that directory. (Previously, the runme.sh script would use "-i native" by default.) This change was originally intended for fd1a7e3. + - (cherry picked from commit cad51491e8a0b306015a5a02881dc2a9b60dd8d9) -commit fd1a7e3ca9547718aa61c806848099705216182b +commit 5eff5f931bc97decaf318dc8efa1cbcf33a09eb5 Author: Field G. Van Zee -Date: Thu Apr 25 15:00:59 2024 -0500 +Date: Tue Apr 30 17:12:49 2024 -0500 Allow test/3 drivers to use default ind_t method. (#804) @@ -67,311 +168,11 @@ Date: Thu Apr 25 15:00:59 2024 -0500 for finding and reporting this issue. - Also added support for "nat" as a shorthand for "native", which the help text already (erroneously) claimed was supported. + - (cherry picked from commit fd1a7e3ca9547718aa61c806848099705216182b) -commit a49238e6141c96a41aa3c2a4adb0b0663d0b4968 -Author: Devin Matthews -Date: Wed Apr 24 15:07:18 2024 -0500 - - Refactor the control tree and other infrastructure (#710) - - Details: - 1. A "plugin" architecture. - - Users are now able to register new kernels, kernel preferences, and - blocksizes at runtime, directly from user applications. - - Plugins can be created, configured, and built using only an installed - version of BLIS -- no source or source code changes required. - - Plugins support both reference and optimized kernels, as well as - custom configuration-to-kernel-set mappings. 
- - Building plugins (including reference and relevant optimized kernels) - for enabled architectures or architecture families is automated, as is - linking into the final library. - - The configure script is now installed as 'configure-plugin'. In this - mode, it can be used to initialize a plugin from a template including - optional example code, and prepare a build system for compiling the - plugin into a shared or static library. - - Additional configuration files, templates, and build system components - are also installed to '%prefix%/share/blis'. - - The cntx_t struct now has extensible data structures for holding - kernels, preferences, and blocksizes. These are based on a "stack" - structure which contains a list of fixed-size data blocks. Adding a - new entry (which may require allocating a new block or reallocating - the block pointer array) requires locking, but looking up entries is - lock-free and takes O(1) time. - - Kernels can depend on either 1 or 2 type parameters (e.g. - mixed-precision packing requires 2). The func2_t struct supports - the latter, but can be implicitly cast to func_t if only "diagonal" - entries are needed. The number of type parameters can be inferred from - the kernel ID for type safety. - - Functions have been added to register new kernels, preferences, and - blocksizes with the global kernel structure (gks). This creates - corresponding entries in each allocated context and returns the next - available ID. Plugins use this API to register user kernels, although - the user is responsible for tracking the returned IDs for later - lookup. Setting newly-registered reference kernels, as well as - overriding these with optimized kernels is done in exactly the same - manner as in bli_cntx_init_ref() and bli_cntx_init_(). - - 2. Restructuring of the control and thread control trees. - - The control tree has been substantially restructured to support more - flexibility. 
- - The "default" control trees for gemm (also used for - hemm/symm/herk/her2k/syrk/syr2k/trmm/trmm3) and trsm are now - represented as a single structure containing all necessary control - tree nodes and parameters. - - An API has been added to modify the default gemm/trsm control trees. - - This same API is used by the framework and packm/gemm/trsm variants - to access specific control tree nodes. - - Users can alternatively create a custom control tree from scratch. - - The blocksizes are now encoded directly in the control tree, rather - than via loop IDs. The logic for adjusting blocksizes for certain - operations has been moved to the control tree initialization. - - Type information is encoded in the control tree to drive proper - selection of packing and computational kernels provided by the user. - - The packing microkernel now receives an opaque "params" struct which - is user-definable and can be used to pass additional information - through the call stack. - - The auxinfo_t struct has been updated with a .params field for - opaque user data as well as the global offsets of the current - microtile. - - The packm and gemm variants can be overridden by the user, and also - receive an opaque params struct via the associated control tree - node. - - The structure-aware packing kernel bli_packm_struc_cxk() is no longer - hard-coded to be called from the default packm variant, but can be - overridden by the user. It also supports mixed-precision/mixed-domain - natively now. - - The thread control tree (thrinfo_t) is now created entirely up-front - by inspecting the control tree. The required number of threads at each - level is encoded in the control tree via loop IDs (actually a bitfield - of loop IDs), although the ordering and number of such IDs is - arbitrary. The logic for adjusting the number of threads at each level - based on operation type (e.g. 
trmm) is now in the control tree - initialization and expressed by combining loop IDs from multiple - levels into a single level. - - The mem_t object containing the pack buffer pointer has been moved - from the control tree to the thread control tree. NOTE: **The control - tree is now strictly const throughout the operation, and only a - single copy is shared by all threads.** - - The thread control tree node for packing has been changed so that - there is no longer a "fake" node indicating a team of single threads. - Instead, the number of threads and thread IDs in the "normal" thread - control tree node are used. This change has also been made to the - gemmsup thread control tree and packing variants, as well as to the - gemmlike sandbox. - - Parameters controlling packing (e.g. inversion of the diagonal, - direction, schema) are not stored directly in the control tree but in - the opaque params struct. The packing control tree node and its - default params struct are stored together in the "combined" - gemm/trsm control tree structure and initialized as a unit. Users can - update these parameters individually or substitute a custom packm - variant and params struct. - - The "target" and "execution" datatypes has been removed from the obj_t - struct and replaced by type information in the control tree. - - The "sub-node" and "sub-prenode" of a control tree node have been - replaced by an arbitrary number of sub-nodes accessed by index. There - is a hard cap on the number of sub-nodes (currently 2). Sub-nodes are - added during control tree initialization, *after* - creation/initialization of the parent node through an updated API. - - The level-3 thread decorator has been significantly simplified and - directly calls bli_l3_int(). The control tree is created externally, - and it is no longer necessary to alias matrices or set object pack - schemas. Also, the rntm_t passed in may be NULL. Finally, family - and scalar information is no longer needed here. 
- - bli_l3_int() is now a simple inline function which extracts the next - control tree node and variant and calls it. - - bli_*_front() have been removed and inlined into the expert object - API with significant simplification. - - 1m (or other induced method) no longer uses an alternative cntx_t. - - The .pack_fn/.ker_fn pointers and associated params fields on the - obj_t were removed in favor of the present solution. - - 3. Overhaul of variable substitution in configure script. - - The configure script has been somewhat re-written to use a - centralized mechanism for substituting variables into build system and - other configuration files. - - All substitution variables go through the same pathway now, which - necessitated some variable naming changes for variables which were - named the same in e.g. Makefile and bli_config.h but with - different definitions. - - CC and CXX variables can now contain spaces, e.g. 'g++ -std=c++17'. - This provides better support for integration with build tooling such - as autotools. - - 4. Overhaul of packing kernels. - - Previously there were two packing kernels referenced in the cntx_t - structure for MRxk and NRxk shaped micropanels, respectively. These - have now been merged into one kernel which is responsible for packing - any dense rectangular portion of either A or B. - - The packing kernel now receives information about the register - blocksize (cdim_max) and duplication factor (the "broadcast-B" - format, although this can also apply to the A matrix). - - The structure-aware packing kernel (bli_packm_struc_cxk(), which is - now user-overridable) also receives global offsets of the current - micropanel within A or B. - - Explicit kernels for packing the diagonal blocks of - triangular/symmetric/Hermitian matrices have been added to the - cntx_t. This means that the bli_packm_struc_ckx() "kernel" no longer - needs to directly touch data (except to zero out some regions). 
- - bli_packm_struc_cxk() has also been updated to work only in terms of - fundamental elements (i.e., real datatypes) when computing offsets and - when zeroing data, which greatly simplifies mixed-domain/1m packing. - - bli_packm_scalar() has been updated to better support complex scalars - in mixed-domain operations. - - Pack schemas for PACKED_ROW_PANELS* and PACKED_COL_PANELS* have - been merged into simply PACKED_PANELS*. This reflects the merging of - the packing kernels into a single generic kernel. There were only a - very few places which needed the row/column information and this is - now supplied by alternative means. - - Packing variants always behave "as if" the A matrix were being packed - (i.e. the code assumes packing column-stored row panels). Packing of B - is handled by applying an implicit or explicit transpose before - packing. This change also applies to gemmsup. - - 5. Improved MD/MP support. - - All level-3 operations (except trsm) now support full - mixed-domain/mixed-precision operation. - - Explicit 1m packing kernels have been added in the cntx_t. - - An explicit 1m microkernel wrapper has been added to the cntx_t. - - An extra packing kernel for the "ro" format has been added, along with - the pack_t enumeration value. This supports the packing for - real*complex -> real, including potential scaling by a complex alpha, - support for structured matrices, etc. - - Extra microkernel wrappers for mixed-domain operations have been added - to support the 'ccr' (and by extension, 'crc'), 'rcc', and 'crr' - cases. Notably this includes full support for general stride storage - and complex alpha/beta. - - Packing kernels and gemm microkernels are now "templated" based on two - type parameters rather than one. For packing this allows direct - optimization of mixed-precision kernels, and for gemm microkernels - this allows direct optimization of mixed-precision without writing to - a temporary buffer. 
Reference packing kernels are directly - instantiated for all mixes of precisions, while by default - mixed-precision gemm microkernels are supported via a microkernel - wrapper. The "old" way of specifying optimized kernels using a single - type parameter works unchanged. - - alpha and beta are typecast appropriately to the computational or - output datatype, respectively, and **always** to the complex domain. - Scalar typecasting has also been added to gemmsup for safety. - - The gemm macrokernel doesn't have to do any typecasting anymore, as a - microkernel wrapper or optimized mixed-precision/mixed-domain kernel - now handles this. - - 1m and mixed-domain operations now always use a microkernel wrapper, - rather than adjusting parameters in the gemm macrokernel. - - The gemmt macrokernel **does** still have to handle explicit - write-back of microtiles which intersect the diagonal, although - typecasting has already been performed. - - The gemmt_x_ker_var2(), trmm_xx_ker_var2(), and trsm_xx_ker_var2() - functions have been removed. The appropriate macrokernel pointer is - selected during control tree initialization. - - Real domain MR/NR are checked for even-ness based on the gemm - microkernel's row preference in order to guarantee proper 1m and - mixed-domain operation. - - Full range of mixed-domain/mixed-precision functionality tested in the - testsuite ('input.*.mixed'). - - 6. Other changes: - - The build system has been updated to support C++ source files - throughout the framework. While the intent is not to add such files to - BLIS itself, this supports plugins written in C++. - - Many instances of configuration-specific code have been simplified by - introducing an INSERT_GENTCONF macro which instantiates a block of - code for each enabled sub-configuration. The ConfigurationHowTo.md - document has been updated accordingly. - - PASTEMAC?/PASTECH?/PASTEF77? 
have been removed in favor of - variadic macros which accept any number of arguments (up to a - reasonable limit). - - The INSERT_GENTFUNC* macros have been updated to clean up - mixed-precision and mixed-domain instantiations. - - bli_align_dim_to_mult() has been updated to support rounding either up - or down based on a flag. - - Checking for empty matrices and other early exits (level-3 only) has - been consolidated into a single utility function. - - The auxinfo_t struct is always passed as const. - - The new function bli_obj_alias_submatrix() aliases a matrix while also - resetting the root to NULL, offsets to zero (while adjusting the - buffer), and applying any implicit transpose. - - Level-3 pruning functions now only check matrix structure to see what - to do, not the operation family. - - gemmsup packing has been updated to use the "normal" pack buffer - allocation routines. - - Remove duplicate checks for early return from gemmsup handler. - - bli_determine_blocksize() has been significantly simplified. - - Partitioning packed panels is no longer allowed. - - Added bli_xxsame macros. - - Automated the calculation of info bit shifts and masks based on - predefined bit sizes for various flags. This greatly simplifies - reordering, adding, or removing flags from the info/info2 bitfields. - - Moved more BLIS_NUM_* macros into the corresponding enums as the - last entry so that the value is automatically computed. - - Better const-correctness in some level0 scalar macros. - - Better mixed-precision support in some level0 scalar macros. - - Added a bli_axpbys_mxn() macro. - - bli_thread_range_sub() takes explicit thread ID and number of threads - rather than a thrinfo_t node. - - "De-templated" BLIS gemmlike sandbox (specifically, bls_gemm_bp_var1() - and bls_packm_var1()). - - Combined bls_l3_packm_[ab]() into one function with thin wrappers. - - Deleted bls_packm_var[23](). 
- - Add a "termination tag" to the testsuite output so that - 'make check-blis' can accurately check for successful completion. - - Add a new function to centrally compute FLOPs for level-3 operations - in the testsuite. - -commit a316d2c6c33fc1f8f7c58c4210ab203f48349041 -Author: Devin Matthews -Date: Thu Mar 28 12:52:00 2024 -0500 - - Fix incorrect commenting of `BLIS_RNTM_INITIALIZER` and `BLIS_OBJECT_INITIALIZER`. - -commit 664cc6bc3ea610b4ecea63d78c6024c48f045635 -Author: Devin Matthews -Date: Tue Mar 26 16:25:17 2024 -0500 - - Update BLIS_*_INITIALIZER macros for C++ compatibility. (#802) - - Details: - - Remove designated initializer syntax. This isn't officially supported - until C++20. - - Arrange initializers in the order in which they are defined in the - struct. Even with standard or extension support for designated - initializers, initializing non-static members out-of-order is an - error in C++. - - Remove the conditional code which uses '-1' as the default value of - the 'pack_buf' member of 'mem_t' in C, but 'BLIS_BUFFER_FOR_GEN_USE' - in C++. Simply use the latter as a common-sense default. - -commit 1a8c8180b32cf5988bf9eb5d2f0f8111a729993a -Author: John <50754967+j-bm@users.noreply.github.com> -Date: Thu Feb 15 12:35:10 2024 -0400 - - Add cpu part codes for various manufacturers and use in the code (#794) - - * Add cpu_id symbols for arm v8. - - * Add symbols for arm v7. - - * Always assume firestorm on Apple aarch64. - - * Fixes incorrect usage of model vs. part in some places. - - * Fixes #793 - - --------- - - Co-authored-by: J - -commit c382d8bdccc07e22a341fe04960f0cbf4eec083b -Author: Igor Zhuravlov -Date: Sun Jan 14 04:03:31 2024 +0000 - - Fix errors and typos in docs/BLIS*API.md (#791) - - Details: - - Fixed errors and unified formatting in docs/BLIS*API.md docs. - -commit a72e4569f2a03cc3578c019bf7ce25491a44137d +commit 968c9be404763b48e72f218598c7edd2bd571780 Author: Field G. 
Van Zee -Date: Wed Dec 6 18:21:47 2023 -0600 +Date: Tue Dec 12 13:34:31 2023 -0600 Include bli_config.h before bli_system.h in cblas.h. (#789) @@ -382,10 +183,11 @@ Date: Wed Dec 6 18:21:47 2023 -0600 affected cblas.h -- blis.h had been correctly #including bli_config.h before bli_system.h since fb93d24. Thanks to Edward Smyth for reporting this bug and suggesting the fix. + - (cherry picked from commit a72e4569f2a03cc3578c019bf7ce25491a44137d) -commit 1236ddab455ef3a6293ab394ff06b3a19c2913d9 +commit 4e68494012722323212139647cdfb944553a4842 Author: Field G. Van Zee -Date: Sun Dec 3 16:42:34 2023 -0600 +Date: Tue Dec 12 13:34:06 2023 -0600 Fixed random segfault in test/3 drivers. (#788) @@ -401,10 +203,11 @@ Date: Sun Dec 3 16:42:34 2023 -0600 those strings was a NULL pointer. I'm not sure how this code ever worked to begin with. Special thanks to Leick Robinson for finding and reporting this bug. + - (cherry picked from commit 1236ddab455ef3a6293ab394ff06b3a19c2913d9) -commit 141a6c9a8e7557d9c7d28aecedec9dc5377dba13 +commit 8109d18972c4211909ffa978f7a37db669ddc8b0 Author: Field G. Van Zee -Date: Tue Nov 21 12:26:43 2023 -0600 +Date: Tue Dec 12 13:32:11 2023 -0600 Install helper headers to INCDIR prefix. (#787) @@ -426,34 +229,9 @@ Date: Tue Nov 21 12:26:43 2023 -0600 - Harmonized the rule in the top-level Makefile for installing blis.pc into SHAREDIR/pkgconfig with conventions for others vis-a-vis verbosity/non-verbosity. + - (cherry picked from commit 141a6c9a8e7557d9c7d28aecedec9dc5377dba13) -commit 2d9439298b336aa6d0ee000a5285a3adb4e6d462 -Author: Devin Matthews -Date: Tue Nov 21 12:18:07 2023 -0600 - - Allow users to defines [sd]complex using std::complex (#784) - - Details: - - In C++ applications, it makes a lot of sense to interface to BLIS - using C++'s standard complex number library, which uses a template - class std::complex. Obviously BLIS doesn't know anything about this - and defaults to a custom struct to represent complex numbers. 
This PR - updates the bli_[cz]{real,imag}() functions to accept std::complex - numbers when a C++ compiler is being used. Note that this has no - effect on the compilation of the BLIS library (or testsuite), and only - comes into play when including blis.h into a C++ project and forcing - the use of std::complex for scomplex and dcomplex. - - The application can explicitly request std:complex-based types via: - - #define BLIS_ENABLE_STD_COMPLEX - #include - // Call BLIS functions using std::complex here. - - - Fixed a bug in the definition of some scalar level-0 macros, since - bli_creal()/bli_cimag() and bli_zreal()/bli_zimag() are no longer - interchangeable. - -commit f7ce54a252028483e4c6af619015eb22063d5541 (origin/1.0-rc0) +commit f7ce54a252028483e4c6af619015eb22063d5541 Author: Field G. Van Zee Date: Fri Nov 3 15:52:57 2023 -0500 @@ -481,7 +259,7 @@ Date: Fri Nov 3 13:30:31 2023 -0700 - Special thanks to Lee Killough, Devin Matthews, and Angelika Schwarz for their engagement on this commit. -commit 7a87e57b69d697a9b06231a5c0423c00fa375dc1 (origin/10.0-rc0) +commit 7a87e57b69d697a9b06231a5c0423c00fa375dc1 Author: Srinivas Yadav <43375352+srinivasyadav18@users.noreply.github.com> Date: Sat Oct 14 02:05:41 2023 -0500 @@ -827,7 +605,7 @@ Date: Wed Jul 26 14:37:08 2023 -0500 performance in BLIS. - Move RISC-V autodetect header files to build/detect/riscv/. -commit a0b04e3c007f1207e5678bf20c07752906742fb7 (origin/aocl-blas, aocl-blas) +commit a0b04e3c007f1207e5678bf20c07752906742fb7 Author: Field G. Van Zee Date: Mon Jun 26 17:59:21 2023 -0500 @@ -1165,7 +943,7 @@ Date: Fri Mar 24 20:05:13 2023 -0500 scheduled to be built. Thanks to Nick Knight for reporting this issue. - CREDITS file update. 
-commit 72c37eb80f964b7840377076e5009aec5b29d320 (origin/riscv) +commit 72c37eb80f964b7840377076e5009aec5b29d320 Author: Lee Killough <15950023+leekillough@users.noreply.github.com> Date: Thu Mar 23 16:01:55 2023 -0500 @@ -2956,7 +2734,7 @@ Date: Wed Apr 13 15:59:06 2022 -0500 initialization statements. - Whitespace changes. -commit ae10d9495486f589ed0320f0151b2d195574f1cf (origin/amd) +commit ae10d9495486f589ed0320f0151b2d195574f1cf Author: Devin Matthews Date: Wed Apr 6 20:31:11 2022 -0500 @@ -3008,7 +2786,7 @@ Date: Fri Apr 1 08:12:06 2022 -0500 CHANGELOG update (0.9.0) -commit 14c86f66b20901b60ee276da355c1b62642c18d2 (tag: 0.9.0) +commit 14c86f66b20901b60ee276da355c1b62642c18d2 Author: Field G. Van Zee Date: Fri Apr 1 08:12:06 2022 -0500 @@ -4066,7 +3844,7 @@ Date: Thu Oct 7 13:47:22 2021 -0500 ARMv8 PACKM and GEMMSUP Kernels + Apple Firestorm Subconfig -commit 2329d99016fe1aeb86da4552295f497543cea311 (origin/1m_row_col_problem) +commit 2329d99016fe1aeb86da4552295f497543cea311 Author: Devin Matthews Date: Thu Oct 7 12:37:58 2021 -0500 @@ -5886,7 +5664,7 @@ Date: Mon Mar 22 17:42:33 2021 -0500 CHANGELOG update (0.8.1) -commit 8535b3e11d2297854991c4272932ce4974dda629 (tag: 0.8.1) +commit 8535b3e11d2297854991c4272932ce4974dda629 Author: Field G. Van Zee Date: Mon Mar 22 17:42:33 2021 -0500 @@ -6433,7 +6211,7 @@ Date: Thu Nov 19 13:33:37 2020 -0600 CHANGELOG update (0.8.0) -commit 9b387f6d5a010969727ec583c0cdd067a5274ed8 (tag: 0.8.0) +commit 9b387f6d5a010969727ec583c0cdd067a5274ed8 Author: Field G. Van Zee Date: Thu Nov 19 13:33:37 2020 -0600 @@ -7613,7 +7391,7 @@ Date: Tue Apr 7 14:41:45 2020 -0500 CHANGELOG update (0.7.0) -commit 68b88aca6692c75a9f686187e6c4a4e196ae60a9 (tag: 0.7.0) +commit 68b88aca6692c75a9f686187e6c4a4e196ae60a9 Author: Field G. 
Van Zee Date: Tue Apr 7 14:41:44 2020 -0500
@@ -7992,7 +7770,7 @@ Date: Tue Jan 14 16:01:34 2020 -0600
     CHANGELOG update (0.6.1)
-commit 10949f528c5ffc5c3a2cad47fe16a802afb021be (tag: 0.6.1)
+commit 10949f528c5ffc5c3a2cad47fe16a802afb021be
 Author: Field G. Van Zee
 Date: Tue Jan 14 16:01:33 2020 -0600
@@ -9383,7 +9161,7 @@ Date: Mon Jun 3 18:37:20 2019 -0500
     CHANGELOG update (0.6.0)
-commit 18c876b989fd0dcaa27becd14e4f16bdac7e89b3 (tag: 0.6.0)
+commit 18c876b989fd0dcaa27becd14e4f16bdac7e89b3
 Author: Field G. Van Zee
 Date: Mon Jun 3 18:37:19 2019 -0500
@@ -9471,7 +9249,7 @@ Date: Fri May 31 12:22:44 2019 +0530
     Change-Id: I24fd0bf99216f315e49f1c74c44c3feaffd7078d
-commit abd8a9fa7df4569aa2711964c19888b8e248901f (origin/pfhp)
+commit abd8a9fa7df4569aa2711964c19888b8e248901f
 Author: Field G. Van Zee
 Date: Tue May 28 12:49:44 2019 -0500
@@ -10275,7 +10053,7 @@ Date: Tue Mar 19 17:07:20 2019 -0500
     CHANGELOG update (0.5.2)
-commit 9204cd0cb0cc27790b8b5a2deb0233acd9edeb9b (tag: 0.5.2)
+commit 9204cd0cb0cc27790b8b5a2deb0233acd9edeb9b
 Author: Field G. Van Zee
 Date: Tue Mar 19 17:07:18 2019 -0500
@@ -11392,7 +11170,7 @@ Date: Tue Dec 18 14:56:20 2018 -0600
     CHANGELOG update (0.5.1)
-commit e0408c3ca3d53bc8e6fedac46ea42c86e06c922d (tag: 0.5.1)
+commit e0408c3ca3d53bc8e6fedac46ea42c86e06c922d
 Author: Field G. Van Zee
 Date: Tue Dec 18 14:56:16 2018 -0600
@@ -12130,13 +11908,13 @@ Date: Fri Oct 26 17:07:15 2018 -0500
       output.
     - Very minor edits to docs/MixedDatatypes.md.
-commit e90e7f309b3f2760a01e8e09a29bf702754fa2b5 (origin/win-pthreads)
+commit e90e7f309b3f2760a01e8e09a29bf702754fa2b5
 Author: Field G. Van Zee
 Date: Thu Oct 25 14:09:43 2018 -0500
     CHANGELOG update (0.5.0)
-commit be7c57819cfd48adb175d9a480cc9f37928645c1 (tag: 0.5.0)
+commit be7c57819cfd48adb175d9a480cc9f37928645c1
 Author: Field G. Van Zee
 Date: Thu Oct 25 14:09:40 2018 -0500
@@ -13446,7 +13224,7 @@ Date: Thu Aug 30 15:14:02 2018 -0500
     CHANGELOG update (0.4.1)
-commit 10fd614031307c46db3d893528d4e5fc31f490b3 (tag: 0.4.1)
+commit 10fd614031307c46db3d893528d4e5fc31f490b3
 Author: Field G. Van Zee
 Date: Thu Aug 30 15:13:59 2018 -0500
@@ -14042,7 +13820,7 @@ Date: Fri Jul 27 16:10:46 2018 -0500
     CHANGELOG update (0.4.0)
-commit 4ad61ce905d250dd3ef197f0d06a69ce6d99d309 (tag: 0.4.0)
+commit 4ad61ce905d250dd3ef197f0d06a69ce6d99d309
 Author: Field G. Van Zee
 Date: Fri Jul 27 16:10:43 2018 -0500
@@ -16233,7 +16011,7 @@ Date: Sat Apr 28 14:07:34 2018 -0500
     CHANGELOG update (0.3.2)
-commit 2fb440876690bdcec0c11a30e2b33ad100bab529 (tag: 0.3.2)
+commit 2fb440876690bdcec0c11a30e2b33ad100bab529
 Author: Field G. Van Zee
 Date: Sat Apr 28 14:07:31 2018 -0500
@@ -16647,7 +16425,7 @@ Date: Wed Apr 4 17:13:15 2018 -0500
     CHANGELOG update (0.3.1)
-commit 1f28d7c86e17730f05bd239c8e8d67e3e7510a4f (tag: 0.3.1)
+commit 1f28d7c86e17730f05bd239c8e8d67e3e7510a4f
 Author: Field G. Van Zee
 Date: Wed Apr 4 17:13:15 2018 -0500
@@ -17286,7 +17064,7 @@ Date: Wed Feb 28 15:30:14 2018 -0600
       bli_cgemm_zen_asm_3x8() and bli_zgemm_zen_asm_3x4(), in bli_cntx_init_zen.c. This was actually intended for 1681333.
-commit 709f8361ebc90b96b02ebe5c5ffb6fc3b1b25e58 (tag: 0.3.0)
+commit 709f8361ebc90b96b02ebe5c5ffb6fc3b1b25e58
 Author: Field G. Van Zee
 Date: Fri Feb 23 17:42:48 2018 -0600
@@ -17334,7 +17112,7 @@ Date: Fri Feb 23 16:33:32 2018 -0600
       contained. To remedy this situation, we now selectively use movss to load any element that could be the last element in the matrix.
-commit 5112e1859e7f8888f5555eb7bc02bd9fab9b4442 (origin/rt)
+commit 5112e1859e7f8888f5555eb7bc02bd9fab9b4442
 Author: Field G. Van Zee
 Date: Fri Feb 23 14:31:26 2018 -0600
@@ -17615,7 +17393,7 @@ Date: Sat Dec 23 15:32:03 2017 -0600
       is used by the auto-detection script to printf() the name of the sub-configuration corresponding to the detected hardware.
-commit 9804adfd405056ec332bb8e13d68c7b52bd3a6c1 (origin/selfinit)
+commit 9804adfd405056ec332bb8e13d68c7b52bd3a6c1
 Author: Field G. Van Zee
 Date: Thu Dec 21 19:22:57 2017 -0600
@@ -17752,7 +17530,7 @@ Date: Thu Dec 14 11:27:19 2017 -0600
     Merge branch 'master' into selfinit
-commit a32e8a47c022b6071302b2956af5728976c83ca9 (origin/travis)
+commit a32e8a47c022b6071302b2956af5728976c83ca9
 Author: Field G. Van Zee
 Date: Wed Dec 13 16:31:36 2017 -0600
@@ -20145,7 +19923,7 @@ Date: Tue May 2 16:38:43 2017 -0500
     CHANGELOG update (0.2.2)
-commit 940a707ac78de975110e17c95765e65b89aa5e10 (tag: 0.2.2)
+commit 940a707ac78de975110e17c95765e65b89aa5e10
 Author: Field G. Van Zee
 Date: Tue May 2 16:38:42 2017 -0500
@@ -21165,7 +20943,7 @@ Date: Wed Oct 5 14:41:35 2016 -0500
     CHANGELOG update (0.2.1)
-commit 866b2dde3f41760121115fb25f096d4344e8b4f9 (tag: 0.2.1)
+commit 866b2dde3f41760121115fb25f096d4344e8b4f9
 Author: Field G. Van Zee
 Date: Wed Oct 5 14:41:34 2016 -0500
@@ -21178,7 +20956,7 @@ Date: Wed Oct 5 13:35:01 2016 -0500
     Merge branch 'compose'
-commit 6f71cd344951854e4cff9ea21bbdfe536e72611d (origin/compose)
+commit 6f71cd344951854e4cff9ea21bbdfe536e72611d
 Merge: c0630c40 8d55033c
 Author: Field G. Van Zee
 Date: Tue Oct 4 15:53:46 2016 -0500
@@ -22865,7 +22643,7 @@ Date: Mon Apr 11 17:32:13 2016 -0500
     CHANGELOG update (0.2.0)
-commit 898614a555ea0aa7de4ca07bb3cb8f5708b6a002 (tag: 0.2.0)
+commit 898614a555ea0aa7de4ca07bb3cb8f5708b6a002
 Author: Field G. Van Zee
 Date: Mon Apr 11 17:32:09 2016 -0500
@@ -23678,7 +23456,7 @@ Date: Wed Jul 29 13:31:12 2015 -0500
     CHANGELOG update (0.1.8)
-commit 47caa33485b91ea6f2a5e386e61210c90c5f489f (tag: 0.1.8)
+commit 47caa33485b91ea6f2a5e386e61210c90c5f489f
 Author: Field G. Van Zee
 Date: Wed Jul 29 13:31:09 2015 -0500
@@ -23726,7 +23504,7 @@ Date: Fri Jun 19 12:01:50 2015 -0500
     CHANGELOG update (0.1.7)
-commit 267253de8a7be546ce87626443ee38701c1d411f (tag: 0.1.7)
+commit 267253de8a7be546ce87626443ee38701c1d411f
 Author: Field G. Van Zee
 Date: Fri Jun 19 12:01:49 2015 -0500
@@ -24541,7 +24319,7 @@ Date: Thu Oct 23 11:35:48 2014 -0500
     CHANGELOG update (0.1.6)
-commit 38ea5022e4ed846112198c4e1672fcdaeb90dc71 (tag: 0.1.6)
+commit 38ea5022e4ed846112198c4e1672fcdaeb90dc71
 Author: Field G. Van Zee
 Date: Thu Oct 23 11:35:45 2014 -0500
@@ -25491,7 +25269,7 @@ Date: Mon Aug 4 16:01:59 2014 -0500
     CHANGELOG update (0.1.5)
-commit bde56d0ecfd0ec20330fac290b91a6dca0cf94e9 (tag: 0.1.5)
+commit bde56d0ecfd0ec20330fac290b91a6dca0cf94e9
 Author: Field G. Van Zee
 Date: Mon Aug 4 16:01:58 2014 -0500
@@ -25640,7 +25418,7 @@ Date: Sun Jul 27 18:20:13 2014 -0500
     CHANGELOG update (0.1.4)
-commit a7537071b152ecff671f8716595d37dc09e4fd51 (tag: 0.1.4)
+commit a7537071b152ecff671f8716595d37dc09e4fd51
 Author: Field G. Van Zee
 Date: Sun Jul 27 18:20:12 2014 -0500
@@ -26039,7 +25817,7 @@ Date: Mon Jun 23 13:48:17 2014 -0500
     CHANGELOG update (0.1.3)
-commit 036cc634918463b1caa0fd89c9a211f2f5639af7 (tag: 0.1.3)
+commit 036cc634918463b1caa0fd89c9a211f2f5639af7
 Author: Field G. Van Zee
 Date: Mon Jun 23 13:48:17 2014 -0500
@@ -26172,7 +25950,7 @@ Date: Thu Jun 5 10:54:16 2014 -0500
     CHANGELOG update (for 0.1.2).
-commit 00f232f8ed1f7c41619b12ebf779ebe2c3b2d3cd (tag: 0.1.2)
+commit 00f232f8ed1f7c41619b12ebf779ebe2c3b2d3cd
 Author: Tyler Smith
 Date: Mon Jun 2 13:40:57 2014 -0500
@@ -26800,7 +26578,7 @@ Date: Tue Feb 25 17:58:42 2014 -0600
     CHANGELOG update (for 0.1.1).
-commit fde5f1fdece19881f50b142e8611b772a647e6d2 (tag: 0.1.1)
+commit fde5f1fdece19881f50b142e8611b772a647e6d2
 Author: Field G. Van Zee
 Date: Tue Feb 25 13:34:56 2014 -0600
@@ -27714,7 +27492,7 @@ Date: Mon Nov 11 10:15:40 2013 -0600
     CHANGELOG update (for 0.1.0).
-commit 089048d5895a30221b6b1976c9be93ad6443420d (tag: 0.1.0)
+commit 089048d5895a30221b6b1976c9be93ad6443420d
 Author: Field G. Van Zee
 Date: Sat Nov 9 17:18:00 2013 -0600
@@ -28429,7 +28207,7 @@ Date: Fri Jul 19 17:15:03 2013 -0500
     CHANGELOG update (for 0.0.9).
-commit 0680916fdd532f7a4716b11a2515243b2c08d00f (tag: 0.0.9)
+commit 0680916fdd532f7a4716b11a2515243b2c08d00f
 Author: Field G. Van Zee
 Date: Thu Jul 18 18:04:34 2013 -0500
@@ -28667,7 +28445,7 @@ Date: Wed Jun 12 16:40:04 2013 -0500
     CHANGELOG update.
-commit 5b641c3bab31eac6a1795b9f6e3f86c59651ca50 (tag: 0.0.8)
+commit 5b641c3bab31eac6a1795b9f6e3f86c59651ca50
 Author: Field G. Van Zee
 Date: Wed Jun 12 16:02:12 2013 -0500
@@ -28854,7 +28632,7 @@ Date: Wed May 1 15:00:30 2013 -0500
     CHANGELOG update.
-commit 6bfa96f84887dec0b4cf8be5d38dd634c2f8951d (tag: 0.0.7)
+commit 6bfa96f84887dec0b4cf8be5d38dd634c2f8951d
 Author: Field G. Van Zee
 Date: Tue Apr 30 19:35:54 2013 -0500
@@ -29236,7 +29014,7 @@ Date: Sat Apr 13 16:53:16 2013 -0500
     CHANGELOG update.
-commit ec16c52f2ecf419c749175ce0a297441c10f1c68 (tag: 0.0.6)
+commit ec16c52f2ecf419c749175ce0a297441c10f1c68
 Author: Field G. Van Zee
 Date: Sat Apr 13 16:41:16 2013 -0500
@@ -29546,7 +29324,7 @@ Date: Sun Mar 24 20:18:12 2013 -0500
     CHANGELOG update.
-commit b65cdc57d9e51fa00e3c03539cfb7e045707d0f4 (tag: 0.0.5)
+commit b65cdc57d9e51fa00e3c03539cfb7e045707d0f4
 Author: Field G. Van Zee
 Date: Sun Mar 24 20:01:49 2013 -0500
@@ -29650,7 +29428,7 @@ Date: Mon Mar 18 10:37:03 2013 -0500
     CHANGELOG update.
-commit e7d41229d3b1674e74f47d7f29fae004a745201a (tag: 0.0.4)
+commit e7d41229d3b1674e74f47d7f29fae004a745201a
 Author: Field G. Van Zee
 Date: Fri Mar 15 17:12:36 2013 -0500
@@ -29778,7 +29556,7 @@ Date: Fri Feb 22 12:38:45 2013 -0600
       configuration directory (bl2_config.h, specifically) given that it can be expected to be tweaked by some developers.
-commit ede75693e5a36c6006087c4a7df834175b604504 (tag: 0.0.3)
+commit ede75693e5a36c6006087c4a7df834175b604504
 Author: Field G. Van Zee
 Date: Fri Feb 22 12:11:24 2013 -0600
@@ -29988,7 +29766,7 @@ Date: Mon Feb 11 13:38:07 2013 -0600
     CHANGELOG update.
-commit 768fcebaa8be0eb936a6e7a02cd8a19438c79d99 (tag: 0.0.2)
+commit 768fcebaa8be0eb936a6e7a02cd8a19438c79d99
 Author: Field G. Van Zee
 Date: Mon Feb 11 13:20:44 2013 -0600
@@ -30230,7 +30008,7 @@ Date: Mon Dec 10 17:23:32 2012 -0600
     Minor updates towards to 0.0.1.
-commit 7ad4ebef38b8e6eea9b6091844ba7294ec870271 (tag: 0.0.1)
+commit 7ad4ebef38b8e6eea9b6091844ba7294ec870271
 Author: Field G. Van Zee
 Date: Mon Dec 10 16:18:40 2012 -0600
@@ -30298,7 +30076,7 @@ Date: Thu Dec 6 14:27:11 2012 -0600
     Wrote first draft of INSTALL file.
-commit bcbe81235a35ccfdbcc2f2319a0ca6e04f75a785 (tag: 0.0.0)
+commit bcbe81235a35ccfdbcc2f2319a0ca6e04f75a785
 Author: Field G. Van Zee
 Date: Thu Dec 6 12:42:35 2012 -0600