Vector comparisons #2119

willdealtry · 2025-01-14T21:24:29Z

Implement some portable vectorized comparisons

github-actions · 2025-03-04T15:45:14Z

Label error. Requires exactly 1 of: patch, minor, major. Found:

alexowens90 · 2025-03-11T10:26:47Z

cpp/arcticdb/CMakeLists.txt

-        list(APPEND arcticdb_core_libraries ${KERBEROS_LIBRARY})
-        list(APPEND arcticdb_core_includes  ${KERBEROS_INCLUDE_DIR})
+        #list(APPEND arcticdb_core_libraries ${KERBEROS_LIBRARY})
+        #list(APPEND arcticdb_core_includes  ${KERBEROS_INCLUDE_DIR})


alexowens90 · 2025-03-11T10:28:13Z

cpp/arcticdb/CMakeLists.txt

@@ -1097,7 +1105,7 @@ if(${TEST})
            util/test/rapidcheck_string_pool.cpp
            util/test/rapidcheck_main.cpp
            util/test/rapidcheck_lru_cache.cpp
-            version/test/rapidcheck_version_map.cpp)
+            version/test/rapidcheck_version_map.cpp util/test/test_min_max_integer.cpp)


Not needed here, although this code is a good candidate for rapidcheck testing

alexowens90 · 2025-03-11T10:32:24Z

cpp/arcticdb/column_store/block.hpp

-    static_assert(HeaderSize == Align);
-    uint8_t data_[MinSize];
+    static const size_t DataAlignment = 64;
+    static const size_t PadSize = (DataAlignment - (HeaderDataSize % DataAlignment)) % DataAlignment;


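The outer mod operator is a no-op unless HeaderDataSize is an exact multiple of DataAlignment. Otherwise (DataAlignment - (HeaderDataSize % DataAlignment)) < DataAlignment, so (DataAlignment - (HeaderDataSize % DataAlignment)) % DataAlignment == (DataAlignment - (HeaderDataSize % DataAlignment)). In the exact-multiple case the inner expression equals DataAlignment and the outer mod maps it to 0.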
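
The arithmetic in this comment can be checked at compile time. The following is a minimal sketch, not code from the PR: the helper names and the header sizes (40, 64) are hypothetical, since the real HeaderDataSize is not shown in the diff. It suggests the outer `%` only changes the result when the header size is an exact multiple of the alignment.

```cpp
#include <cstddef>

// Hypothetical helper mirroring the PadSize expression from the diff.
constexpr std::size_t pad_size(std::size_t header_size, std::size_t alignment) {
    return (alignment - (header_size % alignment)) % alignment;
}

// The same expression with the outer modulo dropped.
constexpr std::size_t pad_size_no_outer_mod(std::size_t header_size, std::size_t alignment) {
    return alignment - (header_size % alignment);
}

// Header size that is not a multiple of the alignment: both forms agree.
static_assert(pad_size(40, 64) == 24, "40 + 24 == 64");
static_assert(pad_size_no_outer_mod(40, 64) == 24, "same result without outer mod");

// Header size that is an exact multiple: the forms differ (0 versus a full 64 bytes).
static_assert(pad_size(64, 64) == 0, "no padding needed");
static_assert(pad_size_no_outer_mod(64, 64) == 64, "a whole extra alignment unit");
```

If HeaderDataSize can never be a multiple of DataAlignment, the two forms are interchangeable; whether that holds for the real header is an assumption that would need checking against block.hpp.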

alexowens90 · 2025-03-11T10:34:35Z

cpp/arcticdb/column_store/chunked_buffer.cpp

@@ -68,7 +68,7 @@ std::vector<ChunkedBufferImpl<BlockSize>> split(const ChunkedBufferImpl<BlockSiz
 }

 template std::vector<ChunkedBufferImpl<64>> split(const ChunkedBufferImpl<64>& input, size_t nbytes);
-template std::vector<ChunkedBufferImpl<3968>> split(const ChunkedBufferImpl<3968>& input, size_t nbytes);
+template std::vector<ChunkedBufferImpl<4032ul>> split(const ChunkedBufferImpl<4032ul>& input, size_t nbytes);


Was the previous value just too small by mistake?

If any data has been written to storage in 3968-byte blocks, will it still be readable without further modifications?

alexowens90 · 2025-03-11T10:35:57Z

cpp/arcticdb/column_store/test/test_column_data_random_accessor.cpp

@@ -19,8 +19,7 @@ class ColumnDataRandomAccessorTest : public testing::Test {
        input_data.resize(n);
        std::iota(input_data.begin(), input_data.end(), 42);
    }
-    // 3968 bytes == 496 int64s per block, so 3 blocks here
-    size_t n{1000};
+    size_t n{1100};


The comment is useful; it's now 504 int64s per block.
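
For reference, the block counts behind this comment can be worked out as follows. This is a sketch with hypothetical helper names, and it assumes each block is filled completely before the next one starts, which the diff does not confirm.

```cpp
#include <cstddef>
#include <cstdint>

// Number of int64 values that fit in one block of the given byte size.
constexpr std::size_t values_per_block(std::size_t block_bytes) {
    return block_bytes / sizeof(std::int64_t);
}

// Blocks needed to hold n int64 values (ceiling division).
constexpr std::size_t blocks_needed(std::size_t n, std::size_t block_bytes) {
    return (n + values_per_block(block_bytes) - 1) / values_per_block(block_bytes);
}

// Old block size: 496 values per block, so n = 1000 spans 3 blocks.
static_assert(values_per_block(3968) == 496, "3968 / 8");
static_assert(blocks_needed(1000, 3968) == 3, "992 < 1000 <= 1488");

// New block size: 504 values per block, so n = 1000 would span only 2 blocks,
// while n = 1100 spans 3 again.
static_assert(values_per_block(4032) == 504, "4032 / 8");
static_assert(blocks_needed(1000, 4032) == 2, "1000 <= 1008");
static_assert(blocks_needed(1100, 4032) == 3, "1008 < 1100 <= 1512");
```

Under that assumption, raising n from 1000 to 1100 keeps the test at 3 blocks, which is what an updated comment could state.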

alexowens90 · 2025-03-11T11:08:32Z

cpp/arcticdb/util/min_max_integer.hpp

+        constexpr size_t lane_count = sizeof(VectorType) / sizeof(T);
+
+        T init_min = std::numeric_limits<T>::max();
+        T init_max = std::numeric_limits<T>::min();


It smells a bit that there are max-related variables when only min is being computed, and vice versa, but I can't see a refactor that removes them without a lot of code duplication.

alexowens90 · 2025-03-11T11:09:58Z

cpp/arcticdb/util/min_max_integer.hpp

+
+        VectorType vector_min, vector_max;
+        if constexpr (ComputeMin) {
+            T* lanes = reinterpret_cast<T*>(&vector_min);


I don't think this is needed; vector_min[i] should do the obvious thing:
https://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html
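
A small sketch of what direct subscripting looks like with the GCC/Clang vector extensions. The type alias, lane width (16 bytes, int32 lanes) and helper names are illustrative assumptions, not the PR's actual definitions.

```cpp
#include <cstddef>
#include <cstdint>

// GCC/Clang vector extension: four int32 lanes in a 16-byte vector.
typedef std::int32_t VectorType __attribute__((vector_size(16)));

constexpr std::size_t lane_count = sizeof(VectorType) / sizeof(std::int32_t);

// Fill every lane by subscripting the vector directly, with no reinterpret_cast.
inline VectorType broadcast(std::int32_t value) {
    VectorType v = {};
    for (std::size_t i = 0; i < lane_count; ++i) {
        v[i] = value;
    }
    return v;
}

// Read the lanes back the same way to reduce the vector to a scalar minimum.
inline std::int32_t horizontal_min(const VectorType& v) {
    std::int32_t result = v[0];
    for (std::size_t i = 1; i < lane_count; ++i) {
        if (v[i] < result) {
            result = v[i];
        }
    }
    return result;
}
```

This only compiles where the vector extensions are available (GCC and Clang), which is presumably what the HAS_VECTOR_EXTENSIONS guard elsewhere in the PR is for.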

alexowens90 · 2025-03-11T11:14:48Z

cpp/arcticdb/util/test/test_min_max_integer.cpp

+
+#ifdef HAS_VECTOR_EXTENSIONS
+
+class MinMaxStressTest : public ::testing::Test {


Could move to a benchmark

alexowens90 · 2025-03-11T11:15:38Z

cpp/arcticdb/util/test/test_min_max_integer.cpp

There's a lot of duplication in here. Could we parametrize across supported types, and then move the random tests to rapidcheck?
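
To illustrate the parametrization idea without depending on a test framework, here is a framework-free sketch: one test body written once as a template and instantiated for each type. `find_min` is a hypothetical stand-in for the function under test, and the helper names are made up; in the real test file this would more likely be a googletest typed test suite.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in for the function under test.
template <typename T>
T find_min(const T* data, std::size_t n) {
    return *std::min_element(data, data + n);
}

// A single test body, instantiated once per integer type.
template <typename T>
bool min_of_ascending_sequence_is_first_element() {
    const T start = std::numeric_limits<T>::min();
    std::vector<T> data(100);
    for (std::size_t i = 0; i < data.size(); ++i) {
        // start + 99 fits in every integer type used below, including int8_t.
        data[i] = static_cast<T>(start + static_cast<T>(i));
    }
    return find_min(data.data(), data.size()) == start;
}

// Run the same body across a list of types (C++17 fold expression).
template <typename... Ts>
bool run_for_all_types() {
    return (min_of_ascending_sequence_is_first_element<Ts>() && ...);
}
```

A call such as `run_for_all_types<std::int8_t, std::uint8_t, std::int64_t, std::uint64_t>()` then covers every listed type with a single test definition.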

alexowens90 · 2025-03-11T11:25:37Z

cpp/arcticdb/util/min_max_float.hpp

+    return FloatExtremumFinder<T, FloatMaxComparator<T>>::find(data, n);
+}
+
+#else


Same comments below the #else as for min_max_integer.hpp

willdealtry changed the title ~~Vector min max~~ Vector comparisons Jan 14, 2025

willdealtry force-pushed the vector_min_max branch from df5ab5d to 5e46644 Compare March 4, 2025 15:45

willdealtry marked this pull request as ready for review March 4, 2025 16:16

willdealtry requested review from alexowens90 and poodlewars as code owners March 4, 2025 16:16

willdealtry added 5 commits March 5, 2025 22:41

Refactor aggregator set data, add statistics

aefdffb

Vectorized min and max

16f22a1

test file

06046cc

Make windows great again

8ee1854

Fix cpp tests

ecf54d5

willdealtry force-pushed the vector_min_max branch from f4a95ea to ecf54d5 Compare March 5, 2025 22:41

willdealtry added 3 commits March 6, 2025 12:16

More tests

a21f35b

Don't run stress tests on Windows

75ed893

Wrong test

239e9e4

alexowens90 reviewed Mar 11, 2025
