Greptile Summary
This PR reorganizes the math library by splitting monolithic architecture-specific implementations into separate files per CPU instruction set (SSE, AVX, AVX2, AVX512, NEON) with runtime dispatching based on CPU features. Key changes:
Critical issue:
Confidence Score: 1/5
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Application Code] --> B[Dispatch Layer<br/>*_dispatch.cc]
B --> C{Runtime CPU<br/>Feature Detection}
C -->|AVX512 Available| D[AVX512 Implementation<br/>*_avx512.cc<br/>-march=sapphirerapids]
C -->|AVX Available| E[AVX/AVX2 Implementation<br/>*_avx.cc / *_avx2.cc<br/>-march=core-avx2]
C -->|SSE/Baseline| F[SSE Implementation<br/>*_sse.cc<br/>-march=broadwell]
C -->|ARM NEON| G[NEON Implementation<br/>*_neon.cc<br/>-march=armv8-a]
style B fill:#ff9999
style C fill:#ff9999
H[CRITICAL BUG] -.-> B
H[CRITICAL BUG] -.-> C
I[Dispatch files compiled<br/>with AVX512 flags] -.-> H
J[Will crash on<br/>non-AVX512 CPUs] -.-> H
style H fill:#ff0000,color:#fff
style I fill:#ffcccc
style J fill:#ffcccc
Last reviewed commit: 4370619
Comment on lines +39 to +46
file(GLOB_RECURSE MATH_FILES_AVX512
    ${CMAKE_CURRENT_SOURCE_DIR}/math/*_dispatch.cc
    ${CMAKE_CURRENT_SOURCE_DIR}/math/*_dispatch.c
    ${CMAKE_CURRENT_SOURCE_DIR}/math/*_avx512.cc
    ${CMAKE_CURRENT_SOURCE_DIR}/math/*_avx512.c
    ${CMAKE_CURRENT_SOURCE_DIR}/math_batch/*_avx512.cc
    ${CMAKE_CURRENT_SOURCE_DIR}/math_batch/*_avx512.c
)
Dispatch files should NOT be compiled with AVX512 flags. They contain the runtime CPU detection logic, so they must be compiled with baseline flags (such as the default broadwell) to run safely on every supported CPU. As written, the compiler is free to emit AVX512 instructions in these dispatch files, which leads to illegal-instruction crashes on non-AVX512 CPUs.
Suggested change
- file(GLOB_RECURSE MATH_FILES_AVX512
-     ${CMAKE_CURRENT_SOURCE_DIR}/math/*_dispatch.cc
-     ${CMAKE_CURRENT_SOURCE_DIR}/math/*_dispatch.c
-     ${CMAKE_CURRENT_SOURCE_DIR}/math/*_avx512.cc
-     ${CMAKE_CURRENT_SOURCE_DIR}/math/*_avx512.c
-     ${CMAKE_CURRENT_SOURCE_DIR}/math_batch/*_avx512.cc
-     ${CMAKE_CURRENT_SOURCE_DIR}/math_batch/*_avx512.c
- )
+ file(GLOB_RECURSE MATH_FILES_AVX512
+     ${CMAKE_CURRENT_SOURCE_DIR}/math/*_avx512.cc
+     ${CMAKE_CURRENT_SOURCE_DIR}/math/*_avx512.c
+     ${CMAKE_CURRENT_SOURCE_DIR}/math_batch/*_avx512.cc
+     ${CMAKE_CURRENT_SOURCE_DIR}/math_batch/*_avx512.c
+ )
+ file(GLOB_RECURSE MATH_FILES_DISPATCH
+     ${CMAKE_CURRENT_SOURCE_DIR}/math/*_dispatch.cc
+     ${CMAKE_CURRENT_SOURCE_DIR}/math/*_dispatch.c
+ )
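To illustrate why the dispatch translation unit has to stay on baseline flags, here is a minimal C++ sketch of what a `*_dispatch.cc` file typically does. It is not taken from the PR: the names `dot`, `select_dot`, `dot_baseline`, `dot_avx2`, and `dot_avx512` are hypothetical, and the "vector" kernels are scalar stand-ins so the sketch builds anywhere. It assumes GCC or Clang on x86 for the `__builtin_cpu_supports` check and falls back to the baseline kernel elsewhere.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical kernels. In a layout like the one in this PR, each would live
// in its own file (*_sse.cc, *_avx2.cc, *_avx512.cc) compiled with its own
// -march flag. Here they are scalar stand-ins so the sketch builds anywhere.
static float dot_baseline(const float* a, const float* b, std::size_t n) {
  float sum = 0.0f;
  for (std::size_t i = 0; i < n; ++i) sum += a[i] * b[i];
  return sum;
}
static float dot_avx2(const float* a, const float* b, std::size_t n) {
  return dot_baseline(a, b, n);  // stand-in for an AVX2 kernel
}
static float dot_avx512(const float* a, const float* b, std::size_t n) {
  return dot_baseline(a, b, n);  // stand-in for an AVX512 kernel
}

using DotFn = float (*)(const float*, const float*, std::size_t);

// Runtime feature checks. __builtin_cpu_supports is a GCC/Clang builtin that
// is only available on x86, so other toolchains report "not supported".
static bool has_avx512f() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init();
  return __builtin_cpu_supports("avx512f");
#else
  return false;
#endif
}
static bool has_avx2() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init();
  return __builtin_cpu_supports("avx2");
#else
  return false;
#endif
}

// This function runs on every CPU, including ones without AVX512. If its
// translation unit were compiled with AVX512 flags, the compiler could emit
// AVX512 instructions here, before any feature check has a chance to help.
DotFn select_dot() {
  if (has_avx512f()) return dot_avx512;
  if (has_avx2()) return dot_avx2;
  return dot_baseline;
}

// Public entry point: the kernel is resolved once and then reused.
float dot(const float* a, const float* b, std::size_t n) {
  static const DotFn fn = select_dot();
  assert(fn != nullptr);
  return fn(a, b, n);
}
```

Only the kernels should depend on wider instruction sets. `select_dot` and `dot` are the code paths that every machine executes, which is why the suggestion moves the `*_dispatch` sources into their own file list.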
Comment on lines +63 to +69
foreach(MATH_FILE ${MATH_FILES_AVX512})
    set_source_files_properties(
        ${MATH_FILE}
        PROPERTIES
        COMPILE_FLAGS "${MATH_MARCH_FLAG_AVX512}"
    )
endforeach()
The dispatch files need a separate loop that applies baseline flags.
Suggested change
- foreach(MATH_FILE ${MATH_FILES_AVX512})
-     set_source_files_properties(
-         ${MATH_FILE}
-         PROPERTIES
-         COMPILE_FLAGS "${MATH_MARCH_FLAG_AVX512}"
-     )
- endforeach()
+ foreach(MATH_FILE ${MATH_FILES_AVX512})
+     set_source_files_properties(
+         ${MATH_FILE}
+         PROPERTIES
+         COMPILE_FLAGS "${MATH_MARCH_FLAG_AVX512}"
+     )
+ endforeach()
+ # Dispatch files should use baseline flags for runtime CPU detection
+ foreach(MATH_FILE ${MATH_FILES_DISPATCH})
+     set_source_files_properties(
+         ${MATH_FILE}
+         PROPERTIES
+         COMPILE_FLAGS "-march=broadwell"
+     )
+ endforeach()
march based reorganization