
Conversation

angt (Collaborator) commented Sep 29, 2025

This is related to PR #16239.

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
kevinzs2048 commented Sep 29, 2025

Thanks for working on this. I can still hit this issue on my Debian 12 VM on a Mac M4 Pro after applying your patch, with GCC 12.2:

[ 16%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/arch/arm/repack.cpp.o
In file included from /root/llama.cpp/ggml/src/./ggml-impl.h:24,
                 from /root/llama.cpp/ggml/src/ggml-cpu/vec.h:5,
                 from /root/llama.cpp/ggml/src/ggml-cpu/vec.cpp:1:
/usr/lib/gcc/aarch64-linux-gnu/12/include/arm_neon.h: In function ‘void ggml_vec_dot_f16(int, float*, size_t, ggml_fp16_t*, size_t, ggml_fp16_t*, size_t, int)’:
/usr/lib/gcc/aarch64-linux-gnu/12/include/arm_neon.h:29182:1: error: inlining failed in call to ‘always_inline’ ‘float16x8_t vfmaq_f16(float16x8_t, float16x8_t, float16x8_t)’: target specific option mismatch
29182 | vfmaq_f16 (float16x8_t __a, float16x8_t __b, float16x8_t __c)
      | ^~~~~~~~~
In file included from /root/llama.cpp/ggml/src/ggml-cpu/vec.h:6:
/root/llama.cpp/ggml/src/ggml-cpu/simd-mappings.h:365:46: note: called from here
  365 |     #define GGML_F16x8_FMA(a, b, c) vfmaq_f16(a, b, c)
      |                                     ~~~~~~~~~^~~~~~~~~
/root/llama.cpp/ggml/src/ggml-cpu/simd-mappings.h:392:41: note: in expansion of macro ‘GGML_F16x8_FMA’
  392 |     #define GGML_F16_VEC_FMA            GGML_F16x8_FMA
      |                                         ^~~~~~~~~~~~~~
/root/llama.cpp/ggml/src/ggml-cpu/vec.cpp:316:26: note: in expansion of macro ‘GGML_F16_VEC_FMA’
  316 |                 sum[j] = GGML_F16_VEC_FMA(sum[j], ax[j], ay[j]);
      |                          ^~~~~~~~~~~~~~~~
/usr/lib/gcc/aarch64-linux-gnu/12/include/arm_neon.h:29182:1: error: inlining failed in call to ‘always_inline’ ‘float16x8_t vfmaq_f16(float16x8_t, float16x8_t, float16x8_t)’: target specific option mismatch
29182 | vfmaq_f16 (float16x8_t __a, float16x8_t __b, float16x8_t __c)
      | ^~~~~~~~~

The CMake configure output:

root@llm-test:~/llama.cpp# cmake -B build
-- The C compiler identification is GNU 12.2.0
-- The CXX compiler identification is GNU 12.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMAKE_BUILD_TYPE=Release
-- Found Git: /usr/bin/git (found version "2.39.5")
-- The ASM compiler identification is GNU
-- Found assembler: /usr/bin/cc
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
-- CMAKE_SYSTEM_PROCESSOR: aarch64
-- GGML_SYSTEM_ARCH: ARM
-- Including CPU backend
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- ARM detected
-- Performing Test GGML_COMPILER_SUPPORTS_FP16_FORMAT_I3E
-- Performing Test GGML_COMPILER_SUPPORTS_FP16_FORMAT_I3E - Failed
-- ARM detected flags: -march=armv8-a+crypto+crc+lse+rcpc+rdma+dotprod+fp16fml+sb+sve2+flagm+pauth
-- Performing Test GGML_MACHINE_SUPPORTS_dotprod
-- Performing Test GGML_MACHINE_SUPPORTS_dotprod - Failed
-- Performing Test GGML_MACHINE_SUPPORTS_nodotprod
-- Performing Test GGML_MACHINE_SUPPORTS_nodotprod - Success
-- Performing Test GGML_MACHINE_SUPPORTS_i8mm
-- Performing Test GGML_MACHINE_SUPPORTS_i8mm - Failed
-- Performing Test GGML_MACHINE_SUPPORTS_noi8mm
-- Performing Test GGML_MACHINE_SUPPORTS_noi8mm - Success
-- Performing Test GGML_MACHINE_SUPPORTS_sve
-- Performing Test GGML_MACHINE_SUPPORTS_sve - Failed
-- Performing Test GGML_MACHINE_SUPPORTS_nosve
-- Performing Test GGML_MACHINE_SUPPORTS_nosve - Success
-- Performing Test GGML_MACHINE_SUPPORTS_sme
-- Performing Test GGML_MACHINE_SUPPORTS_sme - Failed
-- Performing Test GGML_MACHINE_SUPPORTS_nosme
-- Performing Test GGML_MACHINE_SUPPORTS_nosme - Failed
-- ARM feature FMA enabled
-- ARM feature FP16_VECTOR_ARITHMETIC enabled
-- Adding CPU backend variant ggml-cpu: -march=armv8-a+crypto+crc+lse+rcpc+rdma+dotprod+fp16fml+sb+sve2+flagm+pauth+nodotprod+noi8mm+nosve
-- ggml version: 0.9.0-dev
-- ggml commit:  d12f6df1
-- Found CURL: /usr/lib/aarch64-linux-gnu/libcurl.so (found version "7.88.1")
-- Configuring done
-- Generating done
-- Build files have been written to: /root/llama.cpp/build
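For reference, the failure can be reproduced outside llama.cpp with a minimal snippet (my sketch, not from the PR; assuming GCC 12 on aarch64). vfmaq_f16 is declared always_inline in arm_neon.h, so GCC rejects the call unless the translation unit is built with fp16 vector arithmetic enabled:

/* fp16_repro.c -- hypothetical reproducer, not part of this PR.
 *
 *   gcc -march=armv8-a      -c fp16_repro.c   # "target specific option mismatch"
 *   gcc -march=armv8-a+fp16 -c fp16_repro.c   # builds fine
 */
#include <arm_neon.h>

float16x8_t fma_f16(float16x8_t a, float16x8_t b, float16x8_t c) {
    return vfmaq_f16(a, b, c);   /* always_inline; needs +fp16 */
}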

angt (Collaborator, Author) commented Sep 29, 2025

At least the code works as expected 😅

But yes, something is really wrong here too:

-- ARM detected flags: -march=armv8-a+crypto+crc+lse+rcpc+rdma+dotprod+fp16fml+sb+sve2+flagm+pauth
-- Performing Test GGML_MACHINE_SUPPORTS_dotprod
-- Performing Test GGML_MACHINE_SUPPORTS_dotprod - Failed
-- Performing Test GGML_MACHINE_SUPPORTS_nodotprod
-- Performing Test GGML_MACHINE_SUPPORTS_nodotprod - Success
...
-- Adding CPU backend variant ggml-cpu: -march=armv8-a+crypto+crc+lse+rcpc+rdma+dotprod+fp16fml+sb+sve2+flagm+pauth+nodotprod+noi8mm+nosve

I don't think this is related to this PR; it looks like a separate issue to me.
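Note that the final flag string both enables and disables the same features (+dotprod ... +nodotprod, +sve2 ... +nosve), which suggests the machine-support probes are misbehaving on this setup. You can run the equivalent of the GGML_MACHINE_SUPPORTS_dotprod probe by hand (a sketch, assuming GCC inside the VM; dotprod_check.c is a hypothetical name):

/* dotprod_check.c -- manual version of a compile-and-run probe:
 * build with the feature enabled, then run to see if the CPU traps.
 *
 *   gcc -march=armv8-a+dotprod dotprod_check.c -o dotprod_check
 *   ./dotprod_check && echo "dotprod OK"   # SIGILL if unsupported
 */
#include <arm_neon.h>

int main(void) {
    int32x4_t acc = vdupq_n_s32(0);
    int8x16_t a   = vdupq_n_s8(1);
    int8x16_t b   = vdupq_n_s8(2);
    acc = vdotq_s32(acc, a, b);               /* needs +dotprod */
    return vgetq_lane_s32(acc, 0) == 8 ? 0 : 1;
}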

Could you share the output of these commands:

curl -L -o feat https://github.com/angt/target-features/releases/download/v5/aarch64-macos-target-features
chmod +x feat
./feat

and

sysctl -a | grep FEAT

on the M4, and this inside the VM:

curl -L -o feat https://github.com/angt/target-features/releases/download/v5/aarch64-linux-target-features
chmod +x feat
./feat

github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning) on Sep 29, 2025