lizhenneng

When using GCC 9 or GCC 12 on the arm64 platform of Ubuntu 20.04, the command "gcc -mcpu=native -E -v -" fails to detect the correct CPU flags, which causes compilation failures for certain extended instructions. The correct CPU flags can be obtained by using gcc -march instead.

Signed-off-by: lizhenneng <lizhenneng@kylinos.cn>
@github-actions github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning) Sep 25, 2025
@taronaeo taronaeo linked an issue Sep 25, 2025 that may be closed by this pull request
@angt
Collaborator

angt commented Sep 25, 2025

Hi,

Could you share the output of your test?

From my understanding, -march=native and -mcpu=native should give the same result only for GCC ≥ 9, but you still need to explicitly check the -mcpu flag.

$ for d in arch cpu; do for r in arch cpu; do echo "use -m$d=native, read -m$r:" && gcc -m$d=native -E -v - 2>&1 </dev/null | grep -o "m$r=[^ ']*"; done; done
use -march=native, read -march:
use -march=native, read -mcpu:
mcpu=neoverse-v2+crc+sve2-aes+sve2-sha3+nossbs
mcpu=neoverse-v2+crc+sve2-aes+sve2-sha3+nossbs
mcpu=neoverse-v2+crc+sve2-aes+sve2-sha3+nossbs
use -mcpu=native, read -march:
use -mcpu=native, read -mcpu:
mcpu=neoverse-v2+crc+sve2-aes+sve2-sha3+nossbs
mcpu=neoverse-v2+crc+sve2-aes+sve2-sha3+nossbs
mcpu=neoverse-v2+crc+sve2-aes+sve2-sha3+nossbs

$ gcc --version
gcc (Ubuntu 14.2.0-19ubuntu2) 14.2.0
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
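
For readability, the one-liner above can be unrolled into two small shell functions. This is only a sketch: the names read_flag and probe_all are hypothetical, and it assumes gcc is on PATH and a POSIX-compatible shell.

```shell
# Filter: print every -m<name>= value found in GCC's verbose output on stdin.
# "read_flag" is a hypothetical helper name, not part of any tool.
read_flag() {
    grep -o "m$1=[^ ']*"
}

# Run the same four drive/read combinations as the one-liner above.
# Requires gcc on PATH; nothing runs until probe_all is called.
probe_all() {
    for d in arch cpu; do
        for r in arch cpu; do
            echo "use -m$d=native, read -m$r:"
            gcc "-m$d=native" -E -v - 2>&1 </dev/null | read_flag "$r"
        done
    done
}
```

A single combination can be checked with: gcc -mcpu=native -E -v - 2>&1 </dev/null | read_flag cpu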

@lizhenneng
Author

kylin@kylin-pc:~$ gcc -mcpu=native -E -v -
Using built-in specs.
COLLECT_GCC=/usr/bin/gcc_old
Target: aarch64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='openKylin 12.3.0-1ok3k0.1' --with-bugurl=file:///usr/share/doc/gcc-12/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-12 --program-prefix=aarch64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libquadmath --disable-libquadmath-support --enable-plugin --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --enable-fix-cortex-a53-843419 --disable-werror --enable-checking=release --build=aarch64-linux-gnu --host=aarch64-linux-gnu --target=aarch64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.3.0 (openKylin 12.3.0-1ok3k0.1)
COLLECT_GCC_OPTIONS= '-E' '-v' '-mlittle-endian' '-mabi=lp64' '-march=armv8-a+crypto+crc+lse+fp16+rcpc+rdma+dotprod+sha3+sm4'
/usr/lib/gcc/aarch64-linux-gnu/12/cc1 -E -quiet -v -imultiarch aarch64-linux-gnu - -mlittle-endian -mabi=lp64 -march=armv8-a+crypto+crc+lse+fp16+rcpc+rdma+dotprod+sha3+sm4 -fasynchronous-unwind-tables -dumpbase -
ignoring nonexistent directory "/usr/local/include/aarch64-linux-gnu"
ignoring nonexistent directory "/usr/lib/gcc/aarch64-linux-gnu/12/include-fixed"
ignoring nonexistent directory "/usr/lib/gcc/aarch64-linux-gnu/12/../../../../aarch64-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
/usr/lib/gcc/aarch64-linux-gnu/12/include
/usr/local/include
/usr/include/aarch64-linux-gnu
/usr/include
End of search list.
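
In the GCC 12.3.0 output above, -mcpu=native was expanded to a -march=armv8-a+... string and no -mcpu= value appears at all, so a reader that only greps for mcpu= finds nothing. One way to tolerate both behaviours is to prefer -mcpu= and fall back to -march=. The sketch below is an assumption for illustration only: native_flag and split_features are hypothetical names, and this is not necessarily how the build scripts handle it.

```shell
# Hypothetical helper: read GCC's verbose output on stdin and print the first
# -mcpu= value if one is present, otherwise the first -march= value.
native_flag() {
    out=$(cat)
    flag=$(printf '%s\n' "$out" | grep -o "mcpu=[^ ']*" | head -n 1)
    if [ -z "$flag" ]; then
        # Older GCC (as in the log above) reports only -march=...
        flag=$(printf '%s\n' "$out" | grep -o "march=[^ ']*" | head -n 1)
    fi
    [ -n "$flag" ] && printf '%s\n' "$flag"
}

# Hypothetical helper: split "march=armv8-a+crc+lse" into the base name
# followed by one feature per line.
split_features() {
    printf '%s\n' "${1#*=}" | tr '+' '\n'
}
```

Example: gcc -mcpu=native -E -v - 2>&1 </dev/null | native_flag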

@angt
Collaborator

angt commented Sep 29, 2025

Thanks for the help!
I've opened a PR (#16333) that should fix the build on your version of GCC while working for recent ones too.
Please let me know if it works for you!

Successfully merging this pull request may close these issues.

Compile bug: Failed to retrive the correct cpu flag on arm64