Adding support for Apple M1 AArch64 Architecture Processor. #150

Closed
wants to merge 3 commits into from

@sbehnke sbehnke commented Dec 23, 2020

This is an initial attempt at adding support for the new Apple Silicon M1 CPU to cpu_features. Based on my own laptop, I made a best guess at how the ARM features map from the sysctl hw.optional values to the AArch64Features.

hw.optional.floatingpoint: 1
hw.optional.watchpoint: 4
hw.optional.breakpoint: 6
hw.optional.neon: 1
hw.optional.neon_hpfp: 1
hw.optional.neon_fp16: 1
hw.optional.armv8_1_atomics: 1
hw.optional.armv8_crc32: 1
hw.optional.armv8_2_fhm: 1
hw.optional.armv8_2_sha512: 1
hw.optional.armv8_2_sha3: 1
hw.optional.amx_version: 2
hw.optional.ucnormal_mem: 1
hw.optional.arm64: 1
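
For readers following along, here is a minimal sketch of how the hw.optional keys above could be turned into feature flags. It is not the code from this PR: `SketchFeatures`, `DetectFeatures`, `SysctlEnabled`, and `ReplayedM1Lookup` are hypothetical names, and the pairing of `hw.optional.neon_fp16` with `fphp` is an assumption read off the key name. On macOS the lookup would go through `sysctlbyname`; the replayed lookup only echoes the values posted above so the mapping can be exercised on any host.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#if defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

// Hypothetical subset of the AArch64 feature flags, for illustration only.
typedef struct {
  bool fp, fphp, atomics, crc32, asimdfhm, sha512, sha3;
} SketchFeatures;

// A lookup returns true when the named key exists and reports non-zero.
typedef bool (*KeyLookup)(const char* name);

#if defined(__APPLE__)
// Live lookup on macOS. Assumes the hw.optional.* keys fit in an int.
bool SysctlEnabled(const char* name) {
  int value = 0;
  size_t size = sizeof(value);
  if (sysctlbyname(name, &value, &size, NULL, 0) != 0) return false;
  return value != 0;
}
#endif

// Stand-in lookup that replays the non-zero keys from the PR description.
bool ReplayedM1Lookup(const char* name) {
  static const char* const kEnabled[] = {
      "hw.optional.floatingpoint",   "hw.optional.neon",
      "hw.optional.neon_hpfp",       "hw.optional.neon_fp16",
      "hw.optional.armv8_1_atomics", "hw.optional.armv8_crc32",
      "hw.optional.armv8_2_fhm",     "hw.optional.armv8_2_sha512",
      "hw.optional.armv8_2_sha3",
  };
  for (size_t i = 0; i < sizeof(kEnabled) / sizeof(kEnabled[0]); ++i) {
    if (strcmp(name, kEnabled[i]) == 0) return true;
  }
  return false;
}

// The key-to-flag pairing below is an assumption, not confirmed by the PR.
SketchFeatures DetectFeatures(KeyLookup enabled) {
  SketchFeatures f;
  f.fp = enabled("hw.optional.floatingpoint");
  f.fphp = enabled("hw.optional.neon_fp16");
  f.atomics = enabled("hw.optional.armv8_1_atomics");
  f.crc32 = enabled("hw.optional.armv8_crc32");
  f.asimdfhm = enabled("hw.optional.armv8_2_fhm");
  f.sha512 = enabled("hw.optional.armv8_2_sha512");
  f.sha3 = enabled("hw.optional.armv8_2_sha3");
  return f;
}
```

On a Mac, `DetectFeatures(SysctlEnabled)` would query the live values, while `DetectFeatures(ReplayedM1Lookup)` only replays the list above.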

I also tried to map the variant, revision, part and implementer fields from the following sysctl values. I'm not sure I got them right, but it is a good first stab at it.

hw.cputype: 16777228
hw.cpusubtype: 2
hw.cpu64bit_capable: 1
hw.cpufamily: 458787763
hw.cpusubfamily: 2

The output I get is:

./list_cpu_features
arch            : aarch64
implementer     : 16777228 (0x100000C)
variant         :   2 (0x02)
part            : 458787763 (0x1B588BB3)
revision        :   2 (0x02)
flags           : asimdfhm,atomics,crc32,fp,fphp,sha3,sha512
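
A note on the numbers above, with a small sketch. As far as I can tell, `hw.cputype` is a Mach-O CPU type and not an ARM implementer code, so 16777228 (0x0100000C) decodes as the 64-bit ABI flag OR'ed with the ARM CPU type. The constants below mirror `CPU_ARCH_MASK`, `CPU_ARCH_ABI64`, and `CPU_TYPE_ARM` from `<mach/machine.h>`, redefined under hypothetical `SKETCH_` names so the snippet compiles on any host. `IsArm64CpuType` is likewise a made-up helper.

```c
#include <stdbool.h>
#include <stdint.h>

// Mirrors of the <mach/machine.h> values, renamed to avoid clashing with
// the real macros when compiled on macOS.
#define SKETCH_CPU_ARCH_MASK 0xff000000u
#define SKETCH_CPU_ARCH_ABI64 0x01000000u
#define SKETCH_CPU_TYPE_ARM 12u

// True when the value has the 64-bit ABI bit set and the base type is ARM.
bool IsArm64CpuType(uint32_t cputype) {
  const bool is_64bit = (cputype & SKETCH_CPU_ARCH_ABI64) != 0;
  const uint32_t base_type = cputype & ~SKETCH_CPU_ARCH_MASK;
  return is_64bit && base_type == SKETCH_CPU_TYPE_ARM;
}
```

With this reading, the `implementer` line in the output is really reporting the Mach CPU type, which is one reason the mapping may need revisiting.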

This pull request is an effort to address:
#121
gnuradio/volk#428

google-cla bot commented Dec 23, 2020

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


sbehnke commented Dec 23, 2020

@googlebot I signed it!

@@ -146,7 +156,7 @@ set_property(TARGET cpu_features PROPERTY POSITION_INDEPENDENT_CODE ${BUILD_PIC}
 target_include_directories(cpu_features
 PUBLIC $<INSTALL_INTERFACE:${CMAKE_INSTALL_INCLUDEDIR}/cpu_features>
 )
-if(PROCESSOR_IS_X86)
+if(PROCESSOR_IS_X86 OR PROCESSOR_IS_AARCH64)

I know you're following the original code here, but it would make more sense to instead do just a single if here since the clauses are just about APPLE, e.g.:

if(APPLE AND (PROCESSOR_IS_X86 OR PROCESSOR_IS_AARCH64))

Author

Good point. I'm not great with CMake yet. I'm sure there will be more changes, but it currently passes the build checks, so I'll probably make the change locally and hold off on submitting it until we get more feedback.

My comment here is still valid

d235j commented Jan 1, 2021

Unfortunately it looks like Apple doesn't expose everything — running list_cpu_features on an M1 Mac in Linux via Docker shows the following:

root@a7cd11a3d26d:/src/cpu_features/build# ./list_cpu_features 
arch            : aarch64
implementer     :   0 (0x00)
variant         :   0 (0x00)
part            :   0 (0x00)
revision        :   0 (0x00)
flags           : aes,asimd,asimddp,asimdfhm,asimdhp,asimdrdm,atomics,cpuid,crc32,dcpop,dit,evtstrm,fcma,flagm,fp,fphp,ilrcpc,jscvt,lrcpc,pmull,sha1,sha2,sha3,sha512,ssbs,uscat

In Parallels virtualization of Ubuntu 20.04 it shows

# ./list_cpu_features 
arch            : aarch64
implementer     :   65 (0x41)
variant         :   0 (0x00)
part            :   0 (0x00)
revision        :   0 (0x00)
flags           : aes,asimd,asimddp,asimdfhm,asimdhp,asimdrdm,atomics,cpuid,crc32,dcpodp,dcpop,dit,evtstrm,fcma,flagm,flagm2,fp,fphp,frint,ilrcpc,jscvt,lrcpc,paca,pacg,pmull,sb,sha1,sha2,sha3,sha512,ssbs,uscat

It might be necessary to store tables of capabilities that match on the hw.cpufamily sysctl instead, pending an improved API from Apple.

LLVM has such a table here (currently covering A7 through A13):
https://github.com/apple/llvm-project/blob/apple/main/llvm/lib/Target/AArch64/AArch64.td#L757

hjmallon commented Jan 7, 2021

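| Feature | A7 | A10 | A11 | A12 | A13 |
| --- | --- | --- | --- | --- | --- |
| SExtLoadCVTF32Pattern | Y | Y | Y | Y | Y |
| BccFusion | Y | Y | Y | Y | Y |
| CbzFusion | Y | Y | Y | Y | Y |
| Crypto | Y | Y | Y | Y | Y |
| DisableLatencySchedHeuristic | Y | Y | Y | Y | Y |
| FPARMv8 | Y | Y | Y | Y | Y |
| FuseAES | Y | Y | Y | Y | Y |
| FuseCryptoEOR | Y | Y | Y | Y | Y |
| NEON | Y | Y | Y | Y | Y |
| PerfMon | Y | Y | Y | Y | Y |
| ZCRegMove | Y | Y | Y | Y | Y |
| ZCZeroing | Y | Y | Y | Y | Y |
| ZCZeroingFPWorkaround | Y | N | N | N | N |
| CRC | N | Y | Y | Y | Y |
| LSE | N | N | Y | Y | Y |
| RDM | N | Y | Y | Y | Y |
| PAN | N | Y | Y | Y | Y |
| LOR | N | Y | Y | Y | Y |
| VH | N | Y | Y | Y | Y |
| PsUA0 | N | N | Y | Y | Y |
| PAN_RWV | N | N | Y | Y | Y |
| RAS | N | N | Y | Y | Y |
| CCPP | N | N | Y | Y | Y |
| RCPC | N | N | N | Y | Y |
| PA | N | N | N | Y | Y |
| JS | N | N | N | Y | Y |
| CCIDX | N | N | N | Y | Y |
| ComplxNum | N | N | N | Y | Y |
| DotProd | N | N | N | N | Y |
| NV | N | N | N | N | Y |
| MPAM | N | N | N | N | Y |
| DIT | N | N | N | N | Y |
| TRACEV8_4 | N | N | N | N | Y |
| AM | N | N | N | N | Y |
| SEL2 | N | N | N | N | Y |
| PMU | N | N | N | N | Y |
| TLB_RMI | N | N | N | N | Y |
| FMI | N | N | N | N | Y |
| RPCP_IMMO | N | N | N | N | Y |
| SHA3 | N | N | N | N | Y |
| FullFP16 | N | N | Y | Y | Y |
| FP16FML | N | N | N | N | Y |
| | | | | | |
| HasV8_2aOps | N | N | Y | N | N |
| HasV8_3aOps | N | N | N | Y | N |
| HasV8_4aOps | N | N | N | N | Y |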

This is my digested version of the table from LLVM. HasV8_2aOps means the core supports all of ARMv8.2-A, so I have filled in the rows based on those too. I'm not sure how to map these names to the feature names used here, though. M1 is based on A14, which may differ further from A13 in ways that are not in upstream LLVM yet.
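
To make the table-lookup idea concrete, here is a rough sketch of a static capability table using a handful of rows from the digest above. Everything here is hypothetical naming (`ChipFeatures`, `kChipTable`, `FindChip`). A real version would presumably key on the `hw.cpufamily` value rather than a chip name, but I have used names to avoid guessing at constants that are not in this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// One row per chip generation; columns are a subset of the digest above.
typedef struct {
  const char* chip;
  bool crc, lse, rdm, rcpc, dotprod, sha3, fullfp16, fp16fml;
} ChipFeatures;

static const ChipFeatures kChipTable[] = {
    // chip   CRC    LSE    RDM    RCPC   DotProd SHA3   FullFP16 FP16FML
    {"A7",  false, false, false, false, false,  false, false,   false},
    {"A10", true,  false, true,  false, false,  false, false,   false},
    {"A11", true,  true,  true,  false, false,  false, true,    false},
    {"A12", true,  true,  true,  true,  false,  false, true,    false},
    {"A13", true,  true,  true,  true,  true,   true,  true,    true},
};

// Returns the matching row, or NULL when the chip is not in the table
// (for example A14/M1, which the digest does not cover).
const ChipFeatures* FindChip(const char* chip) {
  for (size_t i = 0; i < sizeof(kChipTable) / sizeof(kChipTable[0]); ++i) {
    if (strcmp(kChipTable[i].chip, chip) == 0) return &kChipTable[i];
  }
  return NULL;
}
```

A caller would have to decide what to report for an unknown chip; returning NULL here leaves that choice explicit instead of silently claiming no features.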

set(PROCESSOR_IS_X86 TRUE)
elseif(CMAKE_SYSTEM_PROCESSOR MATCHES "^(powerpc|ppc)")
set(PROCESSOR_IS_POWER TRUE)
endif()
endif()

macro(add_cpu_features_headers_and_sources HDRS_LIST_NAME SRCS_LIST_NAME)

I've been editing things to get this building on M1 (actually Universal), and I had to change the way files are included. As it stands, generating a project picks one architecture's set of files to include, so a universal build obviously lacks the files required for the other architecture. I changed it locally so that all the files are always included, and then added a per-platform #if in the various platform files.

Oh, and there are also issues with including the libraries because it decides based on the host architecture for some of it (unix_based_hardware_detection) so it can fail to build in those cases too.

Yeah, CMake universal builds are a bit of a pain. They are a single build that produces universal targets, so anything decided at configure time is probably incorrect. I found that you basically have to move all this stuff to compile time with ifdefs.
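
As a rough illustration of the "move it to compile time" approach, the sketch below is a single source file that is valid in every architecture slice, with the compiler's predefined macros choosing the branch per slice. `SKETCH_ARCH_NAME` and `SketchArchName` are hypothetical names, not identifiers from cpu_features.

```c
// Each slice of a universal build compiles this file separately, so the
// predefined macros pick the right branch for that slice.
#if defined(__x86_64__) || defined(__i386__)
#define SKETCH_ARCH_NAME "x86"
#elif defined(__aarch64__) || (defined(__APPLE__) && defined(__arm64__))
#define SKETCH_ARCH_NAME "aarch64"
#elif defined(__arm__)
#define SKETCH_ARCH_NAME "arm"
#else
#define SKETCH_ARCH_NAME "unknown"
#endif

// Reports which branch was selected when this translation unit was built.
const char* SketchArchName(void) { return SKETCH_ARCH_NAME; }
```

The trade-off is the one discussed here: every file is always part of the project, and the guards rather than the CMake configure step decide what actually gets compiled.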

Collaborator

@Mizux Mizux Jan 14, 2021

My 2 cents:
CMAKE_SYSTEM_PROCESSOR describes the target system and is usually set by a CMake cross-compilation toolchain file; for the host there is CMAKE_HOST_SYSTEM_PROCESSOR...
ref: https://cmake.org/cmake/help/latest/variable/CMAKE_SYSTEM_PROCESSOR.html

For universal builds, I don't know whether we could or should introduce a "universal" CMake system processor and propagate it accordingly, rather than including everything and using the preprocessor everywhere...

@ghost ghost Jan 14, 2021

The way to build for universal currently means invoking xcodebuild with dual architecture. This is well after cmake has decided which files to include. I know the CMake guys have been doing some work on all this, but it's still early days. Think we just have to suck it up for now.

Besides, I've always found writing cross-platform code is far easier when it's all visible in a project.

Collaborator

@Mizux Mizux Jan 14, 2021

> The way to build for universal currently means invoking xcodebuild with dual architecture. This is well after cmake has decided which files to include.

I mean, instead of adding all files we could simply add the files supported by their "universal 2" build. I'm not sure Apple supports all of (mips, ppc, arm, arm64 and x86)...
EDIT: it seems there is universal (i386, ppc) and universal2 (x86_64, arm64)
ref: https://developer.apple.com/documentation/xcode/building_a_universal_macos_binary

> I know the CMake guys have been doing some work on all this, but it's still early days. Think we just have to suck it up for now.

I think you must be talking about CMAKE_OSX_ARCHITECTURES

Yeah, I've tried using that to select a specific architecture and that's OK, but it does not quite work yet, i.e. you can't pass it x86_64;arm64 because it complains. Hence the xcodebuild route.

@Mizux Mizux added the cmake (CMake related issue) and enhancement (New feature or request) labels Jan 14, 2021
nazgu1 commented Jun 15, 2021

Apple A14 has now been added to LLVM, if it helps in any way: https://github.com/apple/llvm-project/blob/apple/main/llvm/lib/Target/AArch64/AArch64.td#L757

@machinaut

@IanMDay is the universal2 route still the way to go here?

If that's not working, could this patchset be used as a basis for a non-universal2 mac build?

ghost commented Jul 22, 2021

> @IanMDay is the universal2 route still the way to go here?
>
> If that's not working, could this patchset be used as a basis for a non-universal2 mac build?

I've not revisited it since then, sorry.

michaelld added a commit to macports/macports-ports that referenced this pull request Jul 22, 2021

@michaelld michaelld left a comment

There are changes that would make this PR more concise, but I do think it works as-is. I've added a patch to the uhd-devel port in MacPorts that is based on this PR, to get Volk working on Apple ARM64 / M1 in at least a basic sense. This PR is a good starting point if nothing else.

I am not addressing the Universal2 issue being discussed. That's a separate issue to me. I just want support for the single new ARM64 architecture.

set(HOST_ARCHITECTURE "${arch}")
string(TOLOWER ${HOST_ARCHITECTURE} HOST_ARCHITECTURE)
endif()

This is not the correct location for this code, and at least with CMake 3.21 it isn't necessary for Apple ARM64 support either: CMAKE_SYSTEM_PROCESSOR is already set to arm64, which is what you're computing here. I don't know about earlier CMake versions, except that CMAKE_SYSTEM_PROCESSOR has been around for a long time and seems to represent what you're trying to do here.

Collaborator

For the record, CMake added M1 support in 3.19 according to https://cmake.org/cmake/help/latest/release/3.19.html#platforms

While that may be true, still with CMake 3.22.0, I get this build error on an M1 MacBook Pro:

npm ERR! gyp info using node-gyp@8.4.1
npm ERR! gyp info using node@17.2.0 | darwin | arm64
npm ERR! gyp info find Python using Python version 3.9.9 found at "/opt/homebrew/opt/python@3.9/bin/python3.9"
npm ERR! gyp info spawn /opt/homebrew/opt/python@3.9/bin/python3.9
npm ERR! gyp info spawn args [
npm ERR! gyp info spawn args   '/Users/jwatte/source/repos/cloud-instance/node_modules/node-gyp/gyp/gyp_main.py',
npm ERR! gyp info spawn args   'binding.gyp',
npm ERR! gyp info spawn args   '-f',
npm ERR! gyp info spawn args   'make',
npm ERR! gyp info spawn args   '-I',
npm ERR! gyp info spawn args   '/Users/jwatte/source/repos/cloud-instance/node_modules/cpu-features/build/config.gypi',
npm ERR! gyp info spawn args   '-I',
npm ERR! gyp info spawn args   '/Users/jwatte/source/repos/cloud-instance/node_modules/node-gyp/addon.gypi',
npm ERR! gyp info spawn args   '-I',
npm ERR! gyp info spawn args   '/Users/jwatte/Library/Caches/node-gyp/17.2.0/include/node/common.gypi',
npm ERR! gyp info spawn args   '-Dlibrary=shared_library',
npm ERR! gyp info spawn args   '-Dvisibility=default',
npm ERR! gyp info spawn args   '-Dnode_root_dir=/Users/jwatte/Library/Caches/node-gyp/17.2.0',
npm ERR! gyp info spawn args   '-Dnode_gyp_dir=/Users/jwatte/source/repos/cloud-instance/node_modules/node-gyp',
npm ERR! gyp info spawn args   '-Dnode_lib_file=/Users/jwatte/Library/Caches/node-gyp/17.2.0/<(target_arch)/node.lib',
npm ERR! gyp info spawn args   '-Dmodule_root_dir=/Users/jwatte/source/repos/cloud-instance/node_modules/cpu-features',
npm ERR! gyp info spawn args   '-Dnode_engine=v8',
npm ERR! gyp info spawn args   '--depth=.',
npm ERR! gyp info spawn args   '--no-parallel',
npm ERR! gyp info spawn args   '--generator-output',
npm ERR! gyp info spawn args   'build',
npm ERR! gyp info spawn args   '-Goutput_dir=.'
npm ERR! gyp info spawn args ]
npm ERR! gyp info spawn make
npm ERR! gyp info spawn args [ 'BUILDTYPE=Release', '-C', 'build' ]
npm ERR! In file included from /Users/jwatte/source/repos/cloud-instance/node_modules/cpu-features/deps/cpu_features/src/cpuinfo_arm.c:15:
npm ERR! /Users/jwatte/source/repos/cloud-instance/node_modules/cpu-features/deps/cpu_features/include/cpuinfo_arm.h:118:2: error: "Including cpuinfo_arm.h from a non-arm target."
npm ERR! #error "Including cpuinfo_arm.h from a non-arm target."
npm ERR!  ^
npm ERR! 1 error generated.
npm ERR! make[3]: *** [CMakeFiles/cpu_features.dir/src/cpuinfo_arm.c.o] Error 1

elseif(CMAKE_SYSTEM_PROCESSOR MATCHES "^arm")
set(PROCESSOR_IS_ARM TRUE)
elseif(CMAKE_SYSTEM_PROCESSOR MATCHES "^aarch64")
if(HOST_ARCHITECTURE MATCHES "^arm64")

Removing the above code, this chunk can be reverted back to original ... with 2 changes as noted below

@@ -146,7 +146,7 @@ flags : aes,avx,cx16,smx,sse4_1,sse4_2,ssse3
 | Android | yes² | yes¹ | yes¹ | yes¹ | N/A |
 | iOS | N/A | not yet | not yet | N/A | N/A |
 | Linux | yes² | yes¹ | yes¹ | yes¹ | yes¹ |
-| MacOs | yes² | N/A | not yet | N/A | no |
+| MacOs | yes² | N/A | yes² | N/A | no |

Since we're modifying this line, let's use macOS instead of MacOs.

@@ -39,6 +39,10 @@
#define CPU_FEATURES_ARCH_ARM
#endif

#if (defined(__APPLE__) && defined(__arm64__))

I for one would prefer to join this into the next statement, as

#if (defined(__aarch64__) || (defined(__APPLE__) && defined(__arm64__)))

@@ -22,6 +22,35 @@
#include "internal/stack_line_reader.h"
#include "internal/string_view.h"

// The following includes are necessary to provide SSE detections on pre-AVX
// microarchitectures.
#if defined(CPU_FEATURES_OS_DARWIN)

I'd prefer to join this #if (for CPU_FEATURES_OS_DARWIN) and the next one for CPU_FEATURES_OS_DARWIN into a single block. Yes, there are places elsewhere in the code where these are separate, but there is other code between them. Here, it's simple to join them.

else()
if(CMAKE_SYSTEM_PROCESSOR MATCHES "^mips")
set(PROCESSOR_IS_MIPS TRUE)
elseif(CMAKE_SYSTEM_PROCESSOR MATCHES "^arm")

swap this and the next elseif entry

set(PROCESSOR_IS_MIPS TRUE)
elseif(CMAKE_SYSTEM_PROCESSOR MATCHES "^arm")
set(PROCESSOR_IS_ARM TRUE)
elseif(CMAKE_SYSTEM_PROCESSOR MATCHES "^aarch64")

change this to (^aarch64)|(arm64)
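
To show why both the swap and the pattern change matter, here is a small sketch that replays the `MATCHES` chain with POSIX extended regular expressions. CMake uses its own regex engine, so this is only an approximation, but for patterns this simple I would expect the same results. `ProcKind`, `Matches`, and `Classify` are hypothetical names. The point is that `^arm` also matches `arm64`, so the AArch64 test has to run first.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { PROC_UNKNOWN, PROC_MIPS, PROC_AARCH64, PROC_ARM } ProcKind;

// Returns true when the POSIX extended regex `pattern` matches `text`.
bool Matches(const char* pattern, const char* text) {
  regex_t re;
  if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) return false;
  const bool matched = regexec(&re, text, 0, NULL, 0) == 0;
  regfree(&re);
  return matched;
}

// Mirrors the suggested order: AArch64 is tested before the generic ARM
// pattern, because "^arm" would otherwise claim "arm64" first.
ProcKind Classify(const char* processor) {
  if (Matches("^mips", processor)) return PROC_MIPS;
  if (Matches("(^aarch64)|(arm64)", processor)) return PROC_AARCH64;
  if (Matches("^arm", processor)) return PROC_ARM;
  return PROC_UNKNOWN;
}
```

With the original order, `arm64` would be classified as 32-bit ARM even after the pattern is widened, which is why the two suggestions go together.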

@@ -146,7 +156,7 @@ set_property(TARGET cpu_features PROPERTY POSITION_INDEPENDENT_CODE ${BUILD_PIC}
 target_include_directories(cpu_features
 PUBLIC $<INSTALL_INTERFACE:${CMAKE_INSTALL_INCLUDEDIR}/cpu_features>
 )
-if(PROCESSOR_IS_X86)
+if(PROCESSOR_IS_X86 OR PROCESSOR_IS_AARCH64)

My comment here is still valid

carlesfernandez added a commit to carlesfernandez/gnss-sdr that referenced this pull request Jul 25, 2021
@nodesocket

Can we get this merged in please? With the new 14" and 16" MBPs there is a lot more Apple Silicon floating around (including my M1 Max). I am having an issue in a child Node.js npm module that is based off this project. See mscdex/cpu-features#6

@toor1245
Contributor

Hi @nodesocket, a new cpu_features source layout was recently added in #194, so the merge conflicts here need to be fixed first. In addition, I opened a new pull request, #186, that should retrieve CPU information regardless of the OS.

arkivm added a commit to arkivm/cpu_features that referenced this pull request Nov 18, 2021
Completely based on google#150. Thanks to @sbehnke!
+ Refactoring to accomodate the new source tree
+ Adding more feature flags
arkivm added a commit to arkivm/cpu_features that referenced this pull request Nov 19, 2021
Completely based on google#150. Thanks to @sbehnke!
+ Refactoring to accomodate the new source tree
+ Adding more feature flags
toor1245 commented Feb 2, 2022

@sbehnke, could you close this PR if there are no objections?

@sbehnke sbehnke closed this Feb 2, 2022
@gchatelet gchatelet added this to the v0.7.0 milestone Mar 8, 2022
toor1245 pushed a commit to toor1245/cpu_features that referenced this pull request Aug 18, 2022
To add AArch64 support for other operating systems such as macOS (Apple M1), iOS, FreeBSD, and Windows, common logic has been moved from
`src/impl_aarch64_linux_or_android.c` to `src/impl_aarch64__base_implementation.inl`, namely:
* Definitions for introspection
* `Aarch64Info` kEmptyAarch64Info field

Removed the include of "internal/bit_utils.h" from `src/impl_aarch64_linux_or_android.c`, since it was not used. Also, the `cpuinfo_aarch64` include has been removed from the Linux implementation and replaced with `impl_aarch64__base_implementation.inl`, which will be used by all other operating system implementations as well

Added a compilation check that matches the base X86 implementation

Refs: google#121
See also: google#150, google#186, google#204
arkivm added a commit to arkivm/cpu_features that referenced this pull request Aug 27, 2023
Completely based on google#150. Thanks to @sbehnke!
+ Refactoring to accomodate the new source tree
+ Adding more feature flags
arkivm added a commit to arkivm/cpu_features that referenced this pull request Aug 28, 2023
Completely based on google#150. Thanks to @sbehnke!
+ Refactoring to accomodate the new source tree
+ Adding more feature flags
gchatelet added a commit that referenced this pull request Aug 28, 2023
* Add support for Apple M1 AArch64 SoCs

Completely based on #150. Thanks to @sbehnke!
+ Refactoring to accomodate the new source tree
+ Adding more feature flags

* revert minimum version to 3.0

* Update introspection table

* Simplify logic for Apple HAVE_SYSCTLBYNAME

---------

Co-authored-by: Guillaume Chatelet <gchatelet@google.com>