Skip to content

Conversation

yegroup001
Copy link

This update expands ARM architecture support by adding NEON and SVE instructions tests, and resolves problems in LLC perf counter configuration and ISA detection. Validation was performed on both x86 (Intel Core i5-8365U/Debian Trixie with AVX2) and ARM platforms (OrangePi AI Pro/Ubuntu 22.04 with SVE), confirming functional correctness across architectures.

@jiegec
Copy link
Owner

jiegec commented Mar 18, 2025

Nice work, comments below.

// hisilicon
tsv110,

tsv200m,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any references to this model? If not, I'd better not include this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You know hisilicon does not release any offical specs, so I'm not very sure. But it refers the micro-architecture on my OrangePi AI Pro.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand, but please avoid non-public architecture names.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could I use the SKU to refer to the architecture? If not set, there might be some problems for users using this hisilicon chip.

message('Got CXXFLAGS:', cpp_args)
endif

if cpu == 'x86_64'
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we already print CXXFLAGS above?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh it's for ISA detection.

cpp_args: ['-DAVX2', '-mavx2'],
link_with: utils,
install: true)
if avx2_support
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can just build these binaries without checking if the cpu actually supports it? It allows us to build them on one machine, and run them on another.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's all right. I'm just afraid that it may cause confuse.

extern void elimination(FILE *fp);
int main(int argc, char *argv[]) {
FILE *fp = fopen("elimination.csv", "w");
FILE *fp = fopen("../run_results/elimination.csv", "w");
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does this path come from?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry it's my fault not fixing it before my PR...

@@ -0,0 +1,314 @@
#include "include/utils.h"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice addition, but actually I have done this in my private repo... Let me upstream those code.

src/uarch.cpp Outdated
fprintf(stderr, "Found CPU family %d, model %d, implementer %d, part %d\n",
family, model, implementer, part);

if(avx2){
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please format code with clang-format.

src/uarch.cpp Outdated
} else if (implementer == 0x48 && part == 0xd01) {
fprintf(stderr, "Hisilicon TSV110 detected\n");
return tsv110;
} else if (implementer == 0x00 && part == 0xd02) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 0x00 implementer looks suspicious...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants