Disable neon dot on unknown chipsets on aarch32 #300
This causes dot product not to be recognized on Samsung Exynos, Samsung Qualcomm, and Pixel Tensor devices, all of which are detected with an unknown vendor. Examples: Pixel 6, Samsung S22 (Exynos), Samsung S23 (Qualcomm), Pixel Watch. I think a safer fix would be to disable dot product on all known Unisoc chips, since it looks like future chips will continue to have this problem (a Linux kernel bug).
Thanks for the detailed info. We should be able to add a more fine-grained check, since this one is causing issues. CPUINFO currently doesn't detect the vendor on the problem chips, but we can likely add some logic to properly detect UNISOC phones in the current failing cases. It's a little tricky as I don't have easy access to the problem devices. The failure seems to be specific to certain hardware or system firmware revisions, as there are differences in how some of these phones report hardware even among the same model + kernel version. But I was able to narrow it down to conflicting detected unisoc vs spreadtrum manufacturers (unisoc bought spreadtrum, I think?). Maybe we can update the detection logic to allow this conflict and treat it as unisoc when there's a mismatch, instead of bailing out.
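The "treat a mismatch as unisoc" idea could look roughly like the sketch below. All names here (`enum vendor`, `reconcile_vendor`) are hypothetical and are not taken from the cpuinfo sources; the sketch only assumes that two detection paths each yield a vendor guess and that a unisoc/spreadtrum disagreement is the one conflict worth tolerating.

```c
/* Hypothetical vendor enum for illustration; not cpuinfo's real type. */
enum vendor {
  vendor_unknown = 0,
  vendor_unisoc,
  vendor_spreadtrum,
  vendor_qualcomm,
};

/* Returns nonzero if the vendor belongs to the unisoc/spreadtrum family. */
static int is_unisoc_family(enum vendor v) {
  return v == vendor_unisoc || v == vendor_spreadtrum;
}

/*
 * Combine two vendor guesses (e.g. one per detection source).
 * - Agreement: return the shared value.
 * - One side unknown: return the other side.
 * - unisoc vs spreadtrum: tolerate the conflict and report unisoc.
 * - Any other conflict: bail out with vendor_unknown.
 */
enum vendor reconcile_vendor(enum vendor a, enum vendor b) {
  if (a == b) {
    return a;
  }
  if (a == vendor_unknown) {
    return b;
  }
  if (b == vendor_unknown) {
    return a;
  }
  if (is_unisoc_family(a) && is_unisoc_family(b)) {
    return vendor_unisoc;
  }
  return vendor_unknown;
}
```

With this shape, only an unrelated conflict (for example unisoc vs qualcomm) still falls back to unknown.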
You noted 'Itel A50' Android 14 (Go edition).

Looking at the match_t function, it should detect any 3 or 4 digit version in the T series.

Modern CPUs don't fill in /proc/cpuinfo, and we fall back on getprop, e.g.
Pixel 4 cpuinfo
Pixel 4 getprop
The function

Trying a Pixel Watch 2, the cpuinfo does not provide 'Hardware'.

There is not much consistency with getprop. On a Qualcomm Samsung S23 the product exists but is not useful. soc works on Samsung S25.

So for new Arm Android devices, ro.soc.model would fill in the vendor more completely.

I'm not sure how to detect your 'unknown' T603 other than to try the existing /proc/cpuinfo method. There are many T series chips with Cortex A55 that likely have the issue.

FWIW I don't have the T603 but do have a T310.
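As a rough illustration of the two ideas in this comment (prefer /proc/cpuinfo 'Hardware', fall back on a property such as ro.soc.model, then look for a T-series name with 3 or 4 digits), here is a minimal sketch. It is not the actual match_t implementation: `pick_soc_string` and `match_t_series` are hypothetical helpers, and the sketch assumes the chipset string is exactly `T` followed by digits, which real devices may not report.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: prefer the 'Hardware' value from /proc/cpuinfo and
 * fall back on the ro.soc.model property when it is missing or empty.
 * Returns NULL when neither source has a value.
 */
const char* pick_soc_string(const char* cpuinfo_hardware, const char* ro_soc_model) {
  if (cpuinfo_hardware != NULL && cpuinfo_hardware[0] != '\0') {
    return cpuinfo_hardware;
  }
  if (ro_soc_model != NULL && ro_soc_model[0] != '\0') {
    return ro_soc_model;
  }
  return NULL;
}

/*
 * Hypothetical matcher: accepts "T" followed by exactly 3 or 4 decimal
 * digits (e.g. "T310", "T603") and stores the numeric model on success.
 */
bool match_t_series(const char* name, unsigned* model_out) {
  if (name == NULL || name[0] != 'T') {
    return false;
  }
  unsigned model = 0;
  size_t digits = 0;
  for (const char* p = name + 1; *p != '\0'; p++) {
    if (!isdigit((unsigned char) *p)) {
      return false;
    }
    model = model * 10 + (unsigned) (*p - '0');
    if (++digits > 4) {
      return false;
    }
  }
  if (digits < 3) {
    return false;
  }
  *model_out = model;
  return true;
}
```

On Android the property value itself would have to be read through the platform's property API; that part is left out here so the sketch stays portable.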
We are seeing intermittent SIGILL crashes when running neon dot kernels on certain UNISOC-based phones. The previous change (#265) addressed the majority of these crashes, but we are still seeing some crashes on a small subset of hardware running Meta apps.
I tracked this down to a failure of the chipset detection logic in CPUINFO, which causes the existing check to not recognize the SoC as one of the UNISOC chips that shouldn't run neon dot instructions. While it would be nice to resolve the chipset detection issues, this only happens on a very small subset of devices and I can't repro it on a local Itel A50 (which is one of the affected prod devices), possibly due to differing firmware or OS images.
To solve it, I've added an additional piece of logic in arm/linux/aarch32-isa.c to disable neon dot instructions on unknown chipsets. Running this internally at Meta for a few weeks has cleared up the remaining crashes on UNISOC devices (zero instances with this patch). This should solve the issue with minimal collateral damage.
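The gating described above can be pictured with the following minimal sketch. It is not the code from arm/linux/aarch32-isa.c: the types and the `allow_neon_dot` function are hypothetical stand-ins, and the sketch assumes the kernel-reported dot product capability and the detected chipset vendor are both already available.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the library's chipset types. */
enum chipset_vendor {
  chipset_vendor_unknown = 0,
  chipset_vendor_unisoc,
  chipset_vendor_qualcomm,
  chipset_vendor_samsung,
};

struct chipset {
  enum chipset_vendor vendor;
};

/*
 * Decide whether neon dot kernels may be used.
 * The kernel-reported capability is necessary but not sufficient:
 * unisoc chips are excluded, and so are chips whose vendor could not
 * be detected, since a misdetected unisoc part would hit SIGILL.
 */
bool allow_neon_dot(bool kernel_reports_dot, const struct chipset* chipset) {
  if (!kernel_reports_dot) {
    return false;
  }
  switch (chipset->vendor) {
    case chipset_vendor_unknown:
    case chipset_vendor_unisoc:
      return false;
    default:
      return true;
  }
}
```

The trade-off is visible in the `chipset_vendor_unknown` case: any device whose vendor is not detected loses dot product, whether or not it is actually a UNISOC part.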