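The function `adv_fill_vhalo`, shown in the image below, contains several nested loops that perform matrix-like operations. The memory access pattern could probably be optimized, and the loops might also be parallelized or unrolled, so I think this is a good place to get started. (If the matrices are fairly small, the parallelization overhead needs careful consideration. People sometimes use fancier tricks as well, such as working closer to assembly and using registers directly, but that is overkill here. Let's try some basic optimizations first.)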
adv_fill_vhalo
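As a starting point for discussion, here is a minimal sketch of the two basic ideas above: reordering a loop nest so the inner loop walks contiguous memory, and guarding parallelism with a size threshold so small arrays stay serial. It is not the code of `adv_fill_vhalo`, which is only visible in the image. The kernel (an element-wise average), the names `avg_strided` and `avg_contiguous`, and the threshold `PAR_MIN_ELEMS` are all hypothetical. The sketch assumes row-major C arrays; in a column-major language such as Fortran the contiguous index is the first one, so the preferred loop order is reversed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cut-off: below this many elements, stay serial because
 * thread start-up cost is likely to outweigh the work. Tune by measuring. */
#define PAR_MIN_ELEMS 4096

/* Baseline: the inner loop steps through j, so consecutive iterations
 * touch addresses nx elements apart (strided access in row-major C). */
void avg_strided(int ny, int nx, const double *a, const double *b, double *out)
{
    assert(ny > 0 && nx > 0 && a && b && out);
    for (int i = 0; i < nx; ++i)
        for (int j = 0; j < ny; ++j) {
            size_t k = (size_t)j * (size_t)nx + (size_t)i;
            out[k] = 0.5 * (a[k] + b[k]);
        }
}

/* Interchanged: the inner loop steps through i, so consecutive iterations
 * touch adjacent addresses. The OpenMP pragma is ignored when the code is
 * built without OpenMP support; with it, the if clause keeps small
 * problems on a single thread. */
void avg_contiguous(int ny, int nx, const double *a, const double *b, double *out)
{
    assert(ny > 0 && nx > 0 && a && b && out);
    #pragma omp parallel for schedule(static) if ((long)ny * nx >= PAR_MIN_ELEMS)
    for (int j = 0; j < ny; ++j)
        for (int i = 0; i < nx; ++i) {
            size_t k = (size_t)j * (size_t)nx + (size_t)i;
            out[k] = 0.5 * (a[k] + b[k]);
        }
}
```

Both functions compute the same result, so the interchanged version can be checked against the baseline before timing either one. Whether the reordering or the threading actually helps depends on the real array sizes, which should be measured rather than assumed.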
BeverlyCrl