add max/min kernel #84
kilinchange
commented
Jan 25, 2024
- Frontend operator test cases
- Frontend operator
- Computation graph operator
- Kernel test cases
- CPU kernel
- CUDA kernel
- End-to-end model test
e2c363d to 8832a33
8832a33 to 0d6bdbe
0d6bdbe to 3ecfd67
UNREACHABLE();
}

if (broadcaster.needBroadcast()) {
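There's no need for this check here, is there? On the CPU, whether or not we go through the coordinate computation has a negligible effect on performance.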
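A minimal C++ sketch of what this comment seems to suggest: a single CPU code path that always decomposes the output index into coordinates, so the non-broadcast case is just the case where no stride is zero. `broadcastStrides` and `broadcastMax` are hypothetical names for illustration, not the PR's `broadcaster` API, and the sketch assumes both inputs have already been padded to the rank of the output.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the PR's Broadcaster): row-major strides of `shape`
// viewed against `outShape` of the same rank. A dimension of size 1 gets
// stride 0, so the same element is reused along that dimension.
std::vector<std::size_t> broadcastStrides(std::vector<std::size_t> const &shape,
                                          std::vector<std::size_t> const &outShape) {
    assert(shape.size() == outShape.size());
    std::vector<std::size_t> strides(shape.size(), 0);
    std::size_t running = 1;
    for (std::size_t i = shape.size(); i-- > 0;) {
        assert(shape[i] == outShape[i] || shape[i] == 1);
        strides[i] = shape[i] == 1 ? 0 : running;
        running *= shape[i];
    }
    return strides;
}

// Element-wise max with one code path: every output index goes through the
// coordinate computation, whether or not any dimension is actually broadcast.
std::vector<float> broadcastMax(std::vector<float> const &a,
                                std::vector<std::size_t> const &aShape,
                                std::vector<float> const &b,
                                std::vector<std::size_t> const &bShape,
                                std::vector<std::size_t> const &outShape) {
    auto sa = broadcastStrides(aShape, outShape);
    auto sb = broadcastStrides(bShape, outShape);
    std::size_t total = 1;
    for (auto d : outShape) total *= d;

    std::vector<float> out(total);
    for (std::size_t i = 0; i < total; ++i) {
        std::size_t rem = i, ia = 0, ib = 0;
        // Peel coordinates off from the innermost dimension outwards.
        for (std::size_t d = outShape.size(); d-- > 0;) {
            std::size_t coord = rem % outShape[d];
            rem /= outShape[d];
            ia += coord * sa[d];
            ib += coord * sb[d];
        }
        out[i] = a[ia] > b[ib] ? a[ia] : b[ib];
    }
    return out;
}
```

The extra cost per element is one division and one remainder per dimension, which is the overhead the comment argues is negligible on the CPU; whether that holds for a given model would need measuring.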
The broadcast and non-broadcast cases can't be merged into a single code path
fixed
{dt} const *const addr[{inputsNum}];
}};

__device__ __forceinline__ static {dt} fn({dt} a, {dt} b) {{
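Comparing with a greater-than or less-than sign and the ternary operator needs no type cast. Just inline it and fill in a single greater-than or less-than sign.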
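As a rough illustration of this suggestion, the sketch below uses a plain C++ macro as a stand-in for the format-style kernel template: the comparison operator is the only thing substituted, and the ternary expression is written inline with no separate `fn` helper and no cast. `DEFINE_SELECT`, `elementwiseMax`, and `elementwiseMin` are hypothetical names; the PR's actual template and its placeholders may look different.

```cpp
#include <cstddef>

// Hypothetical stand-in for the generated kernel body. OP is the only token
// that differs between max and min, so the generator only has to fill in
// either > or <.
#define DEFINE_SELECT(NAME, OP)                                   \
    template <class T>                                            \
    void NAME(T *out, T const *a, T const *b, std::size_t n) {    \
        for (std::size_t i = 0; i < n; ++i)                       \
            out[i] = a[i] OP b[i] ? a[i] : b[i]; /* no cast */    \
    }

DEFINE_SELECT(elementwiseMax, >)
DEFINE_SELECT(elementwiseMin, <)
```

Both operands already have the same type, so the comparison and the ternary work directly on that type.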
fixed
fca3a9c to 86a8f34
It hasn't changed. Did the push not go through?
86a8f34 to e237349
It's changed now.