add max/min kernel #84

Merged
merged 3 commits into from
Jan 31, 2024

kilinchange
Contributor

@kilinchange kilinchange commented Jan 25, 2024

  • Frontend operator test cases
  • Frontend operator
  • Computation graph operator
  • kernel test cases
  • kernel cpu
  • kernel cuda
  • Model end-to-end test

@kilinchange kilinchange added the kernel Add or modify kernel implementation label Jan 25, 2024
@kilinchange kilinchange self-assigned this Jan 25, 2024
@kilinchange kilinchange linked an issue Jan 25, 2024 that may be closed by this pull request
7 tasks
@kilinchange kilinchange force-pushed the add_max_min_kernel branch 3 times, most recently from e2c363d to 8832a33 Compare January 30, 2024 03:19
@kilinchange kilinchange changed the title [WIP] add max/min kernel add max/min kernel Jan 30, 2024
@kilinchange kilinchange requested a review from YdrMaster January 30, 2024 03:19
@kilinchange
Contributor Author

image
The opt model end-to-end test passes.

UNREACHABLE();
}

if (broadcaster.needBroadcast()) {
Collaborator

This check seems unnecessary. On the CPU, whether or not the coordinate computation runs has a negligible effect on performance.

Contributor Author

The broadcast and non-broadcast cases cannot be merged into a single code path.

Contributor Author

fixed
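
For readers following the exchange above, here is a minimal sketch of how an element-wise max on the CPU can run broadcast and non-broadcast inputs through the same coordinate computation. It is not the project's `broadcaster` implementation: `broadcastMax`, its parameters, and the stride-0 convention are assumptions made for illustration, and both shapes are assumed to be already padded to the same rank.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the PR): element-wise max of two row-major
// tensors with NumPy-style broadcasting. Shapes must have the same rank;
// a dimension of size 1 is broadcast against the other operand.
std::vector<float> broadcastMax(std::vector<float> const &a,
                                std::vector<std::size_t> const &shapeA,
                                std::vector<float> const &b,
                                std::vector<std::size_t> const &shapeB) {
    assert(shapeA.size() == shapeB.size());
    std::size_t const rank = shapeA.size();
    std::vector<std::size_t> outShape(rank), strideA(rank), strideB(rank);
    std::size_t sizeA = 1, sizeB = 1, total = 1;
    for (std::size_t i = rank; i-- > 0;) {
        outShape[i] = std::max(shapeA[i], shapeB[i]);
        // A broadcast dimension gets stride 0, so every output coordinate
        // along it reads the same input element.
        strideA[i] = shapeA[i] == 1 ? 0 : sizeA;
        strideB[i] = shapeB[i] == 1 ? 0 : sizeB;
        sizeA *= shapeA[i];
        sizeB *= shapeB[i];
        total *= outShape[i];
    }
    std::vector<float> out(total);
    for (std::size_t idx = 0; idx < total; ++idx) {
        // Decompose the flat output index into coordinates and accumulate
        // the matching input offsets. Without broadcasting this reduces to
        // offA == offB == idx, so no separate fast path is required.
        std::size_t rem = idx, offA = 0, offB = 0;
        for (std::size_t i = rank; i-- > 0;) {
            std::size_t const coord = rem % outShape[i];
            rem /= outShape[i];
            offA += coord * strideA[i];
            offB += coord * strideB[i];
        }
        out[idx] = a[offA] > b[offB] ? a[offA] : b[offB];
    }
    return out;
}
```

With a 2x3 input and a 1x3 input, the second operand is reused for both rows. Whether dropping the `needBroadcast()` branch is acceptable in the real kernel depends on details not shown in this thread.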

{dt} const *const addr[{inputsNum}];
}};

__device__ __forceinline__ static {dt} fn({dt} a, {dt} b) {{
Collaborator

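Comparing values with the greater-than or less-than operator and the ternary operator needs no type cast. Just inline it and fill in a single greater-than or less-than sign.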

Contributor Author

fixed
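
As an illustration of the suggestion above, the following host-side C++ sketch selects between two values with one comparison and the ternary operator, with no cast. In the PR the operator appears to be substituted into generated CUDA source; here a comparator type parameter plays that role, which is an assumption, as are the names `selectElementwise`, `elementwiseMax`, and `elementwiseMin`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical sketch: Cmp decides which operand wins.
// std::greater<T> yields max, std::less<T> yields min.
template <class T, class Cmp>
std::vector<T> selectElementwise(std::vector<T> const &a, std::vector<T> const &b) {
    assert(a.size() == b.size());
    Cmp const cmp{};
    std::vector<T> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        // One comparison plus the ternary operator; no type conversion.
        out[i] = cmp(a[i], b[i]) ? a[i] : b[i];
    }
    return out;
}

template <class T>
std::vector<T> elementwiseMax(std::vector<T> const &a, std::vector<T> const &b) {
    return selectElementwise<T, std::greater<T>>(a, b);
}

template <class T>
std::vector<T> elementwiseMin(std::vector<T> const &a, std::vector<T> const &b) {
    return selectElementwise<T, std::less<T>>(a, b);
}
```

In device code the equivalent expression would simply be `a > b ? a : b` (or `<` for min) written directly in the kernel body.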

@kilinchange kilinchange force-pushed the add_max_min_kernel branch 2 times, most recently from fca3a9c to 86a8f34 Compare January 30, 2024 07:51
@kilinchange kilinchange requested a review from YdrMaster January 30, 2024 07:52
@YdrMaster
Collaborator

This hasn't been changed. Was it not pushed?

@kilinchange
Contributor Author

> This hasn't been changed. Was it not pushed?

It's changed now.

@YdrMaster YdrMaster merged commit 35dc6c8 into dev Jan 31, 2024
2 checks passed
@YdrMaster YdrMaster deleted the add_max_min_kernel branch January 31, 2024 02:01
Labels
kernel Add or modify kernel implementation
Development

Successfully merging this pull request may close these issues.

Max&&Min
2 participants