Implement specialized templates for pow<2>, pow<3>, pow<4>. #44

Open
jeff-cohere opened this issue Sep 16, 2020 · 3 comments
Labels
enhancement New feature or request testing

Comments

@jeff-cohere
Contributor

Is your feature request related to a problem? Please describe.
To make bit-for-bit (BFB) testing easier, it would be nice to have specialized implementations of the pow function for low integer exponents. In particular, see this conversation.

Describe the solution you'd like
C++ template specializations for pow<2>, pow<3>, and pow<4>. Perhaps we should include some Fortran support for these and other functions in EKAT as well.

@jeff-cohere jeff-cohere added enhancement New feature or request testing labels Sep 16, 2020
@bartgol
Contributor

bartgol commented Sep 17, 2020

We can probably do a template utility for the generic pow<N> (log2(N) recursions). It should be fairly straightforward.
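For illustration, here is a minimal sketch of what such a generic utility could look like. The names `PowImpl` and `pow_n` are hypothetical and are not taken from EKAT or HOMME; the sketch assumes a non-negative compile-time exponent and a type `T` that supports `operator*` and construction from `1`. Halving the exponent at each level gives roughly log2(N) nested instantiations, and the multiplication order is fixed at compile time, which is what matters for bit-for-bit comparisons.

```cpp
#include <cassert>

// Hypothetical helper (not an existing EKAT API): computes x^N for a
// compile-time exponent N by halving the exponent at each level.
template <unsigned N>
struct PowImpl {
  template <typename T>
  static T apply(const T& x) {
    const T half = PowImpl<N / 2>::apply(x);  // x^(N/2), integer division
    const T sq = half * half;                 // x^(2*(N/2))
    return (N % 2 == 0) ? sq : sq * x;        // one extra factor for odd N
  }
};

// Base cases that terminate the recursion.
template <>
struct PowImpl<0> {
  template <typename T>
  static T apply(const T&) { return T(1); }
};

template <>
struct PowImpl<1> {
  template <typename T>
  static T apply(const T& x) { return x; }
};

// Convenience wrapper: the exponent is a template argument, T is deduced.
template <unsigned N, typename T>
T pow_n(const T& x) {
  return PowImpl<N>::apply(x);
}

// Usage example: pow_n<3>(x) evaluates (x*x)*x, pow_n<4>(x) evaluates (x*x)*(x*x).
inline void pow_n_usage() {
  assert(pow_n<2>(3.0) == 9.0);
  assert(pow_n<3>(2.0) == 8.0);
  assert(pow_n<4>(2.0) == 16.0);
}
```

With this layout, `pow<2>`, `pow<3>`, and `pow<4>` fall out of the generic template, so separate hand-written specializations would only be needed if a different multiplication order were required.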

@jeff-cohere
Contributor Author

Do you have any HOMME code or other prior art we can use? Or do you have a new implementation in mind?

@bartgol
Contributor

bartgol commented Sep 17, 2020

I have an implementation of a runtime version; it should be straightforward to convert it to a templated one (or even to provide both).
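As a rough sketch of what a runtime version might look like (this is not the implementation mentioned above, which is not shown in this thread), exponentiation by squaring with a runtime unsigned exponent can be written as below. The name `pow_int` is hypothetical, and `T` is assumed to support `operator*` and construction from `1`.

```cpp
#include <cassert>

// Hypothetical runtime counterpart (not the code referred to in this thread):
// computes x^n for a runtime non-negative integer n by repeated squaring.
template <typename T>
T pow_int(T x, unsigned n) {
  T result = T(1);
  while (n > 0) {
    if (n & 1u) {
      result = result * x;  // fold in the current power of x
    }
    n >>= 1;
    if (n > 0) {
      x = x * x;            // square only if more bits remain
    }
  }
  return result;
}

// Usage example with exactly representable values.
inline void pow_int_usage() {
  assert(pow_int(2.0, 10u) == 1024.0);
  assert(pow_int(3, 4u) == 81);
}
```

One caveat for bit-for-bit work: this loop walks the exponent bits from lowest to highest, so for some exponents it associates the multiplications differently from a top-down recursive version. In floating point, the two may therefore differ in the last bits, and a BFB test should compare against the same variant on both sides.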

Btw, the bfb_pow_impl function in that file is, imho, a better solution for BFB pow than bridging F90 to CUDA. One might argue that it is expensive, but it might be a wash with the CUDA kernel launch (I never checked though).
