Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CCSD T2_8 w/ just DGEMM #882

Merged
merged 9 commits into from
Oct 11, 2023
Merged

Conversation

jeffhammond
Copy link
Collaborator

@jeffhammond jeffhammond commented Oct 6, 2023

CCSD_T2_8 should never have used transpose/sort and I should have done this 15 years ago, but here we are.

Karol wrote the transpose-free loop version, but it's equivalent to DGEMM. This change uses DGEMM.

For the ICSD/NTS version, I just pulled in the whole routine from CCSD because it's much cleaner.

In single-node testing, this appears to be 20-30% faster (0.4s versus 0.6s for H2O cc-pVQZ).

@jeffhammond jeffhammond self-assigned this Oct 6, 2023
@jeffhammond jeffhammond marked this pull request as ready for review October 6, 2023 13:01
@edoapra edoapra self-assigned this Oct 6, 2023
@edoapra
Copy link
Collaborator

edoapra commented Oct 6, 2023

@jeffhammond the code is not ready for the case when USE_F90_ALLOCATABLE is not set
https://github.com/nwchemgit/nwchem/actions/runs/6431953108/job/17465829018?pr=882

icsd_t2.F:9222:46:

 9222 |      &                         0.5d0*alpha,f_a,h21d,
      |                                              1
Error: Symbol ‘f_a’ at (1) has no IMPLICIT type; did you mean ‘d_a’?
icsd_t2.F:9223:46:

@jeffhammond jeffhammond marked this pull request as draft October 6, 2023 16:54
@jeffhammond
Copy link
Collaborator Author

okay, sorry, i thought i tested without but apparently i didn't clean the environment. i'll fix it.

@edoapra
Copy link
Collaborator

edoapra commented Oct 6, 2023

okay, sorry, i thought i tested without but apparently i didn't clean the environment. i'll fix it.

No problem. My only suggestion is to run github actions in your own fork first, so that you avoid burning github actions' cycyles that could be used by other pulls and/or schedule tests.

@edoapra
Copy link
Collaborator

edoapra commented Oct 11, 2023

@jeffhammond Merge is dangerous. This is what has cause the repository corruption in the past.
You should have rebased, instead.

@edoapra
Copy link
Collaborator

edoapra commented Oct 11, 2023

Could you try the following?

git reset --mixed 6660bc8048c9fe2e084a2afc50d10d8a46995e05
git fetch upstream
git pull upstream master --rebase
git push -f origin ccsd_t2_dgemm

As an alternative, I could do the git push -f myself, if this is OK for you.

@jeffhammond
Copy link
Collaborator Author

I though GitHub update would do rebase. I'll fix it.

@jeffhammond
Copy link
Collaborator Author

Actually feel free to fix it. It's late here.

@edoapra
Copy link
Collaborator

edoapra commented Oct 11, 2023

Actually feel free to fix it. It's late here.

Done.

@jeffhammond
Copy link
Collaborator Author

Do you agree this is ready for review? It seems to pass CI on my end.

@edoapra
Copy link
Collaborator

edoapra commented Oct 11, 2023

Do you agree this is ready for review? It seems to pass CI on my end.

I have added ccsd_kernels.F to USES_BLAS since it calls dgemm.

What about threading control of the threaded BLAS now used by these kernels? Is the number of threads set anywhere?

call util_blas_set_num_threads(1)

@edoapra edoapra marked this pull request as ready for review October 11, 2023 19:12
@jeffhammond
Copy link
Collaborator Author

Do you agree this is ready for review? It seems to pass CI on my end.

I have added ccsd_kernels.F to USES_BLAS since it calls dgemm.

What about threading control of the threaded BLAS now used by these kernels? Is the number of threads set anywhere?

No, I haven't made that change yet, because nobody will ever run that code. I wrote it as a prototype at Intel, and with Intel OpenMP and MKL will figure out the threading thing properly.

I will clean it up eventually, probably to remove the code in ccsd_kernels.F altogether.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants