-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SPC #16
base: main
Are you sure you want to change the base?
SPC #16
Conversation
It looks like it added other people's code to this when I did my rebasing... Damnit... |
No idea why I still receive emails from here after graduation. But, one suggestion for commit msg format: using 50/72 rule. |
ompi/mca/pml/ob1/pml_ob1_isend.c
Outdated
@@ -119,6 +120,16 @@ static inline int mca_pml_ob1_send_inline (const void *buf, size_t count, | |||
rc = mca_bml_base_sendi (bml_btl, &convertor, &match, OMPI_PML_OB1_MATCH_HDR_LEN, | |||
size, MCA_BTL_NO_ORDER, MCA_BTL_DES_FLAGS_PRIORITY | MCA_BTL_DES_FLAGS_BTL_OWNERSHIP, | |||
MCA_PML_OB1_HDR_TYPE_MATCH, NULL); | |||
|
|||
if(OPAL_LIKELY(rc == OPAL_SUCCESS)){ | |||
if(tag >= 0){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be SPC_USER_OR_MPI. I'll fix this.
ompi/mca/pml/ob1/pml_ob1.c
Outdated
@@ -195,6 +196,7 @@ int mca_pml_ob1_add_comm(ompi_communicator_t* comm) | |||
mca_pml_ob1_recv_frag_t *frag, *next_frag; | |||
mca_pml_ob1_comm_proc_t* pml_proc; | |||
mca_pml_ob1_match_hdr_t* hdr; | |||
opal_timer_t usecs = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be surrounded in ifdefs.
it looks like you missed one of the rebase by one commit. f69eeee is your problem I think. |
opal/runtime/Makefile.am
Outdated
@@ -46,3 +46,4 @@ lib@OPAL_LIB_PREFIX@open_pal_la_SOURCES += \ | |||
runtime/opal_cr.c \ | |||
runtime/opal_info_support.c \ | |||
runtime/opal_progress_threads.c | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove this.
What do you mean I missed a rebase by one commit, how does one miss a
rebase? The commit you mentioned is my most recent commit.
…On Thu, Dec 14, 2017 at 7:16 PM Arm ***@***.***> wrote:
it looks like you missed one of the rebase by one commit. f69eeee
<f69eeee>
is your problem I think.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#16 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AOpCkbvS7NuakfndS2ZQAVArNpUqwxDAks5tAbpCgaJpZM4RC1hN>
.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cosmetics
Also do the following.
- Any file you touch, change the copyright of UT to 2017.
- Add yourself to Open MPI author file.
- Break the long line into two.
- Fix the rebase.
ompi/mca/pml/ob1/pml_ob1_sendreq.c
Outdated
@@ -350,6 +356,7 @@ mca_pml_ob1_copy_frag_completion( mca_btl_base_module_t* btl, | |||
* we just abort. In theory, a new queue could be created to hold this | |||
* fragment and then attempt to send it out on another BTL. */ | |||
rc = mca_bml_base_send(bml_btl, des, MCA_PML_OB1_HDR_TYPE_FRAG); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't just add a blank space. Keep it the way it was.
ompi/mca/pml/ob1/pml_ob1_sendreq.c
Outdated
@@ -449,6 +456,7 @@ int mca_pml_ob1_send_request_start_buffered( | |||
|
|||
/* send */ | |||
rc = mca_bml_base_send(bml_btl, des, MCA_PML_OB1_HDR_TYPE_RNDV); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't just add a blank space. Keep it the way it was.
ompi/mca/pml/ob1/pml_ob1_sendreq.c
Outdated
@@ -495,8 +503,10 @@ int mca_pml_ob1_send_request_start_copy( mca_pml_ob1_send_request_t* sendreq, | |||
MCA_BTL_DES_FLAGS_PRIORITY | MCA_BTL_DES_FLAGS_BTL_OWNERSHIP, | |||
MCA_PML_OB1_HDR_TYPE_MATCH, | |||
&des); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't just add a blank space. Keep it the way it was.
ompi/mca/pml/ob1/pml_ob1_sendreq.c
Outdated
@@ -731,6 +749,7 @@ int mca_pml_ob1_send_request_start_rdma( mca_pml_ob1_send_request_t* sendreq, | |||
|
|||
/* send */ | |||
rc = mca_bml_base_send(bml_btl, des, MCA_PML_OB1_HDR_TYPE_RGET); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't just add a blank space. Keep it the way it was.
ompi/mca/pml/ob1/pml_ob1_sendreq.c
Outdated
@@ -811,6 +830,7 @@ int mca_pml_ob1_send_request_start_rndv( mca_pml_ob1_send_request_t* sendreq, | |||
|
|||
/* send */ | |||
rc = mca_bml_base_send(bml_btl, des, MCA_PML_OB1_HDR_TYPE_RNDV); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't just add a blank space. Keep it the way it was.
ompi/mca/pml/ob1/pml_ob1_sendreq.c
Outdated
@@ -1055,6 +1075,7 @@ mca_pml_ob1_send_request_schedule_once(mca_pml_ob1_send_request_t* sendreq) | |||
|
|||
/* initiate send - note that this may complete before the call returns */ | |||
rc = mca_bml_base_send(bml_btl, des, MCA_PML_OB1_HDR_TYPE_FRAG); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't just add a blank space. Keep it the way it was.
ompi/runtime/ompi_spc.c
Outdated
* The array index indicates the SPC index, while the value indicates | ||
* the MPI_T index. | ||
*/ | ||
if(ret != OPAL_ERROR){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The coding convention for if is
if(condition) {
apply everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I've gone through and changed this for all of my ifs. I assume this is also the case for loops? So I should have:
for(stuff) {
instead of:
for(stuff){
@davideberius Then the thing between ce822bc and f69eeee should not be there. If this is the cherry-picked branch, I dont see how you cherry-picked other people's commit into here. |
|
The cherry-picking happened before my most recent commit. After my most
recent commit, I did the following:
git pull --rebase ICL master
Then, it said that I had diverged from my online spc branch, so I did this:
git pull --rebase origin spc
…On Thu, Dec 14, 2017 at 7:28 PM Chongxiao Cao ***@***.***> wrote:
@davideberius <https://github.com/davideberius>
- git remote add ICLDisco https://github.com/ICLDisco/ompi.git.
- git rebase -i ICLDisco/master, remove all other people's commits,
leave only yours. Then you cannot miss any commit.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#16 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AOpCkf11YVBqEsztSvfRERGxJvr1THnvks5tAb09gaJpZM4RC1hN>
.
|
@davideberius you shouldnt do that. The steps should be
|
So, I'm supposed to create a new branch and cherry-pick everything each and
every time I add a commit to an active pull request?! This seems a little
extreme...
…On Thu, Dec 14, 2017 at 7:40 PM Arm ***@***.***> wrote:
@davideberius <https://github.com/davideberius> you shouldnt do that. The
steps should be
- Sync with master (via rebase or just create new branch from master)
- Cherry pick all your commits.
- create the PR.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#16 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AOpCkVnl7bVUAbBHyJdjQCQlAdf4pzRqks5tAcADgaJpZM4RC1hN>
.
|
No sir, when you wanna add a new commit, you just do on this branch without sync with master again. Try to keep the sync point the same for the PR. |
In a nutshell, your mistake was |
So, if I'm supposed to keep the sync point the same, my mistake was
updating with master at all after the cherry-pick? So, I should have just
pushed up my commits without updating with master again. That way I
wouldn't have diverged with my online branch...? Ok, I'll fix it.
…On Thu, Dec 14, 2017 at 7:49 PM Arm ***@***.***> wrote:
In a nutshell, your mistake was
git pull --rebase origin spc if you didnt do this, it would have been ok.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#16 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AOpCkYhvxr8Ivurl-hjKkOwMxbTPMXQHks5tAcI3gaJpZM4RC1hN>
.
|
My plan is to do this:
Reset to before I did any rebasing, then:
git pull --rebase ICL master
Verify that all of my commits are on the top of the git log, then:
git push -f
This should fix the issue, right?
On Thu, Dec 14, 2017 at 7:55 PM David Eberius <davideberius@gmail.com>
wrote:
… So, if I'm supposed to keep the sync point the same, my mistake was
updating with master at all after the cherry-pick? So, I should have just
pushed up my commits without updating with master again. That way I
wouldn't have diverged with my online branch...? Ok, I'll fix it.
On Thu, Dec 14, 2017 at 7:49 PM Arm ***@***.***> wrote:
> In a nutshell, your mistake was
> git pull --rebase origin spc if you didnt do this, it would have been ok.
>
> —
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub
> <#16 (comment)>, or mute
> the thread
> <https://github.com/notifications/unsubscribe-auth/AOpCkYhvxr8Ivurl-hjKkOwMxbTPMXQHks5tAcI3gaJpZM4RC1hN>
> .
>
|
You can sync to master after cherry-pick, that's fine. Of course you will get the 'diverge from online branch' message because your commit's hash on local branch changed after the rebasing. But to keep it simple, just dont rebase again for now. Let David vs GIT continues. |
If I don't rebase, I'll be a few commits behind master. Is this ok?
…On Thu, Dec 14, 2017 at 8:04 PM Arm ***@***.***> wrote:
You can sync to master after cherry-pick, that's fine. Of course you will
get the 'diverge from online branch' message because your commit's hash on
local branch changed after the rebasing.
But to keep it simple, just dont rebase again for now. Let David vs GIT
continues.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#16 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AOpCkWKqGMu4hQpaVK5LDnwnvGb-wSnhks5tAcWrgaJpZM4RC1hN>
.
|
Few commit behind is ok for a PR. If there is conflict then GitHub will tell you and we will deal with that later. Most of the time it's the commit for another module with no conflict to our PR. |
This should work, yes. |
Wait,
you can also do a
git reinforce-branch <https://git-man-page-generator.lokaltog.net/#7d838e463e3e6b7ffa833bd333e47511>
To reinforce the non-exported local branches opposite of a few parsed indices, and in case this does not work, do a
git panic-object <https://git-man-page-generator.lokaltog.net/#7dfea5684f7cd5c5a72faa02bbf15865>
Hope this helps,
Thomas
… On Dec 14, 2017, at 20:16, Arm ***@***.***> wrote:
My plan is to do this:
Reset to before I did any rebasing, then:
git pull --rebase ICL master
Verify that all of my commits are on the top of the git log, then:
git push -f
This should work, yes.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub <#16 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AEpMtr8XrSHPzyI-4AVjW35j7Galu73Pks5tAchcgaJpZM4RC1hN>.
|
opal/threads/wait_sync.c
Outdated
@@ -94,8 +94,10 @@ int ompi_sync_wait_mt(ompi_wait_sync_t *sync) | |||
/* In case I am the progress manager, pass the duties on */ | |||
if( sync == wait_sync_list ) { | |||
wait_sync_list = (sync == sync->next) ? NULL : sync->next; | |||
if( NULL != wait_sync_list ) | |||
if( NULL != wait_sync_list ){ | |||
/* This is a possible placement for an MPI_T progress switch counter */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it is worthwhile to leave this comment in here? If not, I can remove these changes.
opal/threads/wait_sync.c
Outdated
@@ -94,8 +94,10 @@ int ompi_sync_wait_mt(ompi_wait_sync_t *sync) | |||
/* In case I am the progress manager, pass the duties on */ | |||
if( sync == wait_sync_list ) { | |||
wait_sync_list = (sync == sync->next) ? NULL : sync->next; | |||
if( NULL != wait_sync_list ) | |||
if( NULL != wait_sync_list ) { | |||
/* This is a possible placement for an MPI_T progress switch pvar */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is to yourself. Get rid of it for the PR.
@@ -319,7 +319,7 @@ OPAL_DECLSPEC int mca_base_pvar_register (const char *project, const char *frame | |||
* associated with a component. | |||
* | |||
* While quite similar to mca_base_pvar_register(), there is one key | |||
* difference: pvars registered this this function will automatically | |||
* difference: pvars registered with this function will automatically |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice catch but I would create a different PR just for this typo as this has nothing to do with SPC. Ask @bosilca .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically @thananon is right. Practically, you will not be able to push a 1 line commit without approval from the devels. So leave it here ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I take it I shouldn't update the copyright information on this file...?
Dont we have AUTHORS file anymore @bosilca ? I dont know if our MTT is up yet. We should run this PR. |
No more AUTHORS. It is autogenerated from the git log. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is looking good. When ready let's squash it in a single commit, and add some documentation about how to use. If possible have a test or two to check them out.
ompi/mca/pml/ob1/pml_ob1.c
Outdated
continue; | ||
|
||
opal_list_remove_item(&pml_proc->frags_cant_match, (opal_list_item_t*)frag); | ||
goto add_fragment_to_unexpected; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you fail to account the matching time when the goto is taken ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good catch. I moved the start timer to before the add_fragment_to_unexpected: and the stop timer to after the end of the else bracket.
* so we need to convert from MPI_T index to SPC index and then set the 'value' argument | ||
* to the correct value for this pvar. | ||
*/ | ||
static int ompi_spc_get_count(const struct mca_base_pvar_t *pvar, void *value, void *obj_handle) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The read operation is supposed to be fast and constant in time independent on the number of events to look for. Can we have a translation table instead of looping around all counters to find the corresponding SPC index ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, the MPI_T indices are basically the same as the SPC indices with an offset. I could potentially save this offset, but I'm not sure this is safe. Everything in MPI_Init should be serial, right? If so, this method would be safe since we can assume that the indices will be contiguous.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could also be problematic if any of the SPCs fail to be registered properly with MPI_T. I'm not sure how I would implement a proper translation table through.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If something fails to get registered properly you can just disable the entire component. If you can validate the offset idea that would be really great. In the worst case we might want to reach out to Nathan to see if he considers this as a safe solution.
@@ -319,7 +319,7 @@ OPAL_DECLSPEC int mca_base_pvar_register (const char *project, const char *frame | |||
* associated with a component. | |||
* | |||
* While quite similar to mca_base_pvar_register(), there is one key | |||
* difference: pvars registered this this function will automatically | |||
* difference: pvars registered with this function will automatically |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically @thananon is right. Practically, you will not be able to push a 1 line commit without approval from the devels. So leave it here ...
Good job, this looks good and consistent. Can we read these counters via MPI_T ? Can you add a small example for users on how to extract these counters via MPI_T, and one via PAPI SDC. |
@@ -0,0 +1,281 @@ | |||
#include "ompi_spc.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I supposed to add copyright information to the top of the files that I added?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. On files where the UTK copyright was not present you should add it. On new files you should only add UTK copyright.
ompi/mpi/c/finalize.c
Outdated
MPI_Barrier(MPI_COMM_WORLD); | ||
skip_dump: | ||
#endif | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this whole thing should be a separate function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
like SPC_Finalize()
or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, however I'm not sure where to put it. Should its own .c file in the MPI directory or in the finalize.c file or..? I can't put it in my driver code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so this is just the shell for MPI_Finalize(). The real thing is in ompi_mpi_finalize()
call below which is located in ompi/runtime/ompi_mpi_finalize.c
. You can see they finalize other modules there. Just follow them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good.
ompi/mpi/c/finalize.c
Outdated
@@ -23,6 +23,9 @@ | |||
#include "ompi/mpi/c/bindings.h" | |||
#include "ompi/runtime/params.h" | |||
#include "ompi/errhandler/errhandler.h" | |||
#include "ompi/runtime/ompi_spc.h" | |||
|
|||
#include "ompi/runtime/params.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You dont need this, the header is already included 2 lines above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, ok I've fixed that.
… (Note: the progress switch counter is not yet functional). Made the bytes received counters more accurate. The software events code has been moved into the opal/runtime directory. Put all software events functions in macros that become noops when SOFTWARE_EVENTS_ENABLE is not defined. This still needs to be added as an MCA parameter. Made the software counters self-reliant without the PAPI component. Some of the changes have been added directly to MPI_Init and MPI_Finalize. These are temporary until a more robust method is implemented. Added some functions to the software events driver code to allow for registration with the new PAPI sde component.
…API sde driver code. Temporarily removed MPI_T helper functions.
… sent/received should be put into. Added some more protection into the code. Started work on getting rid of the loops in pml_ob1_sendreq.c.
… Moved the bytes sent/received counters to reflect when ompi updates its own internal counters. Cleaned up some unnecessary print statements in the papi sde code.
…nfigure option to enable SPCs in the build. Added an MCA parameter, mpi_spc_enable, for turning on specific counters. Temporarily disabled PAPI sde component code. Changed OPAL_THREAD_ADD64 to OPAL_THREAD_ADD_FETCH_SIZE_T. Re-added out of sequence related counters.
…where multithreaded runs would fail because the spc initialization happened before opal initialization. Made a better way to keep track of timer-based counters. Made the output in MPI_Finalize make more sense in cases where some counters are turned off. Added SPC_FINI() to MPI_Finalize and made some coding style/formatting changes.
…and off dumping SPC counters in MPI_Finalize, and some minor bug fixes.
…with MPI_ANY_TAG to count as BYTES_RECEIVED_MPI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having a short text that explains how to add a new counter would be great.
examples/Makefile.include
Outdated
@@ -57,4 +57,5 @@ EXTRA_DIST += \ | |||
examples/oshmem_strided_puts.c \ | |||
examples/oshmem_symmetric_data.c \ | |||
examples/Hello.java \ | |||
examples/Ring.java | |||
examples/Ring.java \ | |||
examples/spc_example.c |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
examples/spc_example.c
Outdated
@@ -0,0 +1,121 @@ | |||
/* | |||
* Copyright (c) 2018 The University of Tennessee and The University |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
respect the indentation. move everything left by 5 spaces (for -xxxx the end year).
ompi/mca/pml/ob1/pml_ob1_recvfrag.c
Outdated
@@ -313,10 +314,17 @@ mca_pml_ob1_recv_frag_t* | |||
check_cantmatch_for_match(mca_pml_ob1_comm_proc_t *proc) | |||
{ | |||
mca_pml_ob1_recv_frag_t *frag = proc->frags_cant_match; | |||
#if SPC_ENABLE == 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changed with @thananon commit, and I don't think there is a valid reason to check how long it takes to handle OOS match time. I more interesting counter would be to know for how long a fragment stayed in the OOS, but I have no idea how to compute this without adding SPC to the OOB recv frag.
ompi/mca/pml/ob1/pml_ob1.c
Outdated
@@ -250,6 +254,8 @@ int mca_pml_ob1_add_comm(ompi_communicator_t* comm) | |||
continue; | |||
} | |||
|
|||
SPC_TIMER_START(OMPI_OOS_MATCH_TIME, &timer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to another of my comments, this counter might not be interesting anymore. In addition it will have a significant impact on performance, as the add and remove of an OOS frag is not extremely fast.
@@ -0,0 +1,501 @@ | |||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add 5 spaces before the text (to account for the end period of the copyright).
ompi/runtime/ompi_spc.c
Outdated
int i, j, rank, world_size, offset; | ||
long long *recv_buffer, *send_buffer; | ||
|
||
MPI_Comm_rank(MPI_COMM_WORLD, &rank); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are inside the MPI library you are not supposed to use any MPI API. Instead fallback to the corresponding ompi function (ompi_comm_size, ompi_comm_rank). Slightly more complicated for send/recv and for collective. As an example your call to gather will become
err = comm->c_coll->coll_gather(send_buffer, OMPI_NUM_COUNTERS, MPI_LONG_LONG,
recv_buffer, OMPI_NUM_COUNTERS, MPI_LONG_LONG,
0, ompi_mpi_comm_world,
ompi_mpi_comm_world.c_coll->coll_gather_module);
ompi/runtime/ompi_spc.c
Outdated
if(recv_buffer[offset+i] == 0) | ||
continue; | ||
if(timer_event[i]) | ||
SPC_CYCLES_TO_USECS(&recv_buffer[offset+i]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes more sense to have thins conversion done on each node, just in case they are not identical.
… from the SPC code. Removed the OOS_MATCH_TIME counter.
Ok, I think I have fixed all of the concerns from this review. |
Looking good. go for it. |
👍 |
Sync to OMPI master and provide updates
This is my new SPC branch. This should not have any conflicts with master. I have added a lot of documentation and cleaned up the code. There are also a few bug fixes and quality of life changes. I renamed all of the sw_events stuff to spc for consistency.