Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SPC #16

Open
wants to merge 12 commits into
base: main
Choose a base branch
from
Open

SPC #16

wants to merge 12 commits into from

Conversation

davideberius
Copy link
Collaborator

This is my new SPC branch. This should not have any conflicts with master. I have added a lot of documentation and cleaned up the code. There are also a few bug fixes and quality of life changes. I renamed all of the sw_events stuff to spc for consistency.

@davideberius
Copy link
Collaborator Author

It looks like it added other people's code to this when I did my rebasing... Damnit...

@shawnccx
Copy link

shawnccx commented Dec 15, 2017

No idea why I still receive emails from here after graduation.

But, one suggestion for commit msg format: using 50/72 rule.

@@ -119,6 +120,16 @@ static inline int mca_pml_ob1_send_inline (const void *buf, size_t count,
rc = mca_bml_base_sendi (bml_btl, &convertor, &match, OMPI_PML_OB1_MATCH_HDR_LEN,
size, MCA_BTL_NO_ORDER, MCA_BTL_DES_FLAGS_PRIORITY | MCA_BTL_DES_FLAGS_BTL_OWNERSHIP,
MCA_PML_OB1_HDR_TYPE_MATCH, NULL);

if(OPAL_LIKELY(rc == OPAL_SUCCESS)){
if(tag >= 0){
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be SPC_USER_OR_MPI. I'll fix this.

@@ -195,6 +196,7 @@ int mca_pml_ob1_add_comm(ompi_communicator_t* comm)
mca_pml_ob1_recv_frag_t *frag, *next_frag;
mca_pml_ob1_comm_proc_t* pml_proc;
mca_pml_ob1_match_hdr_t* hdr;
opal_timer_t usecs = 0;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be surrounded in ifdefs.

@thananon
Copy link
Collaborator

it looks like you missed one of the rebase by one commit. f69eeee is your problem I think.

@@ -46,3 +46,4 @@ lib@OPAL_LIB_PREFIX@open_pal_la_SOURCES += \
runtime/opal_cr.c \
runtime/opal_info_support.c \
runtime/opal_progress_threads.c

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove this.

@davideberius
Copy link
Collaborator Author

davideberius commented Dec 15, 2017 via email

Copy link
Collaborator

@thananon thananon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cosmetics
Also do the following.

  • Any file you touch, change the copyright of UT to 2017.
  • Add yourself to Open MPI author file.
  • Break the long line into two.
  • Fix the rebase.

@@ -350,6 +356,7 @@ mca_pml_ob1_copy_frag_completion( mca_btl_base_module_t* btl,
* we just abort. In theory, a new queue could be created to hold this
* fragment and then attempt to send it out on another BTL. */
rc = mca_bml_base_send(bml_btl, des, MCA_PML_OB1_HDR_TYPE_FRAG);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't just add a blank space. Keep it the way it was.

@@ -449,6 +456,7 @@ int mca_pml_ob1_send_request_start_buffered(

/* send */
rc = mca_bml_base_send(bml_btl, des, MCA_PML_OB1_HDR_TYPE_RNDV);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't just add a blank space. Keep it the way it was.

@@ -495,8 +503,10 @@ int mca_pml_ob1_send_request_start_copy( mca_pml_ob1_send_request_t* sendreq,
MCA_BTL_DES_FLAGS_PRIORITY | MCA_BTL_DES_FLAGS_BTL_OWNERSHIP,
MCA_PML_OB1_HDR_TYPE_MATCH,
&des);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't just add a blank space. Keep it the way it was.

@@ -731,6 +749,7 @@ int mca_pml_ob1_send_request_start_rdma( mca_pml_ob1_send_request_t* sendreq,

/* send */
rc = mca_bml_base_send(bml_btl, des, MCA_PML_OB1_HDR_TYPE_RGET);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't just add a blank space. Keep it the way it was.

@@ -811,6 +830,7 @@ int mca_pml_ob1_send_request_start_rndv( mca_pml_ob1_send_request_t* sendreq,

/* send */
rc = mca_bml_base_send(bml_btl, des, MCA_PML_OB1_HDR_TYPE_RNDV);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't just add a blank space. Keep it the way it was.

@@ -1055,6 +1075,7 @@ mca_pml_ob1_send_request_schedule_once(mca_pml_ob1_send_request_t* sendreq)

/* initiate send - note that this may complete before the call returns */
rc = mca_bml_base_send(bml_btl, des, MCA_PML_OB1_HDR_TYPE_FRAG);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't just add a blank space. Keep it the way it was.

* The array index indicates the SPC index, while the value indicates
* the MPI_T index.
*/
if(ret != OPAL_ERROR){
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The coding convention for if is
if(condition) {
apply everywhere.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've gone through and changed this for all of my ifs. I assume this is also the case for loops? So I should have:
for(stuff) {
instead of:
for(stuff){

@thananon
Copy link
Collaborator

@davideberius Then the thing between ce822bc and f69eeee should not be there. If this is the cherry-picked branch, I dont see how you cherry-picked other people's commit into here.

@shawnccx
Copy link

shawnccx commented Dec 15, 2017

@davideberius

  • git remote add ICLDisco https://github.com/ICLDisco/ompi.git.
  • git fetch ICLDisco.
  • git rebase -i ICLDisco/master, remove all other people's commits, leave only yours. Then you cannot miss any commit.

@davideberius
Copy link
Collaborator Author

davideberius commented Dec 15, 2017 via email

@thananon
Copy link
Collaborator

@davideberius you shouldnt do that. The steps should be

  • Sync with master (via rebase or just create new branch from master)
  • Cherry pick all your commits.
  • create the PR.

@davideberius
Copy link
Collaborator Author

davideberius commented Dec 15, 2017 via email

@thananon
Copy link
Collaborator

No sir, when you wanna add a new commit, you just do on this branch without sync with master again. Try to keep the sync point the same for the PR.

@thananon
Copy link
Collaborator

In a nutshell, your mistake was
git pull --rebase origin spc if you didnt do this, it would have been ok.

@davideberius
Copy link
Collaborator Author

davideberius commented Dec 15, 2017 via email

@davideberius
Copy link
Collaborator Author

davideberius commented Dec 15, 2017 via email

@thananon
Copy link
Collaborator

You can sync to master after cherry-pick, that's fine. Of course you will get the 'diverge from online branch' message because your commit's hash on local branch changed after the rebasing.

But to keep it simple, just dont rebase again for now. Let David vs GIT continues.

@davideberius
Copy link
Collaborator Author

davideberius commented Dec 15, 2017 via email

@thananon
Copy link
Collaborator

Few commit behind is ok for a PR. If there is conflict then GitHub will tell you and we will deal with that later. Most of the time it's the commit for another module with no conflict to our PR.

@thananon
Copy link
Collaborator

My plan is to do this:
Reset to before I did any rebasing, then:
git pull --rebase ICL master
Verify that all of my commits are on the top of the git log, then:
git push -f

This should work, yes.

@therault
Copy link

therault commented Dec 15, 2017 via email

@@ -94,8 +94,10 @@ int ompi_sync_wait_mt(ompi_wait_sync_t *sync)
/* In case I am the progress manager, pass the duties on */
if( sync == wait_sync_list ) {
wait_sync_list = (sync == sync->next) ? NULL : sync->next;
if( NULL != wait_sync_list )
if( NULL != wait_sync_list ){
/* This is a possible placement for an MPI_T progress switch counter */
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it is worthwhile to leave this comment in here? If not, I can remove these changes.

@@ -94,8 +94,10 @@ int ompi_sync_wait_mt(ompi_wait_sync_t *sync)
/* In case I am the progress manager, pass the duties on */
if( sync == wait_sync_list ) {
wait_sync_list = (sync == sync->next) ? NULL : sync->next;
if( NULL != wait_sync_list )
if( NULL != wait_sync_list ) {
/* This is a possible placement for an MPI_T progress switch pvar */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is to yourself. Get rid of it for the PR.

@@ -319,7 +319,7 @@ OPAL_DECLSPEC int mca_base_pvar_register (const char *project, const char *frame
* associated with a component.
*
* While quite similar to mca_base_pvar_register(), there is one key
* difference: pvars registered this this function will automatically
* difference: pvars registered with this function will automatically
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice catch but I would create a different PR just for this typo as this has nothing to do with SPC. Ask @bosilca .

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically @thananon is right. Practically, you will not be able to push a 1 line commit without approval from the devels. So leave it here ...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take it I shouldn't update the copyright information on this file...?

@thananon
Copy link
Collaborator

Dont we have AUTHORS file anymore @bosilca ?
@davideberius You missed some of the copyright header (change UT to 2017).

I dont know if our MTT is up yet. We should run this PR.

@bosilca
Copy link

bosilca commented Dec 16, 2017

No more AUTHORS. It is autogenerated from the git log.

Copy link

@bosilca bosilca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is looking good. When ready let's squash it in a single commit, and add some documentation about how to use. If possible have a test or two to check them out.

continue;

opal_list_remove_item(&pml_proc->frags_cant_match, (opal_list_item_t*)frag);
goto add_fragment_to_unexpected;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you fail to account the matching time when the goto is taken ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good catch. I moved the start timer to before the add_fragment_to_unexpected: and the stop timer to after the end of the else bracket.

* so we need to convert from MPI_T index to SPC index and then set the 'value' argument
* to the correct value for this pvar.
*/
static int ompi_spc_get_count(const struct mca_base_pvar_t *pvar, void *value, void *obj_handle)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The read operation is supposed to be fast and constant in time independent on the number of events to look for. Can we have a translation table instead of looping around all counters to find the corresponding SPC index ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the MPI_T indices are basically the same as the SPC indices with an offset. I could potentially save this offset, but I'm not sure this is safe. Everything in MPI_Init should be serial, right? If so, this method would be safe since we can assume that the indices will be contiguous.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could also be problematic if any of the SPCs fail to be registered properly with MPI_T. I'm not sure how I would implement a proper translation table through.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If something fails to get registered properly you can just disable the entire component. If you can validate the offset idea that would be really great. In the worst case we might want to reach out to Nathan to see if he considers this as a safe solution.

@@ -319,7 +319,7 @@ OPAL_DECLSPEC int mca_base_pvar_register (const char *project, const char *frame
* associated with a component.
*
* While quite similar to mca_base_pvar_register(), there is one key
* difference: pvars registered this this function will automatically
* difference: pvars registered with this function will automatically
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically @thananon is right. Practically, you will not be able to push a 1 line commit without approval from the devels. So leave it here ...

@bosilca
Copy link

bosilca commented Dec 16, 2017

Good job, this looks good and consistent. Can we read these counters via MPI_T ? Can you add a small example for users on how to extract these counters via MPI_T, and one via PAPI SDC.

@@ -0,0 +1,281 @@
#include "ompi_spc.h"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I supposed to add copyright information to the top of the files that I added?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. On files where the UTK copyright was not present you should add it. On new files you should only add UTK copyright.

MPI_Barrier(MPI_COMM_WORLD);
skip_dump:
#endif

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this whole thing should be a separate function.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like SPC_Finalize() or something.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, however I'm not sure where to put it. Should its own .c file in the MPI directory or in the finalize.c file or..? I can't put it in my driver code.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this is just the shell for MPI_Finalize(). The real thing is in ompi_mpi_finalize() call below which is located in ompi/runtime/ompi_mpi_finalize.c. You can see they finalize other modules there. Just follow them.

Copy link

@bosilca bosilca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good.

@@ -23,6 +23,9 @@
#include "ompi/mpi/c/bindings.h"
#include "ompi/runtime/params.h"
#include "ompi/errhandler/errhandler.h"
#include "ompi/runtime/ompi_spc.h"

#include "ompi/runtime/params.h"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You dont need this, the header is already included 2 lines above.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, ok I've fixed that.

David Eberius added 11 commits March 1, 2018 14:11
… (Note: the progress switch counter is not yet functional). Made the bytes received counters more accurate. The software events code has been moved into the opal/runtime directory. Put all software events functions in macros that become noops when SOFTWARE_EVENTS_ENABLE is not defined. This still needs to be added as an MCA parameter. Made the software counters self-reliant without the PAPI component. Some of the changes have been added directly to MPI_Init and MPI_Finalize. These are temporary until a more robust method is implemented. Added some functions to the software events driver code to allow for registration with the new PAPI sde component.
…API sde driver code. Temporarily removed MPI_T helper functions.
… sent/received should be put into. Added some more protection into the code. Started work on getting rid of the loops in pml_ob1_sendreq.c.
… Moved the bytes sent/received counters to reflect when ompi updates its own internal counters. Cleaned up some unnecessary print statements in the papi sde code.
…nfigure option to enable SPCs in the build. Added an MCA parameter, mpi_spc_enable, for turning on specific counters. Temporarily disabled PAPI sde component code. Changed OPAL_THREAD_ADD64 to OPAL_THREAD_ADD_FETCH_SIZE_T. Re-added out of sequence related counters.
…where multithreaded runs would fail because the spc initialization happened before opal initialization. Made a better way to keep track of timer-based counters. Made the output in MPI_Finalize make more sense in cases where some counters are turned off. Added SPC_FINI() to MPI_Finalize and made some coding style/formatting changes.
…and off dumping SPC counters in MPI_Finalize, and some minor bug fixes.
…with MPI_ANY_TAG to count as BYTES_RECEIVED_MPI.
Copy link

@bosilca bosilca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a short text that explains how to add a new counter would be great.

@@ -57,4 +57,5 @@ EXTRA_DIST += \
examples/oshmem_strided_puts.c \
examples/oshmem_symmetric_data.c \
examples/Hello.java \
examples/Ring.java
examples/Ring.java \
examples/spc_example.c
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

@@ -0,0 +1,121 @@
/*
* Copyright (c) 2018 The University of Tennessee and The University
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

respect the indentation. move everything left by 5 spaces (for -xxxx the end year).

@@ -313,10 +314,17 @@ mca_pml_ob1_recv_frag_t*
check_cantmatch_for_match(mca_pml_ob1_comm_proc_t *proc)
{
mca_pml_ob1_recv_frag_t *frag = proc->frags_cant_match;
#if SPC_ENABLE == 1
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changed with @thananon commit, and I don't think there is a valid reason to check how long it takes to handle OOS match time. I more interesting counter would be to know for how long a fragment stayed in the OOS, but I have no idea how to compute this without adding SPC to the OOB recv frag.

@@ -250,6 +254,8 @@ int mca_pml_ob1_add_comm(ompi_communicator_t* comm)
continue;
}

SPC_TIMER_START(OMPI_OOS_MATCH_TIME, &timer);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to another of my comments, this counter might not be interesting anymore. In addition it will have a significant impact on performance, as the add and remove of an OOS frag is not extremely fast.

@@ -0,0 +1,501 @@
/*
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add 5 spaces before the text (to account for the end period of the copyright).

int i, j, rank, world_size, offset;
long long *recv_buffer, *send_buffer;

MPI_Comm_rank(MPI_COMM_WORLD, &rank);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are inside the MPI library you are not supposed to use any MPI API. Instead fallback to the corresponding ompi function (ompi_comm_size, ompi_comm_rank). Slightly more complicated for send/recv and for collective. As an example your call to gather will become

    err = comm->c_coll->coll_gather(send_buffer, OMPI_NUM_COUNTERS, MPI_LONG_LONG,
                                    recv_buffer, OMPI_NUM_COUNTERS, MPI_LONG_LONG,
                                    0, ompi_mpi_comm_world,
                                    ompi_mpi_comm_world.c_coll->coll_gather_module);

if(recv_buffer[offset+i] == 0)
continue;
if(timer_event[i])
SPC_CYCLES_TO_USECS(&recv_buffer[offset+i]);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes more sense to have thins conversion done on each node, just in case they are not identical.

… from the SPC code. Removed the OOS_MATCH_TIME counter.
@davideberius
Copy link
Collaborator Author

Ok, I think I have fixed all of the concerns from this review.

@bosilca
Copy link

bosilca commented Mar 7, 2018

Looking good. go for it.

@thananon
Copy link
Collaborator

thananon commented Mar 7, 2018

👍

dong0321 pushed a commit that referenced this pull request Sep 26, 2018
Sync to OMPI master and provide updates
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants