Zero-sized message for barrier triggers CODES bug? #43

Open
ptaffet opened this issue Apr 28, 2021 · 0 comments


ptaffet commented Apr 28, 2021

TraceR seems to implement MPI_Barrier as a zero-byte allreduce (https://github.com/hpcgroup/TraceR/blob/develop/tracer/reader/otf2_reader.C#L583), which seems like a reasonable implementation. However, at least the fat-tree model of CODES doesn't handle zero-byte messages very well.

For example, consider this snippet from https://github.com/codes-org/codes/blob/master/src/networks/model-net/fattree.c#L1811

  if((cur_entry->msg.packet_size % s->params->chunk_size) && (cur_entry->msg.chunk_id == num_chunks - 1)) {
    ts += s->params->head_delay * (cur_entry->msg.packet_size % s->params->chunk_size);
  } else {
    bf->c12 = 1;
    ts += s->params->head_delay * s->params->chunk_size;
  }

If packet_size == 0, the modulo expression evaluates to zero (i.e. false), so the else branch runs and a zero-byte message is charged the same delay as a message of chunk_size bytes. This is not so bad, but it means that sending, e.g., a 10-byte message is substantially faster than sending a 0-byte message (whenever chunk_size is larger than 10), which is counterintuitive and probably not intended.
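To make the asymmetry concrete, here is a minimal standalone sketch of the same branch. The function name `chunk_delay` and its flat parameter list are hypothetical stand-ins for the `cur_entry->msg` and `s->params` fields in fattree.c, and the chunk size used in the note below is an arbitrary example value, not a CODES default.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the fattree.c branch quoted above.
 * Returns the delay added for one chunk of a packet. */
double chunk_delay(uint64_t packet_size, uint64_t chunk_size,
                   uint64_t chunk_id, uint64_t num_chunks,
                   double head_delay)
{
    assert(chunk_size > 0);

    /* Bytes left over after the packet is split into full chunks. */
    uint64_t tail = packet_size % chunk_size;

    if (tail && chunk_id == num_chunks - 1) {
        /* Last chunk of a packet that is not a multiple of chunk_size:
         * charge only for the leftover bytes. */
        return head_delay * (double)tail;
    }

    /* Every other case, including packet_size == 0 (tail is 0),
     * is charged as a full chunk. */
    return head_delay * (double)chunk_size;
}
```

With an assumed chunk_size of 32 and head_delay of 1.0, `chunk_delay(0, 32, 0, 1, 1.0)` returns 32.0 while `chunk_delay(10, 32, 0, 1, 1.0)` returns 10.0, so the empty message is modeled as slower than the 10-byte one.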

I think the easiest way to fix this is to change the line in otf2_reader to implement MPI_Barrier as a small nonzero message, maybe 128 bytes. I don't have a good sense for what size is realistic.
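A rough sketch of that idea, written as a standalone helper rather than as a patch to otf2_reader.C: `barrier_payload_bytes` and `BARRIER_MIN_BYTES` are hypothetical names, and 128 is just the value floated above, not a measured one.

```c
#include <stddef.h>

/* Assumed floor for the barrier payload; 128 is the guess from this
 * issue, not a value taken from TraceR or CODES. */
#define BARRIER_MIN_BYTES ((size_t)128)

/* Hypothetical helper: never hand the network model a zero-byte
 * (or otherwise tiny) barrier message. */
size_t barrier_payload_bytes(size_t requested)
{
    return requested < BARRIER_MIN_BYTES ? BARRIER_MIN_BYTES : requested;
}
```

The barrier path would then pass `barrier_payload_bytes(0)` as the message size instead of a literal 0, so the zero-size case never reaches the fat-tree model.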
