
Conversation


@franzpoeschel franzpoeschel commented Oct 16, 2025

Factored out of #5405 as a standalone feature.

A different domain decomposition means a different distribution of parallel processes over the simulation domain; the number of parallel processes may vary between runs.

  • Fields: Mostly supported already; just do random IO accesses into the sub-grid of interest (see the sketch after this list). The PML fields do not support this and are skipped (for now?).
  • Particles: So far, the restart logic expected exactly one precisely matching particle patch. The new logic accepts any number of particle patches, keeps only those that overlap with the local domain, and filters individual particles from patches that overlap only partially.
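
A minimal sketch of what the per-rank field read boils down to, assuming openPMD-api and not claiming to be the PIConGPU implementation: each rank requests only the hyperslab of the global checkpoint dataset that belongs to it under the new decomposition. The function name and the localOffset/localExtent parameters are made up for illustration.

#include <openPMD/openPMD.hpp>

#include <cstddef>
#include <vector>

// Sketch only (not PIConGPU code): each rank reads just its own sub-grid of
// a global field dataset, so the on-disk domain decomposition does not need
// to match the current one. localOffset/localExtent describe this rank's
// part of the total domain under the *new* decomposition and are
// hypothetical parameters.
std::vector<float> loadLocalSubGrid(
    openPMD::Series& series,
    openPMD::Iteration& iteration,
    openPMD::Offset const& localOffset,
    openPMD::Extent const& localExtent)
{
    auto component = iteration.meshes["E"]["x"]; // example record component
    // Request exactly the hyperslab that belongs to this rank.
    auto chunk = component.loadChunk<float>(localOffset, localExtent);
    series.flush(); // perform the actual read
    std::size_t n = 1;
    for (auto const e : localExtent)
        n *= e;
    return std::vector<float>(chunk.get(), chunk.get() + n);
}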

TODO:

  • Auto-detect if PML load operations should be skipped
  • Maybe someone has a good test case for this

To reviewers, please also check the comments below


@franzpoeschel franzpoeschel left a comment

Some lines to pay special attention to for code reviewers

* with, we must take care not to index past the dataset boundaries. Just loop around to the start
* in that case. Not the finest way, but it does the job for now..
*/
start.push_back(gridPos.revert()[d] % extent[d]);

Is this really the best solution? This loads nextId and startId. If the GPU grid does not match the old one, this logic currently just uses modulus to loop around and hand out some random ID. No idea if that has downsides.
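
For illustration, a self-contained sketch of that wrap-around; the names and types are hypothetical rather than the PR's pmacc types, and the component reversal done by gridPos.revert() is omitted.

#include <cstdint>
#include <vector>

// Hypothetical sketch of the wrap-around used when the current GPU grid
// does not match the one stored in the checkpoint: indices that would fall
// past the dataset boundary wrap back via modulus, so every process still
// gets *some* stored value (e.g. a nextId/startId), just not necessarily
// "its own" from the previous run.
std::vector<std::uint64_t> wrappedStartIndex(
    std::vector<std::uint64_t> const& gridPos, // this process' grid position
    std::vector<std::uint64_t> const& extent)  // dataset extent per dimension
{
    std::vector<std::uint64_t> start;
    start.reserve(extent.size());
    for (std::size_t d = 0; d < extent.size(); ++d)
        start.push_back(gridPos[d] % extent[d]); // loop around to the start
    return start;
}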

for(size_t d = 0; d < simDim; ++d)
{
auto positionInD = positionVec[d] + positionOffsetVec[d];
if(positionInD < patchTotalOffset[d] || positionInD >= patchUpperCorner[d])

This line decides whether a particle from a partially overlapping domain should be considered. I treat the lower boundary as inclusive and the upper boundary as exclusive, otherwise single particles were loaded into both processes.
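
A standalone sketch of that half-open check, with hypothetical names and simDim fixed to 3 for the example: a particle belongs to the local patch iff its total position lies in [patchTotalOffset, patchUpperCorner) in every dimension, so a particle sitting exactly on a shared boundary is claimed by exactly one process.

#include <array>
#include <cstdint>

constexpr std::size_t simDim = 3; // example value

// Hypothetical illustration of the half-open containment test: lower bound
// inclusive, upper bound exclusive.
bool particleBelongsToLocalPatch(
    std::array<std::int64_t, simDim> const& totalPosition, // position + offset
    std::array<std::int64_t, simDim> const& patchTotalOffset,
    std::array<std::int64_t, simDim> const& patchUpperCorner)
{
    for (std::size_t d = 0; d < simDim; ++d)
    {
        if (totalPosition[d] < patchTotalOffset[d] || totalPosition[d] >= patchUpperCorner[d])
            return false;
    }
    return true;
}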

DataSpace<simDim> const patchTotalOffset
= localToTotalDomainOffset + threadParams->localWindowToDomainOffset;
DataSpace<simDim> const patchExtent = threadParams->window.localDimensions.size;
DataSpace<simDim> const patchUpperCorner = patchTotalOffset + patchExtent;

Yeah, someone please check whether all these position computations match up; they seem to in my tests.
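
As a sanity check, a hypothetical 1D example of the corner arithmetic; all numbers are invented for illustration.

#include <cassert>

int main()
{
    // Invented 1D numbers, purely to illustrate the corner arithmetic:
    // localToTotalDomainOffset = 256, localWindowToDomainOffset = 16,
    // local window extent = 128.
    int const patchTotalOffset = 256 + 16; // 272
    int const patchExtent = 128;
    int const patchUpperCorner = patchTotalOffset + patchExtent; // 400
    // With the half-open convention, a particle at total cell 272 belongs
    // to this patch (lower bound inclusive), one at cell 400 does not
    // (upper bound exclusive) and is picked up by the neighbouring patch.
    assert(272 >= patchTotalOffset && 272 < patchUpperCorner);
    assert(!(400 >= patchTotalOffset && 400 < patchUpperCorner));
    return 0;
}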

filterKeep,
filterRemove,
alpaka::getPtrNative(filter));
eventSystem::getTransactionEvent().waitForFinished();

Someone please verify the kernel and its call. I suspect that the call can be made more efficient.
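
For reference, a serial host-side sketch of what the filter step conceptually does, not the alpaka kernel itself; the names filterKeep/filterRemove are taken from the excerpt above but their values here are illustrative, and isInside stands for the half-open containment test sketched earlier.

#include <cstdint>
#include <vector>

constexpr std::uint8_t filterRemove = 0u; // illustrative values
constexpr std::uint8_t filterKeep = 1u;

// Serial illustration of the filter pass (the PR runs this as a GPU kernel):
// mark every particle of a partially overlapping patch as keep/remove
// depending on whether it falls into the local patch.
template <typename IsInside>
std::vector<std::uint8_t> buildFilterMask(std::size_t numParticles, IsInside&& isInside)
{
    std::vector<std::uint8_t> filter(numParticles, filterRemove);
    for (std::size_t i = 0; i < numParticles; ++i)
    {
        if (isInside(i))
            filter[i] = filterKeep;
    }
    return filter;
}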

// patchExtent)) <<
// '\n';
if((patchTotalOffset <= offsets[i]) == true_
&& ((offsets[i] + extents[i]) <= (patchTotalOffset + patchExtent)) == true_)

Again, someone please verify the positioning logic. Note that both ends are inclusive here, since ideally (i.e. in the normal case) we compare a particle patch from disk against itself, i.e. against the local window, which is the same.
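
A hypothetical sketch of that patch-level test (closed on both ends), to contrast with the half-open per-particle check above; names are made up and simDim is fixed to 3 for the example.

#include <array>
#include <cstdint>

constexpr std::size_t simDim = 3; // example value

// Hypothetical patch-level containment test: a particle patch from the
// checkpoint is accepted if its lower corner is >= the local window's lower
// corner and its upper corner is <= the window's upper corner, in every
// dimension (both ends inclusive). In the common case of an unchanged
// decomposition the patch equals the local window, so both comparisons hold
// with equality.
bool patchFullyContained(
    std::array<std::int64_t, simDim> const& patchOffset,   // offsets[i]
    std::array<std::int64_t, simDim> const& patchSize,     // extents[i]
    std::array<std::int64_t, simDim> const& windowOffset,  // patchTotalOffset
    std::array<std::int64_t, simDim> const& windowExtent)  // patchExtent
{
    for (std::size_t d = 0; d < simDim; ++d)
    {
        if (patchOffset[d] < windowOffset[d]
            || patchOffset[d] + patchSize[d] > windowOffset[d] + windowExtent[d])
            return false;
    }
    return true;
}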

@PrometheusPi (Member)

Regarding the open point "maybe someone has a good test case for this":
I guess the ion simulations from @pordyna or @paschk31 might qualify for this.
