shmem_malloc Interface to Leverage Hierarchical & Heterogenous Memory Characteristics #258
Comments
How is this different from #195? It looks like we are trying to address the same issue.
Naveen, I knew you would ask this. :) The main difference is that there is only one symmetric heap here (no change to the symmetric address model) and the complexity of memory management is handled by the library. It is a small change (adding only one interface) and we can get most of the benefits. The approach in #195 is more explicit, while here it is more implicit. That said, I feel there is value in having both solutions, and they can co-exist.
So this proposal is for something like this?

void* shmem_malloc_hint(size_t size, int hint);

Where
@jamesaross Correct.
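For illustration, a minimal usage sketch assuming the proposed shmem_malloc_hint interface and the SHMEM_HINT_ATOMICS constant mentioned later in this thread (neither is an existing API):

#include <shmem.h>

int main(void) {
    shmem_init();
    /* Hypothetical: ask the library to place this counter in memory
     * optimized for atomics (e.g. NIC-attached memory), if it can. */
    long *counter = shmem_malloc_hint(sizeof(long), SHMEM_HINT_ATOMICS);
    *counter = 0;
    shmem_barrier_all();
    /* Each PE increments the counter on its right neighbor. */
    shmem_atomic_inc(counter, (shmem_my_pe() + 1) % shmem_n_pes());
    shmem_barrier_all();
    shmem_free(counter);
    shmem_finalize();
    return 0;
}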
I see now that you already added pull request #259, but I'll continue the discussion here. Are these types, as identified in #259, sufficient for all use cases? Why put this significant burden on the OpenSHMEM implementer when it's the application developer that has a specific allocator and/or physical memory location in mind? What do you think about alternative interfaces like these?

void* shmem_malloc_ptr(size_t size, void* (*ptr_malloc)(size_t));
void shmem_free_ptr(void* ptr, void (*ptr_free)(void*));

Application developers should just say what they want.
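A short sketch of how an application might use such a callback-based interface; the my_hbw_malloc/my_hbw_free helpers are illustrative stand-ins for a real device- or NUMA-specific allocator, and the shmem_malloc_ptr/shmem_free_ptr calls are the prototypes proposed in this comment, not an existing API:

#include <stdlib.h>
#include <shmem.h>

/* Illustrative application-provided allocator (a stand-in for a
 * device- or NUMA-aware allocator such as memkind or a CUDA allocator). */
static void *my_hbw_malloc(size_t size) { return aligned_alloc(64, size); }
static void  my_hbw_free(void *ptr)     { free(ptr); }

void example(void) {
    /* The application states exactly which allocator backs the allocation. */
    double *buf = shmem_malloc_ptr(1024 * sizeof(double), my_hbw_malloc);
    /* ... put/get/collectives on buf ... */
    shmem_free_ptr(buf, my_hbw_free);
}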
@jamesaross If I understand correctly, you are expecting the users to create the heap and pass the address to the OpenSHMEM implementation. If so, this looks more like MPI windows. In @manjugv's proposal, these are hints for the library. It is not mandatory for all implementations to support all memory types.
@naveen-rn How does the current proposal get around the OpenSHMEM implementation creating a device heap and maintaining device addresses for every conceivable device? Also, a lookup for the default case is trivial if the implementation is clever about it. The address returned from
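For context, the kind of default-case lookup referred to here could be as simple as a range test against the default symmetric heap; the following is only an illustrative sketch (heap_base, heap_size, and the helper name are assumptions, not implementation details from this thread):

/* Illustrative only: heap_base/heap_size would be recorded by the
 * implementation when the default symmetric heap is created. */
static void  *heap_base;
static size_t heap_size;

static int in_default_sheap(const void *addr) {
    const char *p  = (const char *)addr;
    const char *lo = (const char *)heap_base;
    return p >= lo && p < lo + heap_size;
}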
@jamesaross My understanding of this proposal is that implementations will pin/register a single big chunk of memory as the SHEAP at some point (maybe at initialization). I haven't thought about the possible usages of all the hints mentioned in this proposal. But hints like SHMEM_HINT_PSYNC, SHMEM_HINT_PWORK, and SHMEM_HINT_ATOMICS can be effectively used at the
HPC application portability is rarely defined by the small burden of replacing an allocator or swapping it with a macro. If it's expected that most implementations won't bother supporting most hint types, and the implementations that do will support only a very specific subset of devices, wouldn't it be simpler to have vendor-specific special-allocator extensions? Below is an example of portable code with a vendor-specific special allocator.

#include <math.h>
#include "shmem.h"
#include "shmemx.h"
#if SHMEMX_SPECIAL_ALLOCATOR_AVAILABLE
#define shmem_malloc_special(size) shmemx_malloc_special((size), SHMEM_HINT_IS_PSYNC)
#else
#define shmem_malloc_special(size) shmem_malloc((size))
#endif
// ...
// this is now portable:
size_t sz = (size_t)log2(shmem_n_pes()) + 2;            // pSync length in elements
long* pSync = shmem_malloc_special(sz * sizeof(long));  // size in bytes
Not sure if I understand your point entirely. With the interface in this proposal, there is a two-way communication and agreement: (1) the user tells the library that a particular allocation will be used in a specific way; (2) the library uses that information and optimizes for that usage. If the user keeps the promise and the library can optimize, there will be performance benefits for the application. I disagree that it is a huge burden to implement. The network libraries can already support some of these hints, and there is currently no way to expose these benefits to the user. Also, most of these hints are easy to implement with wrappers, without any need for fancy allocators. For example, one could use memkind to support many of these hints. I'm open to trimming some of these hints if we find something terribly difficult to implement that does not provide large benefits. Again, remember that supporting hints is optional.
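To make the wrapper idea concrete, a minimal sketch assuming the implementation links against memkind; the HINT_HIGH_BANDWIDTH constant and the fallback policy are placeholders for illustration, not part of the proposal:

#include <stdlib.h>
#include <memkind.h>

/* Placeholder hint value purely for illustration; real hint names and
 * values would come from the proposal / shmemx.h. */
#define HINT_HIGH_BANDWIDTH 0x1

/* Hypothetical internal helper: map a hint onto a memkind allocator,
 * falling back to plain malloc() for hints this build cannot honor. */
static void *alloc_with_hint(size_t size, int hint) {
    if (hint & HINT_HIGH_BANDWIDTH) {
        void *p = memkind_malloc(MEMKIND_HBW, size);
        if (p != NULL)
            return p;          /* MCDRAM / HBM-backed allocation */
    }
    return malloc(size);       /* default: regular DDR heap */
}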
Yes @jdinan. Closing it now.
Problem:
A typical node in current HPC systems is composed of a variety of memories, organized into multiple hierarchies and/or with different affinities to the PEs and threads. The OpenSHMEM programming model and its memory allocation routines are oblivious to these variations. As a consequence, it is challenging for an OpenSHMEM program to leverage these memory characteristics and capabilities to achieve higher performance in a portable way.
Proposal:
Introduce a memory allocation interface that can pass hints to the OpenSHMEM implementation.
The hints are then utilized by the implementation to optimize the allocation for the stated usage.
For example, if the user specifies that a particular allocation is used as a pSync array, the implementation can place it in a NUMA memory bank close to the network interface. Where memory is available on the network interface itself, it can allocate that memory for the pSync array. This can improve the latency characteristics of synchronization.
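As a concrete illustration of the pSync case, a minimal sketch assuming the proposed shmem_malloc_hint prototype and the SHMEM_HINT_PSYNC constant discussed in this issue (neither is an existing API):

#include <shmem.h>

int main(void) {
    shmem_init();
    /* Hypothetical: hint that this symmetric allocation is a pSync array,
     * so the library may place it near (or on) the NIC. */
    long *pSync = shmem_malloc_hint(SHMEM_BCAST_SYNC_SIZE * sizeof(long),
                                    SHMEM_HINT_PSYNC);
    for (int i = 0; i < SHMEM_BCAST_SYNC_SIZE; i++)
        pSync[i] = SHMEM_SYNC_VALUE;

    static long src = 42;
    static long dst = 0;
    shmem_barrier_all();
    shmem_broadcast64(&dst, &src, 1, 0, 0, 0, shmem_n_pes(), pSync);

    shmem_free(pSync);
    shmem_finalize();
    return 0;
}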
Impact on Users:
These interfaces give the user an opportunity to convey usage information to the implementation, which the implementation can then use to optimize for that behavior. If the implementation does optimize, the program should achieve higher performance and/or scalability. OpenSHMEM programs that do not use these interfaces, or that use SHMEM_HINT_NONE, are not impacted.
Impact on Implementations:
This gives implementations an opportunity to optimize an allocation for its stated usage.
If an implementation does not support a given optimization, it is allowed to fall back to the default shmem_malloc behavior.
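For illustration, a minimal sketch of such a fallback inside an implementation; the dispatch structure and the sheap_alloc_nic_near helper are assumptions, not part of this proposal:

/* Hypothetical dispatch inside an implementation: honor the hints it
 * knows how to optimize, and treat everything else as shmem_malloc(). */
void *shmem_malloc_hint(size_t size, int hint) {
    switch (hint) {
    case SHMEM_HINT_PSYNC:
    case SHMEM_HINT_ATOMICS:
        /* e.g. carve the allocation out of a NIC-near region reserved
         * when the symmetric heap was created. */
        return sheap_alloc_nic_near(size);   /* hypothetical helper */
    case SHMEM_HINT_NONE:
    default:
        return shmem_malloc(size);           /* unsupported hint: default path */
    }
}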