Allow updating application parameters for testing purposes #92
Conversation
Force-pushed from 58900a4 to 7d8e563
Force-pushed from 907c757 to 371b129
I think it would be a good idea to add asserts in the modified kernels checking that batch_size and BLOCKS_IN_GRID divide evenly.
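For illustration only (the comment does not pin down the exact divisibility, and the names below are placeholders, not part of the library), a host-side check before a kernel launch could look like:

#include <assert.h>

/* Placeholder values: in the real code BLOCKS_IN_GRID and the per-block
   thread count come from the application / kernel configuration. */
#define BLOCKS_IN_GRID    100
#define THREADS_PER_BLOCK 256

/* Hypothetical host-side check run before a kernel launch. */
static void check_batch_divisibility(long long problem_size, int batch_size)
{
  /* The whole problem must split into whole batches ... */
  assert(problem_size % batch_size == 0);
  /* ... and each batch must split evenly across the launched grid. */
  assert(batch_size % (BLOCKS_IN_GRID * THREADS_PER_BLOCK) == 0);
}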
struct __cudampi__arguments_type
{
  int cpu_enabled;
  int number_of_streams;
  int batch_size;
  int powercap; // 0 means disabled
  long long problem_size;
};
I don't think it's a good design choice to have both library configuration parameters (cpu_enabled, powercap) and application configuration parameters (number_of_streams, batch_size, problem_size) in one structure, as having application configuration parameters here introduces assumptions about the application structure.
I believe we could provide an option to set application parameters when initializing the library, for easier usage in our test apps, but it should be optional and ideally separated into another function (outside of cudampilib.c). In this structure we could keep powercap and cpu_enabled, optionally add number_of_cpu_streams or gpu_memcpy_buffer, and move the other parameters to a separate structure that would be read with a separate function (one that just returns the structure). BLOCKS_IN_GRID could also be moved there; a rough sketch of such a split follows the list below. After such a change I see two possible ways to use application parameters:
- Modify the kernel functions in the lib to take the grid size as an argument, so that the grid size would be passed with each kernel call
- Keep the kernel functions as they are (before this change) and just use an externally defined grid size that would be set to the value from the arguments (read with the new function) inside the user application (I'd rather go with this approach, as it requires fewer modifications in the code and leaves greater flexibility)
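For illustration only, a rough sketch of the proposed split; the names __cudampi__app_arguments_type and __cudampi__getAppArguments are made up here and are not part of the library:

/* Library-level configuration kept in the existing structure. */
struct __cudampi__arguments_type {
  int cpu_enabled;
  int powercap;               // 0 means disabled
  int number_of_cpu_streams;  // optional addition
};

/* Application-level parameters moved out into their own structure. */
struct __cudampi__app_arguments_type {
  int number_of_streams;
  int batch_size;
  long long problem_size;
  int blocks_in_grid;         // BLOCKS_IN_GRID could live here as well
};

/* Hypothetical reader, kept outside cudampilib.c, that just returns the
   application parameters (e.g. parsed from argv). */
struct __cudampi__app_arguments_type __cudampi__getAppArguments(void);

With the second option above, the user application would call the reader once and assign blocks_in_grid to the externally defined grid-size variable used by the kernels.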
/* Default values */
__cudampi__arguments.cpu_enabled = 1;
__cudampi__arguments.number_of_streams = 1;
__cudampi__arguments.batch_size = 50000;
Having default values for application-dependent parameters (batch_size, problem_size) defined inside the library is not a good idea. I think that if they are not provided, we could just exit.
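As a minimal sketch of that behaviour (the function name below is hypothetical, and fprintf stands in for the library's log_message helper):

#include <stdio.h>
#include <stdlib.h>

/* Hypothetical validation run right after argument parsing: refuse to
   invent application-dependent defaults and exit instead. */
static void validate_app_arguments(int batch_size, long long problem_size)
{
  if (batch_size <= 0 || problem_size <= 0) {
    fprintf(stderr, "batch_size and problem_size must be provided\n");
    exit(-1);
  }
}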
if (cudaSuccess != __cudampi__getCpuFreeThreads(&__cudampi__localFreeThreadCount)) {
  log_message(LOG_ERROR, "Error invoking __cudampi__getCpuFreeThreads()");
  exit(-1);
}
MPI_Bcast(&__cudampi__batch_size, 1, MPI_INT, 0, MPI_COMM_WORLD);
We can't assume a single batch_size would be used for all kernels. If we want to have it configurable in a way that is not transparent to the library, it should be passed in the kernel call.
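For illustration (the kernel name, signature, and launcher below are hypothetical, not the library's actual API), passing the batch size with each kernel call could look like:

/* Hypothetical kernel that receives its batch size explicitly instead of
   relying on a globally broadcast value. */
__global__ void appkernel(double *data, int batch_size)
{
  int idx = blockIdx.x * blockDim.x + threadIdx.x;
  if (idx < batch_size) {
    data[idx] *= 2.0;  // placeholder per-element work
  }
}

/* Host side: every launch passes the batch size it actually uses,
   so different kernels are free to use different batch sizes. */
void launchkernel(double *devPtr, int batch_size, int blocksingrid,
                  int threadsperblock, cudaStream_t stream)
{
  appkernel<<<blocksingrid, threadsperblock, 0, stream>>>(devPtr, batch_size);
}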
{
  int cpu_enabled;
  int number_of_streams;
  int batch_size;
PLEASE MAKE IT UNSIGNED LONG
Merged the changes (#76) so that it would be possible to produce experiment results.
No description provided.