Description
While upgrading, I noticed a small inconsistency in the CUDA generator regarding MAX_CONST_SIZE. It seems to be used as an array length in elements when a constant’s dimension isn’t known at code-gen time, but also compared against bytes in the runtime guard (dim * sizeof(T) > MAX_CONST_SIZE). This mix leads to false errors for perfectly valid constants.
What I’m seeing in a generated CUDA file (simplified):
//global constants
#ifndef MAX_CONST_SIZE
#define MAX_CONST_SIZE 128
#endif
__constant__ int G_INT_cuda[191];
__constant__ double G_REAL_cuda[191];
void op_decl_const_G_INT(int dim, char const *type,
                         int *dat){
  if (!OP_hybrid_gpu) return;
  if (dim*sizeof(int)>MAX_CONST_SIZE) {
    printf("error: MAX_CONST_SIZE not big enough\n"); exit(1);
  }
  cutilSafeCall(cudaMemcpyToSymbol(G_INT_cuda, dat, dim*sizeof(int)));
}
void op_decl_const_G_REAL(int dim, char const *type,
                          double *dat){
  if (!OP_hybrid_gpu) return;
  if (dim*sizeof(double)>MAX_CONST_SIZE) {
    printf("error: MAX_CONST_SIZE not big enough\n"); exit(1);
  }
  cutilSafeCall(cudaMemcpyToSymbol(G_REAL_cuda, dat, dim*sizeof(double)));
}
For context, older OP2 versions used MAX_CONST_SIZE primarily to size __constant__ arrays when the dimension was unknown, and the guard compared size (bytes per element) to the macro, which also did not make much sense to me. So MAX_CONST_SIZE historically behaved like an element cap, not a byte cap.
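To make the mismatch concrete, here is a minimal host-side C sketch of the byte-based comparison from the generated guard. The function name guard_bytes_trips is hypothetical and not part of OP2; element sizes are passed in explicitly, and the 4- and 8-byte sizes used in the note below are assumptions about int and double on the target platform.

```c
#include <assert.h>
#include <stddef.h>

/* Default taken from the generated file quoted above. */
#ifndef MAX_CONST_SIZE
#define MAX_CONST_SIZE 128
#endif

/* Hypothetical helper mirroring the generated check:
   a byte count (dim * elem_size) is compared against MAX_CONST_SIZE.
   Returns nonzero when the generated code would print the error and exit. */
int guard_bytes_trips(size_t dim, size_t elem_size) {
    assert(elem_size > 0);
    return dim * elem_size > (size_t)MAX_CONST_SIZE;
}
```

With the default of 128, guard_bytes_trips(191, 4) is nonzero (764 bytes), so op_decl_const_G_INT would exit even though G_INT_cuda is declared with 191 elements. Likewise, 32 doubles (256 bytes) trip the guard although 32 is well below 128 when the macro is read as an element count.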
Why I think this is a bug
- MAX_CONST_SIZE is serving two meanings: element count and byte limit.
- When the dimension is known at code-gen time (e.g., …[191]), a capacity guard against MAX_CONST_SIZE isn’t needed (and currently causes false failures in our case).
- When the dimension is unknown, a guard makes sense, but it should compare elements to elements (dim > MAX_CONST_SIZE).
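For the unknown-dimension case, an element-to-element guard could look roughly like the host-side C sketch below. This is only an illustration under the assumption that MAX_CONST_SIZE is meant as an element count: decl_const_real_sketch and G_REAL_host are hypothetical names, memcpy stands in for cudaMemcpyToSymbol, and the function returns an error code instead of calling exit so it can be exercised without a GPU.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifndef MAX_CONST_SIZE
#define MAX_CONST_SIZE 128 /* interpreted here as an element count (assumption) */
#endif

/* Host-side stand-in for a __constant__ array whose extent was unknown at
   code-gen time, so it is sized with MAX_CONST_SIZE elements. */
static double G_REAL_host[MAX_CONST_SIZE];

/* Hypothetical variant of op_decl_const_G_REAL.
   Returns 0 on success, -1 if dim does not fit in the array. */
int decl_const_real_sketch(int dim, const double *dat) {
    assert(dat != NULL);
    /* Guard: elements compared against elements. */
    if (dim < 0 || (size_t)dim > (size_t)MAX_CONST_SIZE) {
        return -1;
    }
    /* Bytes appear only in the copy size, not in the guard. */
    memcpy(G_REAL_host, dat, (size_t)dim * sizeof(double));
    return 0;
}
```

Under this reading, 32 doubles are accepted with the default of 128, and only a dim above 128 is rejected. When the extent is known at code-gen time, the array would be declared with that extent and this guard would not be emitted at all.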
Questions / request for guidance
- What is the intended semantics of MAX_CONST_SIZE in current OP2 CUDA codegen — elements (array extent when dimension is unknown) or bytes (a safety cap)?
- For the unknown-dimension case, should the guard compare elements to elements (dim > MAX_CONST_SIZE), or is the bytes comparison deliberate?