CUDA codegen: clarify role of MAX_CONST_SIZE (elements vs. bytes) and guard only when dim unknown #253

@mattbuergler

While upgrading, I noticed an inconsistency in how the CUDA generator uses MAX_CONST_SIZE. It appears to serve as an array length in elements when a constant’s dimension isn’t known at code-gen time, but the runtime guard compares it against a byte count (dim * sizeof(T) > MAX_CONST_SIZE). Mixing the two units produces false errors for perfectly valid constants.

What I’m seeing in a generated CUDA file (simplified):

//global constants
#ifndef MAX_CONST_SIZE
#define MAX_CONST_SIZE 128
#endif

__constant__ int G_INT_cuda[191];
__constant__ double G_REAL_cuda[191];

void op_decl_const_G_INT(int dim, char const *type,
                       int *dat){
  if (!OP_hybrid_gpu) return;
  if (dim*sizeof(int)>MAX_CONST_SIZE) {
    printf("error: MAX_CONST_SIZE not big enough\n"); exit(1);
  }
  cutilSafeCall(cudaMemcpyToSymbol(G_INT_cuda, dat, dim*sizeof(int)));
}

void op_decl_const_G_REAL(int dim, char const *type,
                       double *dat){
  if (!OP_hybrid_gpu) return;
  if (dim*sizeof(double)>MAX_CONST_SIZE) {
    printf("error: MAX_CONST_SIZE not big enough\n"); exit(1);
  }
  cutilSafeCall(cudaMemcpyToSymbol(G_REAL_cuda, dat, dim*sizeof(double)));
}
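
To make the mismatch concrete, here is a small host-only sketch that evaluates the same condition as the generated guard. It is not OP2 code: byte_guard_trips, max_passing_dim and report_byte_guard are hypothetical names used only for illustration, and MAX_CONST_SIZE is fixed at the default of 128 from the file above. Assuming a 4-byte int and an 8-byte double, 191 elements come to 764 and 1528 bytes, so both guards trip even though both arrays are declared with 191 elements. Under the byte comparison, at most 32 ints or 16 doubles would pass.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_CONST_SIZE 128 /* same default as in the generated file */

/* Same condition as the generated guard: a byte count compared to the macro. */
static int byte_guard_trips(size_t dim, size_t elem_size) {
  return dim * elem_size > (size_t)MAX_CONST_SIZE;
}

/* Largest element count that still passes the byte guard. */
static size_t max_passing_dim(size_t elem_size) {
  assert(elem_size > 0);
  return (size_t)MAX_CONST_SIZE / elem_size;
}

/* Hypothetical reporting helper, for illustration only. */
static void report_byte_guard(const char *type, size_t dim, size_t elem_size) {
  printf("%s[%zu]: %zu bytes, guard trips: %s, largest passing dim: %zu\n",
         type, dim, dim * elem_size,
         byte_guard_trips(dim, elem_size) ? "yes" : "no",
         max_passing_dim(elem_size));
}
```

Calling report_byte_guard("int", 191, sizeof(int)) and report_byte_guard("double", 191, sizeof(double)) from a main function prints the byte counts and guard results for the two constants above.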

For context, older OP2 versions used MAX_CONST_SIZE primarily to size __constant__ arrays when the dimension was unknown. The guard there compared size (bytes per element) to the macro, which, as far as I understand, did not make much sense either. So MAX_CONST_SIZE historically behaved like an element cap, not a byte cap.

Why I think this is a bug

  • MAX_CONST_SIZE is serving two meanings: element count and byte limit.
  • When the dimension is known at code-gen time (e.g., …[191]), a capacity guard against MAX_CONST_SIZE isn’t needed (and currently causes false failures in our case).
  • When the dimension is unknown, a guard makes sense, but it should compare elements to elements (dim > MAX_CONST_SIZE).
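
The element-to-element comparison suggested in the list above could look roughly like the following host-only sketch. It assumes MAX_CONST_SIZE is interpreted as an element count and that the unknown-dimension array would be declared with MAX_CONST_SIZE elements. const_fits, check_const_unknown_dim and const_copy_bytes are hypothetical helpers, not part of OP2 or its generated code, and the CUDA copy itself is left out so the sketch compiles without a GPU toolchain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#ifndef MAX_CONST_SIZE
#define MAX_CONST_SIZE 128 /* assumption: interpreted as an element count */
#endif

/* Hypothetical helper: does a constant of `dim` elements fit in an array
   declared with `extent` elements? Both sides are element counts. */
static int const_fits(int dim, size_t extent) {
  return dim >= 0 && (size_t)dim <= extent;
}

/* Hypothetical guard for the unknown-dimension case, where the array is
   assumed to be declared as T name[MAX_CONST_SIZE].
   Returns 1 if the constant fits, 0 otherwise. */
static int check_const_unknown_dim(const char *name, int dim) {
  if (!const_fits(dim, (size_t)MAX_CONST_SIZE)) {
    fprintf(stderr, "error: %s has %d elements, MAX_CONST_SIZE is %d elements\n",
            name, dim, MAX_CONST_SIZE);
    return 0;
  }
  return 1;
}

/* Byte count for the copy; bytes appear only here, after the element check. */
static size_t const_copy_bytes(int dim, size_t elem_size) {
  assert(dim >= 0);
  return (size_t)dim * elem_size;
}
```

When the dimension is known at code-gen time, the same const_fits helper could compare dim against the declared extent (191 in the example) instead of MAX_CONST_SIZE, or the check could be dropped entirely. Which of these is appropriate depends on the intended semantics asked about below.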

Questions / request for guidance

  • What is the intended semantics of MAX_CONST_SIZE in current OP2 CUDA codegen — elements (array extent when dimension is unknown) or bytes (a safety cap)?
  • For the unknown-dimension case, should the guard compare elements to elements (dim > MAX_CONST_SIZE), or is the bytes comparison deliberate?
