Skip to content

Conversation

@Maxwell-Rosen
Copy link
Collaborator

This is a continuation of #864, where we added skip_cell to the new collisionless updaters. This PR aims to address #878 most of all, where Mana commented that it would be better designed if we determined the mask in one place, then applied it to all the updaters.

This means creating a new updater, called zero/skip_cell.c, which determines the mask.

Array.c is modified to support boolean arrays, reducing memory usage for this project.

Modifications are made to pass this new skip_cell struct to each relevant place.

The input file is modified, so now we specify inside each species.

.skip_cell = {
  .threshold = 1e-16
}

Each species has its own skip-cell updater, which is updated at every RK stage during the Euler step.

Future work may be interested in extending this mask to the moment calculators; however, I'm not sure how to do so within the current infrastructure. For instance, there are many calls to the moment updaters, and I don't know how each one should be changed.

Regression test results showing that the skip_cell works fine.

pgkyl gk_wham_1x2v_p1-elc_cflrate_10.gkyl sel --z2 1.0 pl
image

pgkyl gk_wham_1x2v_p1-elc_10.gkyl interp sel --z2 1.0 pl --logz --zmin 1e-18
image

pgkyl gk_wham_1x2v_p1-ion_cflrate_10.gkyl sel --z2 1.0 pl
image

pgkyl gk_wham_1x2v_p1-ion_10.gkyl interp sel --z2 1.0 pl --logz --zmin 1e-18
image

- Introduced skip_cell_threshold to gkyl_gk_collisionless_flux_new and related functions to improve handling of small values in flux calculations.
- Updated relevant structures and CUDA implementations to accommodate the new parameter.
…he skip_cell_threshold to the positivity updater. Change the minimum value of the threshold to -DBL_MAX so that it really never gets tripped
…t pass skip_cell_threshold correctly into the positivity_shift GPU kernel.
…ted to actually skip the shift, not just set the shifted_node value
…involved touching a lot of files and was a lot of refactoring. Now, there is a gkyl_skip_cell updater, which is in gyrokinetic/zero. This is where the threshold is set and determined. All the apps acquire this object and free (valgrind clean) and read the array, always making sure to do const bool *foo = gkyl_array_cfetch(). There was some confusion with pointers and booleans, but now the code works and reproduces the same results as the PR that is open right now. I will play with masking diagnostics tomorrow. The code compiles without errors and warnings. No testing has been done on GPU yet, but GPU code has been modified
…lation and not only is it very messy, but it doesn't have a dramatic impact. I tried just hardcoding the mask in to save on the code integration work and the small test showed lackluster results
@Maxwell-Rosen Maxwell-Rosen added bug Something isn't working enhancement New feature or request labels Nov 7, 2025
@Maxwell-Rosen Maxwell-Rosen changed the title Updates to skipping phase space cell updates Refactor of skipping phase space cell updates Nov 7, 2025
@manauref
Copy link
Collaborator

manauref commented Nov 7, 2025

Can you add a type to the input file table please? like

.skip_cell = {
  .type = GKYL_GK_SKIP_CELL_F_THRESHOLD,
  .threshold = 1e-16
}

as in the future we may have some other criteria and may implement other types that depend on, say, the Hamiltonian (as we did in force softening), boundaries, etc. Also, important, the 0th entry in said enum should be GKYL_GK_SKIP_CELL_NONE, and the logic in the app should be such that if type==0 then nothing happens with regard to this operator, so that if the input file doesn't specify this table, then the rest of the solver proceeds with little knowledge and no impact of this operator. That's how every other module is implemented by now.

…pe of cell skiping operation we are doing. Remove the moment_mask option because it is not implemented yet.
@Maxwell-Rosen
Copy link
Collaborator Author

Great suggestion! That makes the evaluation of the threshold more memory safe.
I should also mention that rt_gk_wham_1x2v_p1 is valgrind clean

@JunoRavin
Copy link
Collaborator

Your mask convention is still the opposite of what we're doing for fluids. You're setting the mask value to be 1 if you are inside the mask, but elsewhere in the code we say the mask value is -1 if we are inside the mask.

Please swap the convention to be consistent with moments. -1 means you are inside the mask, 1 means you are outside the mask.

and please use the convention @ammarhakim outlined that the function should be update_cell not mask_eval so that you can use true values everywhere instead of having to check !mask_eval.

@Maxwell-Rosen
Copy link
Collaborator Author

Maxwell-Rosen commented Nov 26, 2025

I think this is semantics @JunoRavin. The mask evaluation just answers the question of if the mask is true here. What true means depends on if you specify a less than threshold or a greater than threshold. A boolean true means that this condition is true. We really should do 1=true rather than -1=true. Doing -1=true is so backwards.

I think _eval makes a lot of sense here instead of update_cell, as suggested by @ammarhakim. The update_cell name is very specific to the skipping phase space cell updates application, but this aims to be a general method to evaluate a conditional statement of less than or greater than. I think this conversion of the double to boolean is essentially an evaluation of the array, which is why I called this an eval method, which is a term used in other updaters for a similar purpose.

@JunoRavin
Copy link
Collaborator

This is not a boolean mask. It is a mask for determining whether you are inside or outside a region, with the ability to define regions for updating cells or other conditions. At minimum, I need you to switch the convention to match moments. -1 means you are inside the region. 1 means you are outside the region. In this case, 1 being outside the region means we are updating cells and not skipping the updates.

…ells that are not updated. This means the skip_cell mask should be a greater than mask, which means we should flip all the conditionals inside the equation updaters. To improve readability, I changed skip_cell to update_cell, so that the if statements read something like "if update cell is false, then skip". I think it's more readable now and follows the conventions in the code better
@Maxwell-Rosen
Copy link
Collaborator Author

Okay I realized the issue is not that this mask should return -1 as true, but the skip_cell mask should have 1 for continuing the update. In other words, we should rename this to update_cell and we should use a greater than conditional mask for that update. A cell which gets an update has a true value, in other words f is greater than the threshold is true. The rename to update_cell makes more sense so that we express something in code like "if update_cell then update the cell". The logic was backwards here, but it all agrees with the correct conventions in the code.

wham_1x2v is valgrind clean and reproduces the results in the initial message. There are no issues on the GPU or on CPU, as both reproduce these plots.

Jimmy and I discussed that the skip_cell struct could be used in conjuction with the orbit-average-phase mask to completely skip phase space cell updates in the masked cells, which would save a lot of compute power. Right now, we are computing the full equations, then setting it all to zero, but if we use the skip_cell component of these updaters, then we can get some more performance. That project is left to future work and exceeds the scope of refactoring the phase space cell updates

…do not make mistakes and accessing it's private elements directly. We should use functions to call const pointers to these elements if neccisasy. I updated the ctest_dg_array_mask to reflect this. The unit test was also doing some incorrect accessing of gkyl_array->data and size, which I had to fix. I condensed the unit test by using a use_gpu flag in the function call, so the filesize went down by quite a bit.

Furthermore, the calls to the mask were utilizing the internal range object, which was made private. Instead, I made an internal gkyl_dg_array_mask_eval_idx which takes an int idx so that we have one method taking the linear long idx and another that takes the int idx
…headers also need to "acquire" the device pointer, which is now moved into the private header, so we need a public method to access this pointer. I also accidentally deleted a few indices inside the priv headers, which I brought back from main
…get any GPUs on stellar because I have too many things queued up. I'm struggling with the GPU implementation after moving the struct definition into the private header. I'm not sure where to put the GKYL_CU_DH static inline because I can't put it in the public header since it needs access to private elements
…evice or host is to have the GKYL_CU_DH flag inside the function in the private header, which we call a kernel. We have a host function in the .c file which calls this kernel. The GPU parts of the code can include this private header since those are private parts of the code and call the kernel directly
@Maxwell-Rosen
Copy link
Collaborator Author

My latest changes are improvements to this PR that I realized recently.

The unit test accessed private members of the gkeyll_array and set the data[] vector directly. This is totally wrong, and we should not do this. I improved the unit test to use a use_gpu flag, reducing its length by half.

I realized the source of this mistake was that the gkyl_array_mask structure was public; making it private prevents this type of issue. Moving the struct to be more private required creating a few new functions to access it, but the returned values are usually const. There is an exception for how the devices acquire the pointer to the dev_ptr attribute. This is consistent with how this is done elsewhere in the code, where a pointer to the device struct is passed without an acquire method.

There was some monkey business with the eval function; it needs a separate definition on both the device and the host. Since this utilizes the struct, it cannot be defined in the public header, which does not know about the array_mask struct. I solved this by creating a public host function that calls a device function in the kernel. The cu.cu, and _priv.h files can call this private kernel directly without introducing layering.

The figures from the OP are reproduced. The unit test is valgrind clean. The regression test are compute-sanitizer clean with & without -g on perlmutter.

…the simulation, then uses a fraction of the maximum to mask cells. Another applies this maximum threshold in a spatially dependent manner, so inside each configuration space cell, a reduction is performed across the velocity range for the maximum C0, which is used for that configuration space cell. CPU unit tests work on my laptop, so I'm switching to another machine for GPU testing
…timesteps because the default for update_cell was false, meaning no updates happened, and every cell skipped its updates. In GK, the default for a NONE mask should be true, but we should be able to set the default depending on the context. Non mask-setting regression tests behave normally.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants