co_manager shortcutting the scheduler #566
base: master
Conversation
Force-pushed from 1eaf883 to b7a0309.
fix for multiple gpus: added debug output and some documentation, error tolerance for multiple GPUs and no manager, refactored the loop in insert_function, and got rid of a deadlock
Force-pushed from b7a0309 to b535ab2.
I am not familiar with #509 so I might be missing something in my review.
if (PARSEC_DTD_FLUSH_TC_ID == current_task->task_class->task_class_id)
{
    PARSEC_DEBUG_VERBOSE(10, parsec_gpu_output_stream, "GPU[%s]: Thread %d scheduling task %s at %s:%d",
                         ((parsec_device_module_t*)*co_manager_tls_val)->name, es->th_id,
These casts are not necessary (here and elsewhere).
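A minimal sketch of the point, assuming `co_manager_tls_val` ultimately holds a `parsec_device_module_t*` (the declaration below is hypothetical, not code from the PR): since a `void*` converts implicitly in C, storing the pointer once in a typed local removes the per-use cast.

```c
/* Hypothetical illustration only: keep the device pointer in a typed local
 * so the cast disappears at every use site. */
parsec_device_module_t *co_manager_device = *co_manager_tls_val;
/* ... then use co_manager_device->name directly instead of
 * ((parsec_device_module_t*)*co_manager_tls_val)->name ... */
```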
 * - rc > 0: there is a manager, and at the exit of the while, this thread has
 *   committed new work that the manager will need to do, but the work is
 *   not in the queue yet.
 */
while(1) {
-   rc = gpu_device->mutex;
+   rc = rc1 = gpu_device->mutex;
I don't like `rc1` for its lack of information. I suggest using a boolean variable like `was_first_co_manager` (you get the point :)) that is set to `true` if the CAS below succeeds and `rc == 1`.
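A minimal sketch of that suggestion, assuming the CAS referred to is something like PaRSEC's `parsec_atomic_cas_int32()` and that the loop retries until the counter is bumped; the exact retry logic and surrounding declarations are my assumptions, not the PR's code.

```c
int32_t rc;
bool was_first_co_manager = false;   /* replaces the opaque rc1; bool from <stdbool.h> */

while(1) {
    rc = gpu_device->mutex;
    /* try to advertise this thread's pending work by bumping the counter */
    if( parsec_atomic_cas_int32(&gpu_device->mutex, rc, rc + 1) ) {
        /* per the header comment above, rc > 0 means a manager already exists;
         * rc == 1 means this thread is the first one adding work behind it */
        was_first_co_manager = (1 == rc);
        break;
    }
}
```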
PARSEC_DEBUG_VERBOSE(4, parsec_gpu_output_stream, "GPU[%s]: gpu_task %p completed by co-manager %d at %s:%d", gpu_device->super.name,
                     gpu_task, es->th_id, __FILE__, __LINE__);
parsec_atomic_fetch_dec_int32( &(gpu_device->complete_mutex) );
parsec_list_push_back(gpu_tasks_to_free, (parsec_list_item_t*)gpu_task);
Why do we have to accumulate task objects here? Can they not be freed directly?
Yes, thank you for pointing that out. I found a fix for this; it will be applied.
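For reference, a minimal sketch of what freeing directly could look like, assuming the co-manager is the last owner of the `gpu_task` object at this point; this is an assumption on my part, not necessarily the fix that was applied.

```c
/* Sketch only: release the completed task right away instead of queueing it
 * on gpu_tasks_to_free for later cleanup. */
parsec_atomic_fetch_dec_int32( &(gpu_device->complete_mutex) );
free(gpu_task);   /* was: parsec_list_push_back(gpu_tasks_to_free, (parsec_list_item_t*)gpu_task); */
```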
Continuation of #509, adding a shortcut when discovering tasks in the DTD interface. The co-manager executes ready tasks discovered during completion immediately instead of scheduling them.
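A heavily hedged sketch of the control flow described above, not the PR's actual code: `co_manager_can_run_here()` and `co_manager_execute()` are hypothetical placeholders, and the fallback assumes PaRSEC's `__parsec_schedule(es, task, distance)` entry point for the normal scheduling path.

```c
/* Hypothetical helpers, placeholders only: */
static int  co_manager_can_run_here(parsec_execution_stream_t *es, parsec_task_t *task);
static void co_manager_execute(parsec_execution_stream_t *es, parsec_task_t *task);

/* Shortcut idea: a ready task discovered while the co-manager completes work
 * is executed in place instead of being handed back to the scheduler. */
static void on_ready_task_discovered(parsec_execution_stream_t *es, parsec_task_t *task)
{
    if( co_manager_can_run_here(es, task) ) {
        co_manager_execute(es, task);      /* run it immediately */
    } else {
        __parsec_schedule(es, task, 0);    /* normal scheduling path */
    }
}
```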