Configure algorithm runtime, don't schedule data object nodes as algortihms #21

joott · 2024-07-09T14:37:48Z

BEGINRELEASENOTES

Algorithms now handled as a callable struct containing graph information
Data objects no longer scheduled as tasks

Struct allows algorithm to access information about its vertex, including average runtime and size.
Ideas taken from @m-fila in #19.

ENDRELEASENOTES

based on Mateusz's ideas without the datadeps updates

* added size_kb field to store size as a double * adjusted scheduling.jl treatment of size to align with test graphs

Changed size_kb to size_average_B

m-fila · 2024-07-09T15:37:52Z

A future of an algorithm execution is stored in the successor DataObject vertices. I like this conceptually but would it work also for graphs in which all the paths are terminated by a algorithm (consumer algorithm that only takes input but doesn't produce any output)?
I'd expect it can be a problem since we'd "lose" the future and don't have anything to wait for (or attach a notify graph finalization task to)

m-fila · 2024-07-10T12:06:49Z

I tried with data/sequential/df.graphml (terminating with consumer algorithm instead of data object) from master and got an error

How about the schedule_graph return a vector of tasks for all the terminating algorithms?
By terminating I mean algorithms that don't have a successor algorithm -> so either have no direct successors at all (no data object produced) or their direct successor have no successors (data objects produces but not used anywhere else).
This vector can be later given to notify_graph_finalization if scheduling multiple graphs or just awaited directly if scheduling a single graph

schedule_graph could check each algorithm vertex with something like

function is_terminating_alg(graph::AbstractGraph, vertex_id::Int)
    successor_dataobjects = outneighbors(graph, vertex_id)
    is_terminating(vertex) = isempty(outneighbors(graph, vertex))
    all(is_terminating, successor_dataobjects)
end

and then append Dagger.@spawned task if needed

What do you think?

joott · 2024-07-10T13:44:10Z

We could store futures in the algorithm as well as its children, which should resolve it. I do like the idea of schedule_graph returning a list of all the terminating algorithms though - that way we don't have to iterate through the whole graph again afterwards. Maybe we do both?

m-fila · 2024-07-10T13:51:18Z

We could store futures in the algorithm as well as its children, which should resolve it. I do like the idea of schedule_graph returning a list of all the terminating algorithms though - that way we don't have to iterate through the whole graph again afterwards. Maybe we do both?

Both is good

joott added 5 commits July 6, 2024 08:46

New treatment of MockupAlgorithms

e633fc1

based on Mateusz's ideas without the datadeps updates

Added size and runtime properties to test graphs

6282c72

* added size_kb field to store size as a double * adjusted scheduling.jl treatment of size to align with test graphs

Made graph property consistent with tests

5b1cfdc

Changed size_kb to size_average_B

only check for results in data nodes

6225b77

Cleaned up struct function

c6baaec

m-fila self-requested a review July 9, 2024 14:48

joott and others added 3 commits July 10, 2024 17:10

Collect terminating results in schedule_graph

cbc9678

removed obsolete unused functions

ea3a127

removed ! as schedule algorithm is not mutating

10f20ee

m-fila changed the title ~~Wrap MockupAlgorithm in a callable struct~~ Configure algorithm runtime, don't schedule data object nodes as algortihms Jul 10, 2024

m-fila merged commit 41a471b into key4hep:main Jul 10, 2024
2 checks passed

joott deleted the algorithm branch July 12, 2024 12:44

m-fila mentioned this pull request Jul 16, 2024

Fix scheduling algorithms without runtime property #26

Merged

m-fila mentioned this pull request Aug 9, 2024

Scheduling wrappers #18

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Configure algorithm runtime, don't schedule data object nodes as algortihms #21

Configure algorithm runtime, don't schedule data object nodes as algortihms #21

joott commented Jul 9, 2024

m-fila commented Jul 9, 2024

m-fila commented Jul 10, 2024

joott commented Jul 10, 2024

m-fila commented Jul 10, 2024

Configure algorithm runtime, don't schedule data object nodes as algortihms #21

Configure algorithm runtime, don't schedule data object nodes as algortihms #21

Conversation

joott commented Jul 9, 2024

m-fila commented Jul 9, 2024

m-fila commented Jul 10, 2024

joott commented Jul 10, 2024

m-fila commented Jul 10, 2024