Runtime node autoscaling/balancing #90

emil14 · 2022-09-28T08:07:01Z · Maintainer

Problem

After finding the big problem and "fixing" it with the new queue-based algorithm (see "describe connector algorithm"), there's still one issue that remains unsolved - the slow consumer problem.

Any time a sender s sends a message m to receivers rr, the algorithm distributes it so that any available receiver always gets the message as soon as possible. Yet there can be a situation where s sends a new message m2 while one or more of the rr receivers are still processing the old one.

Common sense tells us there are receivers available for the new message. What can we do? We can introduce buffers. First, this is already done via buffered channels. Second, buffer size must be bounded, otherwise we will run out of memory. That's why buffered channels are great: without them we would have to implement buffers ourselves with slices or something similar. Another possible choice is spawning a new goroutine for every message and limiting their count with a semaphore, which works exactly like a buffer.
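To make the semaphore idea concrete, here is a minimal sketch. It is not the runtime's connector code: `processAll`, `limit` and `handle` are hypothetical names chosen for illustration. The semaphore is itself a buffered channel, so acquiring a slot blocks the sender once `limit` handlers are in flight, the same way a send to a full buffer would.

```go
package main

import (
	"fmt"
	"sync"
)

// processAll (hypothetical helper) spawns one goroutine per message but lets
// at most limit of them run at the same time.
func processAll(msgs []int, limit int, handle func(int) int) []int {
	sem := make(chan struct{}, limit) // counting semaphore with limit slots
	results := make([]int, len(msgs))
	var wg sync.WaitGroup
	for i, m := range msgs {
		sem <- struct{}{} // blocks once limit handlers are in flight
		wg.Add(1)
		go func(i, m int) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when done
			results[i] = handle(m)   // each goroutine writes its own index
		}(i, m)
	}
	wg.Wait()
	return results
}

func main() {
	out := processAll([]int{1, 2, 3, 4}, 2, func(m int) int { return m * 10 })
	fmt.Println(out) // [10 20 30 40]
}
```

With `limit` set to the buffer size, the sender experiences the same back-pressure as with a buffered channel of that capacity.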

The conclusion is that we can't avoid blocking when the buffer is full.
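A small sketch of why a bounded buffer only postpones blocking. `trySend` is a hypothetical helper, not part of the runtime: it uses a non-blocking send to reveal the exact moment at which an ordinary send would have to wait.

```go
package main

import "fmt"

// trySend (hypothetical helper) attempts a non-blocking send and reports
// whether the port's buffer accepted the message.
func trySend(port chan<- string, msg string) bool {
	select {
	case port <- msg:
		return true
	default:
		return false // buffer is full: a normal send would block here
	}
}

func main() {
	port := make(chan string, 2) // bounded buffer of size 2, nobody reading yet

	fmt.Println(trySend(port, "m1")) // true
	fmt.Println(trySend(port, "m2")) // true
	fmt.Println(trySend(port, "m3")) // false

	<-port // the slow receiver finally takes one message

	fmt.Println(trySend(port, "m3")) // true
}
```

Whatever the buffer size, a receiver that is slower than the sender eventually fills it, and the sender is back to waiting.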

Upd: we need not only to spawn and connect new runtime functions but also to somehow balance load between them!

Solution

The solution is scaling. Just like Kubernetes can spawn new pods via the HPA (Horizontal Pod Autoscaler), it's possible, in theory, to spawn new nodes at FBP runtime. There are several important things to note.

Note - Runtime reflection

The connection schema is the program itself. Modifying it means modifying the program. This is not a problem and actually sounds crazy cool.

Problem #1 - Absence of "node" abstraction at runtime

This is a big-big problem. The runtime doesn't have a concept of "node". It only has "ports", "connections" and "effects". Effects are close to nodes but they're not the same thing. It could be impossible to implement scaling without introducing the concept of a node at runtime.

UPD: it's not clear how much we really need a node abstraction at runtime

Problem #1.1 - Operators problem

The actual computation happens in operators (maybe we should extend this to "effects"). It's not clear whether there's any sense in spawning new non-operator nodes. It could even make things worse: we would just add new inports and outports, simply increasing the network's size (but maybe new operator instances will be worth it).

UPD: Related to #212

Problem #2 - Rerouting

It's not as big a problem as the previous one, but it's a lot of work. Spawning a new node means connecting it to the network in such a way that the new node receives part of the old node's traffic.
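One way to picture rerouting is a small balancer placed in front of the instances. This is a sketch under assumptions, not the runtime's design: `balance` and `run` are hypothetical names, messages are plain ints, and round-robin is only the simplest possible strategy.

```go
package main

import (
	"fmt"
	"sync"
)

// balance (hypothetical) reads from the original inport and forwards each
// message to one of the instance inports in round-robin order. When the
// original inport is closed, it closes all instance inports.
func balance(in <-chan int, outs []chan int) {
	i := 0
	for m := range in {
		outs[i] <- m
		i = (i + 1) % len(outs)
	}
	for _, out := range outs {
		close(out)
	}
}

// run (hypothetical) spawns the given number of instances, each with its own
// inport, and returns the messages every instance received.
func run(msgs []int, instances int) [][]int {
	in := make(chan int)
	outs := make([]chan int, instances)
	received := make([][]int, instances)
	var wg sync.WaitGroup
	for i := range outs {
		outs[i] = make(chan int, 1) // a dedicated buffered inport per instance
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			for m := range outs[i] {
				received[i] = append(received[i], m)
			}
		}(i)
	}
	go balance(in, outs)
	for _, m := range msgs {
		in <- m
	}
	close(in)
	wg.Wait()
	return received
}

func main() {
	fmt.Println(run([]int{0, 1, 2, 3, 4, 5}, 2)) // [[0 2 4] [1 3 5]]
}
```

Plain round-robin still stalls when the instance whose turn it is happens to be slow, so a real balancer would probably prefer whichever instance is ready. The sketch only shows the wiring: one old inport, one new inport per instance, and something in between that splits the traffic.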

emil14 · 2022-09-28T08:13:54Z

emil14
Sep 28, 2022
Maintainer Author

Thought #1 Don't forget to spawn new ports

Spawning a new node must always be accompanied by creating new channels (ports) exclusively for that node. Otherwise, if it reads "someone else's" input, we can break that node's mechanics. Suppose there's a node that transforms a sub-stream into a list, and we have 2 instances: the original one and the scaled one. The first could receive the opening bracket and the second the closing bracket.
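The bracket example can be reproduced with two routing strategies. This is an illustrative sketch only: the function names are hypothetical, sub-streams are assumed to be non-nested, and brackets are represented as the strings "[" and "]".

```go
package main

import "fmt"

// routePerMessage (hypothetical) hands messages out one by one, ignoring
// sub-stream boundaries. This is what effectively happens when two instances
// compete for the same input.
func routePerMessage(msgs []string, instances int) [][]string {
	out := make([][]string, instances)
	for i, m := range msgs {
		out[i%instances] = append(out[i%instances], m)
	}
	return out
}

// routePerSubStream (hypothetical) keeps sending to the same instance until
// the closing bracket arrives and only then switches to the next instance.
// Assumes sub-streams are not nested.
func routePerSubStream(msgs []string, instances int) [][]string {
	out := make([][]string, instances)
	cur := 0
	for _, m := range msgs {
		out[cur] = append(out[cur], m)
		if m == "]" {
			cur = (cur + 1) % instances
		}
	}
	return out
}

func main() {
	stream := []string{"[", "a", "b", "]", "[", "c", "]"}
	fmt.Printf("%q\n", routePerMessage(stream, 2))
	fmt.Printf("%q\n", routePerSubStream(stream, 2))
}
```

With per-message routing the first instance sees "[", "b", "[", "]" and the second sees "a", "]", "c", so neither can build a valid list. With per-sub-stream routing each instance receives complete bracketed groups, which is only possible when every instance has inports of its own.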

emil14 · 2022-11-08T05:13:17Z

emil14
Nov 8, 2022
Maintainer Author

Thought #2 Compile at runtime

Since the runtime doesn't know anything about nodes, we have 2 options:

1. Modify its structure so that it does
2. Somehow use the compiler at runtime to generate new programs/program parts and rerun/hot-replace modules

UPD: you were basically talking about JIT

emil14 · 2024-01-15T19:11:25Z

emil14
Jan 15, 2024
Maintainer Author

The problem was kinda fixed in #430 but maybe we can think about node balancing, sounds fun

