Foundation work for one-step debugger in Workflows #761

PawelPeczek-Roboflow · 2024-10-30T14:40:26Z

Description

This PR introduces the following capabilities into the Execution Engine (EE):

decoupling data serialisation and deserialisation from the core of the EE in favour of kind-based extensions
the ability to define batch-oriented inputs for any kind and any input dimensionality

Together, these should allow us to enable a debugger experience for the Workflows ecosystem.

Type of change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
This change requires a documentation update

How has this change been tested, please provide a testcase or example of how you tested the change?

bunch of new tests in CI for all changes introduced
old CI still 🟢

Any specific deployment considerations

For example, documentation changes, usability, usage/costs, secrets, etc.

Docs

Docs updated? What were the changes:


hansent · 2024-11-05T17:39:09Z

docs/workflows/create_workflow_block.md

    from typing import Literal, Union
    from pydantic import Field
    from inference.core.workflows.prototypes.block import (
        WorkflowBlockManifest,
    )
    from inference.core.workflows.execution_engine.entities.types import (
-        StepOutputImageSelector,
-        WorkflowImageSelector,
+        BatchSelector,


We are kind of reintroducing the concept of a batch, which we had tried to get rid of / make implicit.

I think I get why this is correct, but I have a gut feeling that it will be hard to understand. I think it's correct in that it's more explicit (some selectors can point to batch-oriented values, while others never do, i.e. scalars in the second case).

Where I think it makes things harder to understand is that, for the very common use case of building a block that handles image data, the user has to be explicitly aware of the Batch concept. Maybe that's OK / better.

The only alternative I can think of at the moment is having e.g. ImageSelector, which would essentially be BatchSelector(kind=[IMAGE_KIND]) as a type. That seems a bit more accessible in terms of discovering / learning about it, and then maybe offers a path to learning how Batch works (i.e. to explain it we could say "you know how you use ImageSelector... well, it's really a BatchSelector with kind image, because e.g. consider the dynamic crop block output"). But I don't know, maybe it's just more confusing to have another type that only fixes the kind of BatchSelector?
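
To make the ImageSelector idea above concrete, here is a minimal sketch of a convenience alias that only fixes the kind of a batch selector. Everything in it is a stand-in written for illustration: `SelectorSpec`, the string value of `IMAGE_KIND`, and the function bodies are assumptions, not the definitions used in the inference package.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Stand-in for the real kind object; the string value is an assumption.
IMAGE_KIND = "image"


@dataclass(frozen=True)
class SelectorSpec:
    """Hypothetical record of what a selector annotation declares."""

    kinds: Tuple[str, ...]
    batch_oriented: bool


def BatchSelector(kind: List[str]) -> SelectorSpec:
    # Stand-in: declares a reference to batch-oriented data of the given kinds.
    return SelectorSpec(kinds=tuple(kind), batch_oriented=True)


def ScalarSelector(kind: List[str]) -> SelectorSpec:
    # Stand-in: declares a reference to non-batch (scalar) data.
    return SelectorSpec(kinds=tuple(kind), batch_oriented=False)


def ImageSelector() -> SelectorSpec:
    # Hypothetical alias: a BatchSelector whose kind is fixed to IMAGE_KIND.
    return BatchSelector(kind=[IMAGE_KIND])
```

Under these assumptions, `ImageSelector()` and `BatchSelector(kind=[IMAGE_KIND])` describe exactly the same thing, so the alias adds discoverability without adding a new concept for the engine to handle.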

PawelPeczek-Roboflow · 2024-11-06

Let's have a call today. I do not really know what would be better; I'm still thinking about it.

My main concern is the split between ScalarSelector(...) and BatchSelector(...), because this piece of manifest:

    field_a: BatchSelector(...)
    field_b: ScalarSelector(...)

would have two variants at the level of the run method:

when the step accepts batches - def run(self, field_a: Batch[X], field_b: Y):

when the step does not accept batches - def run(self, field_a: X, field_b: Y):
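
The two run variants can be sketched side by side. This is an illustration only: the `Batch` class below is a toy container rather than the engine's own, and both step classes and their string-repeating logic are hypothetical.

```python
from typing import Generic, Iterator, List, TypeVar

T = TypeVar("T")


class Batch(Generic[T]):
    """Toy stand-in for a batch container (not the engine's implementation)."""

    def __init__(self, items: List[T]) -> None:
        self._items = list(items)

    def __iter__(self) -> Iterator[T]:
        return iter(self._items)

    def __len__(self) -> int:
        return len(self._items)


class BatchAwareStep:
    # Variant 1: the step accepts batches, so field_a arrives as Batch[str]
    # while field_b stays a plain scalar.
    def run(self, field_a: Batch[str], field_b: int) -> List[str]:
        return [item * field_b for item in field_a]


class ElementWiseStep:
    # Variant 2: the step does not accept batches, so the caller is expected
    # to invoke run(...) once per batch element.
    def run(self, field_a: str, field_b: int) -> str:
        return field_a * field_b


batch = Batch(["a", "b"])
batched_result = BatchAwareStep().run(field_a=batch, field_b=2)
element_results = [ElementWiseStep().run(field_a=item, field_b=2) for item in batch]
```

Both variants compute the same values; the difference is only in who iterates over the batch, which is why the same manifest fields can map to two different run signatures.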

PawelPeczek-Roboflow · 2024-11-07

OK, I had a thinking session about the problem.

If it were not for batches, we would only need a DataSelector(...) type annotation, and that would be our universal way of saying "this field refers to either another step's output or a workflow input and expects specific data". That would be awesome, as you could then always write the run(...) method as if it were any Python function.

So basically what we need to work out is a way of saying "dear Execution Engine, as a performant block, I am able to apply the operation to multiple instances of data at a time, please provide me with batched input", such that the block always knows which input parameters will be batches and which will not.
The solution to that problem could be:

    from typing import List, Optional

    from inference.core.workflows.prototypes.block import (
        WorkflowBlockManifest,
    )

    class Manifest(WorkflowBlockManifest):
        # instead of this
        @classmethod
        def accepts_batch_input(cls) -> bool:
            return True

        # use this
        @classmethod
        def get_batch_oriented_inputs(cls) -> Optional[List[str]]:
            return ["image", "predictions"]

which would effectively inform the EE when batch-oriented inputs are needed, but would not pollute 95% of the blocks with the distinction between BatchSelector(...) and ScalarSelector(...).

As a result, the EE would have exactly the same information as now, but blocks would not need to differentiate.

Unfortunately, there are also disadvantages:

On the UI end, we lose the ability to suggest batch-oriented data vs scalars, as this information is no longer embedded at the level of fields. We need to evaluate how important that would be. I can imagine it is not particularly important for blocks that do not accept batched input, but for the other group I have not yet discovered the whole spectrum of side effects. It seems the EE could dynamically broadcast non-batch-oriented inputs to align with the block's expectations, and the only problematic situation would be when the dimensions of the input batches do not match, which is already handled by the dimensionality guards in the EE.

I am not sure how to introduce this change without breaking the ecosystem / causing inconsistencies. I have a feeling that this is the type of change we should apply in Execution Engine v2.
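
The broadcasting idea mentioned above (aligning non-batch inputs with the names a block declares as batch-oriented, and rejecting mismatched batch sizes) could look roughly like the sketch below. It is a simplified illustration and not the Execution Engine's logic: `align_inputs` is a hypothetical helper, plain Python lists stand in for batches, and only one level of dimensionality is considered.

```python
from typing import Any, Dict, List, Optional


def align_inputs(
    inputs: Dict[str, Any],
    batch_oriented_inputs: Optional[List[str]],
) -> Dict[str, Any]:
    """Broadcast scalars given for batch-oriented parameters (hypothetical helper)."""
    declared = batch_oriented_inputs or []
    # Collect the sizes of the values that already are batches (lists here).
    sizes = {
        len(inputs[name]) for name in declared if isinstance(inputs[name], list)
    }
    if len(sizes) > 1:
        # Plays the role of a dimensionality guard: batch sizes must agree.
        raise ValueError(f"Batch sizes do not match: {sorted(sizes)}")
    batch_size = sizes.pop() if sizes else 1
    aligned = dict(inputs)
    for name in declared:
        if not isinstance(inputs[name], list):
            # Repeat the scalar so that the block receives a batch as declared.
            aligned[name] = [inputs[name]] * batch_size
    return aligned


aligned = align_inputs(
    inputs={"image": ["img_1", "img_2"], "predictions": "p", "confidence": 0.5},
    batch_oriented_inputs=["image", "predictions"],
)
```

In this sketch `predictions` is repeated to match the two images, while `confidence` is left untouched because the block did not declare it as batch-oriented.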

hansent · 2024-11-05T17:43:08Z

docs/workflows/blocks_bundling.md

@@ -108,6 +108,65 @@ REGISTERED_INITIALIZERS = {
 }
 ```

+## Serializers and deserializers for *Kinds*


This feels like a really powerful and extensible pattern. I'm wondering if it also opens the door to e.g. kind conversion/casting, by serializing as one kind and deserializing as another.

PawelPeczek-Roboflow · 2024-11-06

Yes, as an opportunity to hook up such an extension.
No, if you are asking whether this is possible without any changes to the PR.
And probably no if the question is broader, i.e. about polymorphism in the kinds type system; that would need to be approached from a different angle.
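
For readers unfamiliar with the pattern under discussion, here is a minimal sketch of kind-based serializers and deserializers kept in registries outside the engine core. The registry names, the `"bytes"` kind, and the function signatures are assumptions made for this example; the actual extension points are described in docs/workflows/blocks_bundling.md.

```python
import base64
from typing import Any, Callable, Dict


def _serialize_bytes(value: bytes) -> str:
    # Encode raw bytes as base64 text so the value can travel in JSON.
    return base64.b64encode(value).decode("ascii")


def _deserialize_bytes(value: str) -> bytes:
    # Inverse of _serialize_bytes.
    return base64.b64decode(value)


# Hypothetical registries keyed by kind name; a plugin could extend them.
KIND_SERIALIZERS: Dict[str, Callable[[Any], Any]] = {"bytes": _serialize_bytes}
KIND_DESERIALIZERS: Dict[str, Callable[[Any], Any]] = {"bytes": _deserialize_bytes}


def serialize(kind: str, value: Any) -> Any:
    # Kinds without a registered serializer pass through unchanged.
    serializer = KIND_SERIALIZERS.get(kind)
    return value if serializer is None else serializer(value)


def deserialize(kind: str, value: Any) -> Any:
    # Same pass-through rule on the way in.
    deserializer = KIND_DESERIALIZERS.get(kind)
    return value if deserializer is None else deserializer(value)


encoded = serialize("bytes", b"abc")
decoded = deserialize("bytes", encoded)
```

Note that each function here is tied to a single kind, which is consistent with the reply above: converting between kinds would need something beyond a plain per-kind lookup.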


PawelPeczek-Roboflow added 7 commits October 30, 2024 10:47

WIP - first changes to add serializers and deserializers

9661b48

Create first scratch of implementation for serializers and deserializers

38fc7e5

WIP - added handling for arbitrary dimensions in inputs

209866e

Finish basic testing of the new feature

297177f

Fix batch vs non-batch oriented parameters

d31a3d2

Add additional tests

ff227d5

Add remaining tests

3379902

PawelPeczek-Roboflow marked this pull request as ready for review October 31, 2024 16:07

PawelPeczek-Roboflow requested review from grzegorz-roboflow, yeldarby, probicheaux and hansent as code owners October 31, 2024 16:07

Apply refactor to add BatchOfDataSelector

329b2cc

PawelPeczek-Roboflow requested a review from capjamesg as a code owner October 31, 2024 19:29

PawelPeczek-Roboflow added 10 commits November 1, 2024 17:07

WIP - add more tests

0a6fff4

Resolve conflicts with main

e150394

Add deserialization for more kinds to ensure ability to properly validate inputs

2c5a13a

Merge branch 'main' into feature/add_support_for_all_kinds_inputs

598dff2

Add tests for deserialization

7bdcce8

Add tests for filtering of workflow results

34773e3

Add documentation - part 1

f58a6d1

Add docs - part 2

aba0aaf

Adjust docs to changes

51d5e6f

Add extension to inference_sdk to handle nested batches of input parameters

272e7ef

PawelPeczek-Roboflow added the release 0.26.0 label Nov 5, 2024

PawelPeczek-Roboflow added 5 commits November 5, 2024 13:15

Add changes to align batches and scalars regarding their place in ecosystem

52ddb9f

Start using scalar selector everywhere

b02502d

Start using BatchSelector for input images everywhere

533dd4f

Update docs and add more tests

5b5ec6c

Fix block assembler

08f2027

Merge branch 'main' into feature/add_support_for_all_kinds_inputs

9f31fe3

hansent reviewed Nov 5, 2024

PawelPeczek-Roboflow and others added 2 commits November 7, 2024 10:31

Merge branch 'main' into feature/add_support_for_all_kinds_inputs

9d11498

Refactor the PR to use Selector(...) type annotation for manifest selectors

e14a960

PawelPeczek-Roboflow added release 0.27.0 and removed release 0.26.0 labels Nov 7, 2024

PawelPeczek-Roboflow added 7 commits November 8, 2024 10:29

WIP

c2c08c1

WIP

75ce9a4

Add abstraction to mark mixed inputs

0d859a9

Fix docs

9a35585

Adjust docs to changes

a17c656

Resolve conflicts with main

5cf9bf9

Make linters happy

2dff5de

PawelPeczek-Roboflow requested a review from hansent November 11, 2024 07:22

PawelPeczek-Roboflow and others added 4 commits November 11, 2024 09:04

Fix bug with Florence block and align blocks expected EE version

89a0aef

Fix typo in docs

6f3ca50

Fix issue with docs generation

a42fb71

Merge branch 'main' into feature/add_support_for_all_kinds_inputs

c9a3753
