Stream Characteristics

Dan Debrunner edited this page Jun 15, 2015 · 1 revision

Stream Characteristics Ideas

Allow a TStream to have characteristics that can drive optimizations, like fusing and threading.

Potential characteristics:

  • Ordered - Arrival order of tuples must be maintained.
  • Partitioned - Tuples have a partitioning key, e.g. a telco subscriber identifier (see Keyable).
  • Partition Ordered - Arrival order must be maintained within a partition, but is not required across partitions.
  • Parallel - Stream is currently parallelized, which implies it is not Ordered.
  • Low latency - Minimum processing overhead should be introduced.
  • ...

For example, if a stream is not Ordered, then any optimizations that do not preserve order are allowed.
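
A minimal sketch of how such a check might look, assuming characteristics are modelled as an enum set. Nothing here is part of the existing API: `OrderingGateSketch`, `Characteristic`, `Optimization`, and `isAllowed` are hypothetical names, and the two example optimizations (and whether each preserves order) are assumptions made for illustration. Only the Ordered rule stated above is modelled; Partition Ordered is deliberately left out.

```java
import java.util.Set;

public class OrderingGateSketch {

    /** Hypothetical characteristic names, taken from the list above. */
    public enum Characteristic { ORDERED, PARTITIONED, PARTITION_ORDERED, PARALLEL, LOW_LATENCY }

    /** Hypothetical optimizations; the preservesOrder flags are assumptions. */
    public enum Optimization {
        FUSING(true),               // assumed to keep tuple arrival order
        UNORDERED_THREADING(false); // assumed to allow tuples to overtake each other

        final boolean preservesOrder;

        Optimization(boolean preservesOrder) {
            this.preservesOrder = preservesOrder;
        }
    }

    /**
     * An optimization is allowed if it preserves order, or if the stream
     * does not carry the ORDERED characteristic in the first place.
     */
    public static boolean isAllowed(Optimization opt, Set<Characteristic> characteristics) {
        return opt.preservesOrder || !characteristics.contains(Characteristic.ORDERED);
    }
}
```

With this sketch, an order-breaking optimization is rejected only for streams that carry ORDERED; every optimization is allowed on a stream without it.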

In general, operations such as filter or transform would produce a TStream that has all the characteristics of the input TStream. Specific methods would produce a TStream with a different set of characteristics, e.g. TStream.parallel() would produce a TStream without the Ordered characteristic, regardless of whether the input TStream was Ordered.
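
The propagation rule could be sketched as below. This is an illustration under stated assumptions, not a proposed API: `CharacteristicPropagationSketch`, `Characteristic`, `preserving`, and `afterParallel` are hypothetical names, and the functions operate on plain sets rather than on a real TStream. Whether parallel() should also drop Partition Ordered is not stated above, so the sketch leaves that characteristic untouched.

```java
import java.util.EnumSet;
import java.util.Set;

public class CharacteristicPropagationSketch {

    /** Hypothetical characteristic names, taken from the list above. */
    public enum Characteristic { ORDERED, PARTITIONED, PARTITION_ORDERED, PARALLEL, LOW_LATENCY }

    /** filter / transform style operations: output has all input characteristics. */
    public static Set<Characteristic> preserving(Set<Characteristic> input) {
        return copyOf(input);
    }

    /** parallel(): output gains PARALLEL and never has ORDERED, whatever the input had. */
    public static Set<Characteristic> afterParallel(Set<Characteristic> input) {
        EnumSet<Characteristic> out = copyOf(input);
        out.add(Characteristic.PARALLEL);
        out.remove(Characteristic.ORDERED);
        return out;
    }

    /** Defensive copy that also works for an empty input set. */
    private static EnumSet<Characteristic> copyOf(Set<Characteristic> input) {
        EnumSet<Characteristic> out = EnumSet.noneOf(Characteristic.class);
        out.addAll(input);
        return out;
    }
}
```

Returning a fresh set from each operation keeps the input stream's characteristics unchanged, which matches the idea that each operation results in a new TStream.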