4 changes: 2 additions & 2 deletions conda/meta.yaml
@@ -17,8 +17,8 @@ requirements:
  - python >=3.9,<3.13
  - typing_extensions >=4.8
  - orjson >=3.9,<4
- - pydantic >=2.7,<2.12
- - pydantic-settings >=2.3,<2.11
+ - pydantic >=2.7,<2.13
+ - pydantic-settings >=2.3,<2.12
  - jsonschema >=4.3.0
  - fastavro >=1.8,<2.0
  - jsonlines >=4,<5
88 changes: 58 additions & 30 deletions docs/api-reference/dataframe.md
@@ -10,7 +10,7 @@
class StreamingDataFrame()
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L90)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L94)

`StreamingDataFrame` is the main object you will use for ETL work.

@@ -73,7 +73,7 @@ sdf = sdf.to_topic(topic_obj)
def stream_id() -> str
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L175)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L179)

An identifier of the data stream this StreamingDataFrame
manipulates in the application.
@@ -107,7 +107,7 @@ def apply(func: Union[
metadata: bool = False) -> "StreamingDataFrame"
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L234)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L238)

Apply a function to transform the value and return a new value.

@@ -165,7 +165,7 @@ def update(func: Union[
metadata: bool = False) -> "StreamingDataFrame"
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L338)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L342)

Apply a function to mutate value in-place or to perform a side effect

@@ -233,7 +233,7 @@ def filter(func: Union[
metadata: bool = False) -> "StreamingDataFrame"
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L441)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L445)

Filter value using provided function.
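
As a rough mental model of how `apply`, `update`, and `filter` (documented above) treat a single value, the following plain-Python sketch may help. It does not use or reproduce the quixstreams implementation: `run_pipeline` and the `(kind, func)` step tuples are hypothetical names invented for illustration, and returning `None` for a dropped value is a simplification of this sketch.

```python
from typing import Any, Callable, List, Optional, Tuple

# Hypothetical representation of a pipeline step: (operation kind, user function).
Step = Tuple[str, Callable[[Any], Any]]


def run_pipeline(value: Any, steps: List[Step]) -> Optional[Any]:
    """Pass one value through apply/update/filter steps (illustrative only)."""
    for kind, func in steps:
        if kind == "apply":
            # apply: the function's return value becomes the new value.
            value = func(value)
        elif kind == "update":
            # update: called for in-place mutation or a side effect;
            # whatever the function returns is ignored.
            func(value)
        elif kind == "filter":
            # filter: a falsy result drops the value from the pipeline.
            if not func(value):
                return None
        else:
            raise ValueError(f"unknown step kind: {kind}")
    return value


steps: List[Step] = [
    ("apply", lambda v: {**v, "temp_f": v["temp_c"] * 9 / 5 + 32}),
    ("update", lambda v: v.update(checked=True)),
    ("filter", lambda v: v["temp_f"] > 50),
]

kept = run_pipeline({"temp_c": 20}, steps)    # passes the filter
dropped = run_pipeline({"temp_c": 5}, steps)  # filtered out
```

The three lambdas have the same shape as the callables one would hand to `sdf.apply(...)`, `sdf.update(...)`, and `sdf.filter(...)` when `metadata=False`.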

@@ -285,7 +285,7 @@ def group_by(key: Union[str, Callable[[Any], Any]],
key_serializer: SerializerType = "json") -> "StreamingDataFrame"
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L526)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L530)

"Groups" messages by re-keying them via the provided group_by operation

@@ -350,7 +350,7 @@ a clone with this operation added (assign to keep its effect).
def contains(keys: Union[str, list[str]]) -> StreamingSeries
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L640)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L644)

Check if keys are present in the Row value.

@@ -392,7 +392,7 @@ def to_topic(
key: Optional[Callable[[Any], Any]] = None) -> "StreamingDataFrame"
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L684)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L688)

Produce current value to a topic. You can optionally specify a new key.

@@ -463,7 +463,7 @@ def set_timestamp(
func: Callable[[Any, Any, int, Any], int]) -> "StreamingDataFrame"
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L753)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L757)

Set a new timestamp based on the current message value and its metadata.
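
To make the `Callable[[Any, Any, int, Any], int]` shape concrete, here is a minimal sketch of a function that could be passed as `func`. The argument order `(value, key, timestamp, headers)` is assumed from the type signature and the description above, and the `event_time_ms` field is a hypothetical payload field chosen for illustration.

```python
from typing import Any


def event_time_or_current(value: Any, key: Any, timestamp: int, headers: Any) -> int:
    """Return the new message timestamp in milliseconds.

    Uses the (hypothetical) `event_time_ms` field when the value carries one,
    otherwise keeps the timestamp the message already has.
    """
    if isinstance(value, dict) and isinstance(value.get("event_time_ms"), int):
        return value["event_time_ms"]
    return timestamp


# The function can be exercised on its own, without a running application.
new_ts = event_time_or_current(
    {"event_time_ms": 1_700_000_000_000}, b"sensor-1", 1_700_000_005_000, None
)
fallback_ts = event_time_or_current(
    {"reading": 1}, b"sensor-1", 1_700_000_005_000, None
)
```

Given the signature above, it would be wired in as `sdf = sdf.set_timestamp(event_time_or_current)`.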

@@ -516,7 +516,7 @@ def set_headers(
) -> "StreamingDataFrame"
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L796)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L800)

Set new message headers based on the current message value and metadata.

@@ -565,7 +565,7 @@ a new StreamingDataFrame instance
def print(pretty: bool = True, metadata: bool = False) -> "StreamingDataFrame"
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L847)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L851)

Print out the current message value (and optionally, the message metadata) to

@@ -628,7 +628,7 @@ def print_table(
int]] = None) -> "StreamingDataFrame"
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L893)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L897)

Print a table with the most recent records.

@@ -721,7 +721,7 @@ sdf.print_table(size=5, title="Live Records", slowdown=1)
def compose(sink: Optional[VoidExecutor] = None) -> dict[str, VoidExecutor]
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1009)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1013)

Compose all functions of this StreamingDataFrame into one big closure.

@@ -775,7 +775,7 @@ def test(value: Any,
topic: Optional[Topic] = None) -> List[Any]
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1043)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1047)

A shorthand to test `StreamingDataFrame` with provided value

@@ -811,11 +811,13 @@ def tumbling_window(
duration_ms: Union[int, timedelta],
grace_ms: Union[int, timedelta] = 0,
name: Optional[str] = None,
- on_late: Optional[WindowOnLateCallback] = None
+ on_late: Optional[WindowOnLateCallback] = None,
+ before_update: Optional[WindowBeforeUpdateCallback] = None,
+ after_update: Optional[WindowAfterUpdateCallback] = None
) -> TumblingTimeWindowDefinition
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1082)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1086)

Create a time-based tumbling window transformation on this StreamingDataFrame.

@@ -885,6 +887,18 @@ sdf = (
If the callback returns `True`, the message about a late record will be logged
(default behavior).
Otherwise, no message will be logged.
+ - `before_update`: an optional callback to trigger early window expiration
+ before the window is updated.
+ The callback receives `aggregated` (current aggregated value or default/None),
+ `value`, `key`, `timestamp`, and `headers`.
+ If it returns `True`, the window will be expired immediately.
+ Default - `None`.
+ - `after_update`: an optional callback to trigger early window expiration
+ after the window is updated.
+ The callback receives `aggregated` (updated aggregated value), `value`, `key`,
+ `timestamp`, and `headers`.
+ If it returns `True`, the window will be expired immediately.
+ Default - `None`.
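
The difference between the two callbacks is which aggregate they see: `before_update` is handed the aggregate as it stood before the current record, `after_update` the aggregate that already includes it. The plain-Python sketch below illustrates that distinction for a sum over tumbling windows. It is a conceptual model under stated assumptions, not the library's implementation: `tumbling_sum` is a hypothetical helper, the callback argument order `(aggregated, value, key, timestamp, headers)` follows the order listed above, regular time-based expiration is not modeled, and what happens to the triggering record after a `before_update` expiration (here it starts a fresh aggregate) is an assumption of this sketch.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# Assumed callback shape: (aggregated, value, key, timestamp, headers) -> bool
Callback = Callable[[Any, Any, Any, int, Any], bool]


def tumbling_sum(
    records: List[Tuple[Any, int, int]],  # (key, timestamp_ms, value)
    duration_ms: int,
    before_update: Optional[Callback] = None,
    after_update: Optional[Callback] = None,
) -> List[Tuple[Any, int, int]]:
    """Return (key, window_start, sum) for each window a callback closed early."""
    state: Dict[Tuple[Any, int], int] = {}
    expired: List[Tuple[Any, int, int]] = []
    for key, ts, value in records:
        start = ts - ts % duration_ms      # start of the tumbling window
        window = (key, start)
        aggregated = state.get(window)     # None when the window is still empty

        # before_update sees the aggregate WITHOUT the current value.
        if before_update and before_update(aggregated, value, key, ts, None):
            if aggregated is not None:
                expired.append((key, start, aggregated))
            state.pop(window, None)
            aggregated = None              # sketch assumption: start over

        aggregated = (aggregated or 0) + value
        state[window] = aggregated

        # after_update sees the aggregate INCLUDING the current value.
        if after_update and after_update(aggregated, value, key, ts, None):
            expired.append((key, start, aggregated))
            del state[window]
    return expired


records = [("a", 0, 4), ("a", 100, 7), ("a", 200, 2), ("a", 300, 8)]

# Close the window as soon as the updated sum reaches 10.
closed_after = tumbling_sum(
    records, 1000, after_update=lambda agg, v, k, ts, h: agg >= 10
)

# Close the window before a record that would push the sum past 10.
closed_before = tumbling_sum(
    records,
    1000,
    before_update=lambda agg, v, k, ts, h: agg is not None and agg + v > 10,
)
```

With `after_update`, the emitted sums include the record that tripped the condition; with `before_update`, they exclude it.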


<br>
@@ -907,7 +921,7 @@ def tumbling_count_window(
name: Optional[str] = None) -> TumblingCountWindowDefinition
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1171)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1193)

Create a count-based tumbling window transformation on this StreamingDataFrame.

@@ -976,11 +990,13 @@ def hopping_window(
step_ms: Union[int, timedelta],
grace_ms: Union[int, timedelta] = 0,
name: Optional[str] = None,
- on_late: Optional[WindowOnLateCallback] = None
+ on_late: Optional[WindowOnLateCallback] = None,
+ before_update: Optional[WindowBeforeUpdateCallback] = None,
+ after_update: Optional[WindowAfterUpdateCallback] = None
) -> HoppingTimeWindowDefinition
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1221)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1243)

Create a time-based hopping window transformation on this StreamingDataFrame.

@@ -1060,6 +1076,18 @@ sdf = (
If the callback returns `True`, the message about a late record will be logged
(default behavior).
Otherwise, no message will be logged.
+ - `before_update`: an optional callback to trigger early window expiration
+ before the window is updated.
+ The callback receives `aggregated` (current aggregated value or default/None),
+ `value`, `key`, `timestamp`, and `headers`.
+ If it returns `True`, the window will be expired immediately.
+ Default - `None`.
+ - `after_update`: an optional callback to trigger early window expiration
+ after the window is updated.
+ The callback receives `aggregated` (updated aggregated value), `value`, `key`,
+ `timestamp`, and `headers`.
+ If it returns `True`, the window will be expired immediately.
+ Default - `None`.
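
With hopping windows a single record can fall into several overlapping windows, so an early-expiration callback may fire for one window while its neighbours stay open. The sketch below models that with a simple per-window record count. It is a plain-Python illustration, not the library's code: both helper functions are hypothetical, the callback argument order `(aggregated, value, key, timestamp, headers)` follows the order listed above, and evaluating the callback once per overlapping window is an assumption of this sketch.

```python
from typing import Any, Callable, Dict, List, Tuple


def hopping_window_starts(timestamp_ms: int, duration_ms: int, step_ms: int) -> List[int]:
    """Start times of every window [start, start + duration_ms) containing the timestamp."""
    start = timestamp_ms - timestamp_ms % step_ms   # latest window start <= timestamp
    starts = []
    while start > timestamp_ms - duration_ms and start >= 0:
        starts.append(start)
        start -= step_ms
    return sorted(starts)


def hopping_count_with_early_close(
    records: List[Tuple[int, Any]],                 # (timestamp_ms, value)
    duration_ms: int,
    step_ms: int,
    after_update: Callable[[Any, Any, Any, int, Any], bool],
) -> List[Tuple[int, int]]:
    """Return (window_start, count) for each window closed early by the callback."""
    counts: Dict[int, int] = {}
    closed: List[Tuple[int, int]] = []
    for ts, value in records:
        for start in hopping_window_starts(ts, duration_ms, step_ms):
            counts[start] = counts.get(start, 0) + 1
            # Assumption: the callback is checked against each window's own aggregate.
            if after_update(counts[start], value, None, ts, None):
                closed.append((start, counts[start]))
                del counts[start]
    return closed


starts_for_250 = hopping_window_starts(250, duration_ms=300, step_ms=100)

closed = hopping_count_with_early_close(
    [(50, "x"), (150, "y"), (250, "z")],
    duration_ms=300,
    step_ms=100,
    after_update=lambda agg, value, key, ts, headers: agg >= 2,
)
```

Here the record at 250 ms belongs to the windows starting at 0, 100, and 200 ms, but only the window starting at 100 ms has reached a count of two at that point.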


<br>
@@ -1083,7 +1111,7 @@ def hopping_count_window(
name: Optional[str] = None) -> HoppingCountWindowDefinition
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1324)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1364)

Create a count-based hopping window transformation on this StreamingDataFrame.

@@ -1161,7 +1189,7 @@ def sliding_window(
) -> SlidingTimeWindowDefinition
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1381)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1421)

Create a time-based sliding window transformation on this StreamingDataFrame.

@@ -1259,7 +1287,7 @@ def sliding_count_window(
name: Optional[str] = None) -> SlidingCountWindowDefinition
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1476)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1516)

Create a count-based sliding window transformation on this StreamingDataFrame.

@@ -1329,7 +1357,7 @@ sdf = (
def fill(*columns: str, **mapping: Any) -> "StreamingDataFrame"
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1529)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1569)

Fill missing values in the message value with a constant value.

@@ -1386,7 +1414,7 @@ def drop(columns: Union[str, List[str]],
errors: Literal["ignore", "raise"] = "raise") -> "StreamingDataFrame"
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1581)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1621)

Drop column(s) from the message value (value must support `del`, like a dict).
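
The "value must support `del`" requirement and the `errors` switch can be pictured with a small stand-alone function. This is a sketch of the behavior the signature suggests, not the library's code: `drop_columns` is a hypothetical helper, and treating a missing column as a `KeyError` under `errors="raise"` is an assumption.

```python
from typing import Any, List, Literal, MutableMapping, Union


def drop_columns(
    value: MutableMapping[str, Any],
    columns: Union[str, List[str]],
    errors: Literal["ignore", "raise"] = "raise",
) -> MutableMapping[str, Any]:
    """Delete the given keys from `value` in place and return it."""
    if errors not in ("ignore", "raise"):
        raise ValueError(f"unsupported errors mode: {errors!r}")
    names = [columns] if isinstance(columns, str) else list(columns)
    for name in names:
        try:
            del value[name]          # works for any object supporting `del obj[key]`
        except KeyError:
            if errors == "raise":    # assumed: missing columns raise unless ignored
                raise
    return value


row = {"a": 1, "b": 2, "c": 3}
trimmed = drop_columns(row, ["a", "missing"], errors="ignore")

try:
    drop_columns({"a": 1}, "missing")
    raised = False
except KeyError:
    raised = True
```

The equivalent calls on a dataframe would be `sdf.drop(["a", "missing"], errors="ignore")` and `sdf.drop("missing")`.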

@@ -1430,7 +1458,7 @@ a new StreamingDataFrame instance
def sink(sink: BaseSink)
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1625)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1665)

Sink the processed data to the specified destination.

@@ -1458,7 +1486,7 @@ operations, but branches can still be generated from its originating SDF.
def concat(other: "StreamingDataFrame") -> "StreamingDataFrame"
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1663)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1703)

Concatenate two StreamingDataFrames together and return a new one.

@@ -1499,7 +1527,7 @@ def join_asof(right: "StreamingDataFrame",
name: Optional[str] = None) -> "StreamingDataFrame"
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1699)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1739)

Join the left dataframe with the records of the right dataframe with

@@ -1582,7 +1610,7 @@ def join_interval(
forward_ms: Union[int, timedelta] = 0) -> "StreamingDataFrame"
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1775)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1815)

Join the left dataframe with records from the right dataframe that fall within

@@ -1685,7 +1713,7 @@ def join_lookup(
) -> "StreamingDataFrame"
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1880)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1920)

Note: This is an experimental feature, and its API is likely to change in the future.

@@ -1746,7 +1774,7 @@ sdf = sdf.join_lookup(lookup, fields)
def register_store(store_type: Optional[StoreTypes] = None) -> None
```

- [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L1969)
+ [[VIEW SOURCE]](https://github.com/quixio/quix-streams/blob/main/quixstreams/dataframe/dataframe.py#L2009)

Register the default store for the current stream_id in StateStoreManager.
